We resolved the retroCNV insertion sites using a previously developed pipeline [17]. Briefly, we aligned WGS FASTQ files to the EquCab3.0 reference assembly [35] using Minimap2 v2.24 with the preset ‘-ax sr’ for Illumina paired-end reads [36]. Aligned reads were sorted and duplicate reads were removed using samtools [37]. We then used TEBreak to obtain discordant read clusters at putative retroCNV parent gene loci [38]. We visually confirmed all retroCNV insertion sites using IGV [39]. The TEBreak 5’ and 3’ junction sequences for the retroCNVs are available in S2 Table. To validate a set of retroCNV insertion sites and their TSDs, we designed three-primer genotyping PCR assays as previously described [26] (S3 Table). For genotyping, we randomly selected Thoroughbred horses from a DNA repository maintained by the Bannasch lab. These DNA samples were collected. We performed Sanger sequencing on an Applied Biosystems 3500 Genetic Analyzer using a BigDye Terminator Sequencing Kit (Life Technologies, Burlington, ON, Canada). We also analyzed the horse Y chromosome assembly eMSYv3.1 (GenBank accession MH341179) [40] for evidence of retroCNVs that had been predicted to be on the Y chromosome based on sex bias. For each of these retroCNVs, we used the parent gene sequence as a BLAST query against the Y chromosome assembly to search for the retrocopy [41].
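The alignment and duplicate-removal steps could be scripted roughly as follows. This is a minimal sketch, not the authors' pipeline: the text specifies only Minimap2 with ‘-ax sr’ and samtools for sorting and duplicate removal, so the particular samtools subcommands used here (fixmate, sort, markdup, index), the thread count, and all file names are assumptions chosen for illustration. The function only builds the command strings and does not run them.

```python
import shlex


def build_alignment_commands(ref_fasta, fastq_r1, fastq_r2, out_prefix, threads=4):
    """Build shell commands for short-read alignment and duplicate removal.

    Assumption: the text names only 'minimap2 -ax sr' and 'samtools'; the
    fixmate/sort/markdup/index sequence below is one common samtools
    workflow, not necessarily the one the authors ran.
    """
    q = shlex.quote
    sorted_bam = f"{out_prefix}.sorted.bam"   # hypothetical output names
    dedup_bam = f"{out_prefix}.dedup.bam"

    # Align paired-end reads, add mate-score tags (needed by markdup),
    # then coordinate-sort the alignments.
    align_and_sort = (
        f"minimap2 -ax sr -t {threads} {q(ref_fasta)} {q(fastq_r1)} {q(fastq_r2)}"
        " | samtools fixmate -m - -"
        f" | samtools sort -@ {threads} -o {q(sorted_bam)} -"
    )
    # Remove (-r) rather than only flag duplicate reads.
    remove_duplicates = f"samtools markdup -r {q(sorted_bam)} {q(dedup_bam)}"
    # Index the final BAM so it can be loaded in a viewer such as IGV.
    index_bam = f"samtools index {q(dedup_bam)}"

    return [align_and_sort, remove_duplicates, index_bam]


if __name__ == "__main__":
    # Hypothetical input file names.
    for cmd in build_alignment_commands(
        "EquCab3.0.fa", "sample_R1.fastq.gz", "sample_R2.fastq.gz", "sample"
    ):
        print(cmd)
```

Each returned string can be passed to a shell once Minimap2 and samtools are installed; keeping command construction separate from execution makes it easy to inspect the exact commands before running them on real data.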