Four datasets were used for synteny analyses: 1) the reference Esox lucius transcriptome (see ‘RNA-seq and assembly’ section of Methods); 2) stickleback protein sequences from the Feb. 2006 Broad/gasAcu1 release; 3) zebrafish protein sequences from the Jul. 2010 Zv9/danRer7 release; and 4) medaka protein sequences from the Oct. 2005 NIG/UT MEDAKA1/oryLat2 release. Stickleback, zebrafish and medaka sequences and their associated genomic locations were obtained from the UCSC Genome Browser [92]. Scaffold locations for northern pike transcripts were obtained by mapping with GMAP [93]; a transcript was assigned to a linkage group if its host scaffold had previously been mapped to a group in the genetic map. BLAST alignments (E-value ≤1e-5) between the northern pike transcripts and the proteins of each other fish species were obtained using the BLASTX and TBLASTN programs. Orthology between northern pike transcripts and other fish protein sequences was determined using the reciprocal best hit (RBH) paradigm, requiring that at least 50% of each sequence be covered by non-overlapping BLAST alignments (HSPs) from the other. Synteny between two species (Figure 6) and scaffold continuity (Figure S2) were examined by plotting the genomic locations of the two sequences in each relevant orthologue pair.
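The RBH criterion with a 50% coverage requirement can be illustrated with a minimal Python sketch. This is not the authors' pipeline; it is one plausible reading of the rule under stated assumptions. HSPs are assumed to be already parsed into dictionaries with hypothetical keys (`query`, `subject`, `bitscore`, `qstart`, `qend`, `sstart`, `send`), the "best" hit is assumed to be the one with the highest single-HSP bit score, overlapping HSPs are merged so that each position is counted once, coverage is computed from the forward (transcript-to-protein) HSPs only, and `lengths` is assumed to hold each sequence length in the same coordinate units as its HSP intervals.

```python
def covered_length(intervals):
    """Length covered by 1-based inclusive intervals, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def best_hits(hsps):
    """Map each query to the subject of its highest-scoring HSP (assumption)."""
    best = {}
    for h in hsps:
        q = h["query"]
        if q not in best or h["bitscore"] > best[q][1]:
            best[q] = (h["subject"], h["bitscore"])
    return {q: s for q, (s, _) in best.items()}


def reciprocal_best_hits(forward, reverse, lengths, min_cov=0.5):
    """Return (query, subject) pairs that are mutual best hits and meet min_cov."""
    fwd_best = best_hits(forward)
    rev_best = best_hits(reverse)
    result = []
    for q, s in fwd_best.items():
        if rev_best.get(s) != q:
            continue  # not reciprocal
        pair_hsps = [h for h in forward if h["query"] == q and h["subject"] == s]
        q_cov = covered_length([(h["qstart"], h["qend"]) for h in pair_hsps]) / lengths[q]
        s_cov = covered_length([(h["sstart"], h["send"]) for h in pair_hsps]) / lengths[s]
        if q_cov >= min_cov and s_cov >= min_cov:
            result.append((q, s))
    return sorted(result)


# Toy data (hypothetical identifiers and values, for illustration only).
forward = [  # transcript -> protein, e.g. from BLASTX
    {"query": "pike1", "subject": "protA", "bitscore": 200,
     "qstart": 1, "qend": 300, "sstart": 1, "send": 100},
    {"query": "pike1", "subject": "protA", "bitscore": 80,
     "qstart": 250, "qend": 450, "sstart": 90, "send": 150},
    {"query": "pike2", "subject": "protB", "bitscore": 150,
     "qstart": 1, "qend": 90, "sstart": 1, "send": 30},
]
reverse = [  # protein -> transcript, e.g. from TBLASTN
    {"query": "protA", "subject": "pike1", "bitscore": 198,
     "qstart": 1, "qend": 150, "sstart": 1, "send": 450},
    {"query": "protB", "subject": "pike2", "bitscore": 149,
     "qstart": 1, "qend": 30, "sstart": 1, "send": 90},
]
lengths = {"pike1": 600, "protA": 200, "pike2": 900, "protB": 120}

orthologues = reciprocal_best_hits(forward, reverse, lengths)
print(orthologues)
```

In this toy example both pairs are reciprocal best hits, but the alignments between `pike2` and `protB` cover well under half of either sequence, so only the `pike1`/`protA` pair passes the coverage filter.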
The analysis of synteny between northern pike and Atlantic salmon (Table 2) was performed by obtaining the flanking sequence of chromosome-associated SNPs in Atlantic salmon and identifying the strongest BLASTN hits (E-value ≤1e-10) between these sequences and northern pike scaffolds with a known linkage group.
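The SNP-based comparison can be sketched in the same way. The example below is an illustration under assumptions, not the published procedure: "strongest" is taken to mean the highest bit score, hits to scaffolds without a linkage group are discarded before the best hit is chosen, and the input structures (`hits`, `snp_chrom`, `scaffold_lg`) and all identifiers in them are hypothetical.

```python
from collections import Counter


def synteny_counts(hits, snp_chrom, scaffold_lg, max_evalue=1e-10):
    """Count SNPs per (salmon chromosome, pike linkage group) pair.

    hits: iterable of (snp_id, scaffold_id, evalue, bitscore) tuples.
    snp_chrom: maps SNP id to its Atlantic salmon chromosome.
    scaffold_lg: maps pike scaffold id to linkage group (mapped scaffolds only).
    """
    best = {}
    for snp, scaffold, evalue, bitscore in hits:
        # Keep only significant hits to scaffolds with a known linkage group.
        if evalue > max_evalue or scaffold not in scaffold_lg:
            continue
        if snp not in best or bitscore > best[snp][1]:
            best[snp] = (scaffold, bitscore)
    counts = Counter()
    for snp, (scaffold, _) in best.items():
        counts[(snp_chrom[snp], scaffold_lg[scaffold])] += 1
    return counts


# Toy data (hypothetical values, for illustration only).
hits = [
    ("snp1", "scafA", 1e-30, 120.0),
    ("snp1", "scafB", 1e-50, 180.0),  # scafB has no linkage group
    ("snp2", "scafA", 1e-5, 40.0),    # fails the E-value cutoff
    ("snp3", "scafC", 1e-20, 90.0),
]
snp_chrom = {"snp1": "Ssa01", "snp2": "Ssa01", "snp3": "Ssa02"}
scaffold_lg = {"scafA": "LG1", "scafC": "LG2"}

counts = synteny_counts(hits, snp_chrom, scaffold_lg)
for (chrom, lg), n in sorted(counts.items()):
    print(chrom, lg, n)
```

Tabulating the resulting counts by chromosome and linkage group gives a summary of the kind a chromosome-by-linkage-group table would report.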