Per-individual DNA-sequencing reads were quality-filtered and adapter-trimmed with Trimmomatic v0.36 (Bolger et al. 2014) in PE mode with the settings adapterfile:2:30:12:8:true MINLEN:30. Reads were mapped against the Nile tilapia (Oreochromis niloticus) genome assembly version 2 (RefSeq accession number GCF_001858045.1_ASM185804v2), which was the only chromosome-level cichlid genome assembly available to us at the time. Prior to mapping, the unplaced scaffolds of this assembly were concatenated in lexicographic order into a single “UNPLACED” super chromosome. This customized reference was indexed with BWA v0.7.13, and the reads of each individual were aligned against it with bwa-mem under default parameters (Li and Durbin 2009). Alignments were coordinate-sorted and indexed with SAMtools v1.3.1 (Li et al. 2009). Variants were called with GATK v3.7 (McKenna et al. 2010) using HaplotypeCaller (per individual and per chromosome), GenotypeGVCFs (per chromosome), and CatVariants (to merge all resulting VCF files). The final variants were filtered with GATK’s VariantFiltration, using the settings “QD < 2.0”, “FS > 200.0”, “ReadPosRankSum < −20.0”, “SOR > 10.0”, “DP < 200”, and “DP > 4,000” for indels, and “MQ < 40.0”, “FS > 60.0”, “QD < 2.0”, “DP < 200”, “DP > 4,000”, “SOR > 7.5”, “MQRankSum < −12.5”, and “ReadPosRankSum < −10.0” for SNPs.
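The hard-filter thresholds listed above can be restated as a short Python sketch, which may help when checking which expressions a given variant would trigger. This is an illustration only: it is neither GATK's implementation nor the code used in the study. The function name `failed_filters`, the representation of a variant's annotations as a plain dictionary, the reading of DP as the site-level INFO depth, and the choice to skip annotations that are absent are all assumptions made for this example.

```python
import operator

# Thresholds copied from the text. A variant is flagged when ANY expression holds.
INDEL_FILTERS = [
    ("QD", operator.lt, 2.0),
    ("FS", operator.gt, 200.0),
    ("ReadPosRankSum", operator.lt, -20.0),
    ("SOR", operator.gt, 10.0),
    ("DP", operator.lt, 200),
    ("DP", operator.gt, 4000),
]
SNP_FILTERS = [
    ("MQ", operator.lt, 40.0),
    ("FS", operator.gt, 60.0),
    ("QD", operator.lt, 2.0),
    ("DP", operator.lt, 200),
    ("DP", operator.gt, 4000),
    ("SOR", operator.gt, 7.5),
    ("MQRankSum", operator.lt, -12.5),
    ("ReadPosRankSum", operator.lt, -10.0),
]

_SYMBOL = {operator.lt: "<", operator.gt: ">"}


def failed_filters(info, is_snp):
    """Return the filter expressions triggered by a variant's annotations.

    `info` is a hypothetical dict of annotation name -> numeric value.
    """
    rules = SNP_FILTERS if is_snp else INDEL_FILTERS
    failed = []
    for key, compare, threshold in rules:
        value = info.get(key)
        if value is None:
            # Assumption: an annotation that is absent does not trigger a filter.
            continue
        if compare(value, threshold):
            failed.append(f"{key} {_SYMBOL[compare]} {threshold}")
    return failed


if __name__ == "__main__":
    # Made-up annotation values, used only to exercise the function.
    example = {"QD": 12.3, "FS": 75.0, "MQ": 58.0, "DP": 1500, "SOR": 1.1}
    print(failed_filters(example, is_snp=True))
    print(failed_filters(example, is_snp=False))
```

With the made-up values in the example, the record triggers only the strand-bias expression when treated as a SNP (FS of 75.0 exceeds 60.0) and triggers nothing when treated as an indel, because the indel threshold for FS is the more permissive 200.0.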