We tested Twisst on two published genomic datasets, from Neurospora spp. (ascomycete fungi) and Heliconius spp. (butterflies), selected to represent different sampling strategies (four and five taxa, respectively) as well as different levels of evolutionary complexity. The Neurospora dataset (Corcoran et al. 2016) consisted of 22 aligned haploid genome sequences from Neurospora tetrasperma samples (10 of mating type A and 12 of mating type a), along with single genomes representing two related species: Neurospora crassa and Neurospora hispaniola. Whole-genome alignments were obtained from http://datadryad.org/resource/doi:10.5061/dryad.162mh. We used Lineage-10 (UK) samples of N. tetrasperma, as these had been shown to carry a strong signal of introgression from N. hispaniola (Corcoran et al. 2016). Trees were constructed for sliding windows of 50 SNPs using BIONJ as described above, with the requirement that each sample be genotyped at ≥40 of the 50 SNPs in a window. Topology weightings were computed using Twisst, with four defined taxa: N. tetrasperma mat a (12 sequences), N. tetrasperma mat A (10 sequences), N. crassa (one sequence), and N. hispaniola (one sequence).
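The window filter described above (50 SNPs per window, each sample genotyped at ≥40 of them) can be illustrated with a minimal Python sketch. This is not the code used in the study: the function name `passing_windows`, the representation of genotypes as one string per sample, the use of `N` for a missing call, and the step size between windows are all assumptions made for illustration, since the text does not specify them.

```python
from typing import List, Tuple


def passing_windows(
    genotypes: List[str],
    window: int = 50,      # SNPs per window (from the text)
    min_called: int = 40,  # minimum genotyped SNPs per sample (from the text)
    step: int = 50,        # ASSUMPTION: step size is not stated in the text
    missing: str = "N",    # ASSUMPTION: symbol used for a missing genotype
) -> List[Tuple[int, int]]:
    """Return (start, end) SNP index ranges of windows that pass the filter.

    `genotypes` holds one string per sample, one character per SNP.
    A window passes only if every sample has at least `min_called`
    non-missing genotypes within it.
    """
    if not genotypes:
        return []
    n_sites = len(genotypes[0])
    if any(len(g) != n_sites for g in genotypes):
        raise ValueError("all samples must cover the same number of SNPs")

    kept = []
    for start in range(0, n_sites - window + 1, step):
        end = start + window
        # Count called (non-missing) genotypes per sample in this window.
        if all(
            sum(base != missing for base in g[start:end]) >= min_called
            for g in genotypes
        ):
            kept.append((start, end))
    return kept
```

Under these assumptions, a window in which any single sample has 11 or more missing genotypes is discarded before tree inference, whereas a window with at most 10 missing genotypes per sample is retained.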
The Heliconius dataset consisted of 18 resequenced diploid genomes (36 haploid genomes) from Martin et al. (2013). These samples comprised five populations: two geographically isolated races of Heliconius melpomene, from Panama (H. m. rosina, n = 4) and Peru (H. m. amaryllis, n = 4); their respective sympatric relatives, Heliconius cydno chioneus from Panama (n = 4) and Heliconius timareta thelxinoe from Peru (n = 4), with which they are known to hybridize; and two additional samples from the more distant silvaniform clade to serve as outgroups. We limited our analysis to two chromosomes: 18, which carries the gene optix, known to be associated with red wing pattern variation; and 21, the Z sex chromosome, which has been shown to experience reduced gene flow between these species, probably due to genetic incompatibilities (Martin et al. 2013). Fastq reads were downloaded from the European Nucleotide Archive, study accession no. ERP002440. Reads were mapped to the H. melpomene reference genome version 2 (Davey et al. 2016) using BWA-mem (Li and Durbin 2009; Li 2013) with default parameters. Genotyping was performed using the HaplotypeCaller and GenotypeGVCFs tools of the Genome Analysis Toolkit version 3 (DePristo et al. 2011), with default parameters except that heterozygosity was set to 0.02. Phasing and imputation were performed using Beagle version 4 (Browning and Browning 2007). Trees were inferred as described above, and weightings were computed using Twisst, with the five taxa described above.
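As a quick consistency check on the sampling scheme, the following Python sketch tallies the diploid samples per taxon and the haploid sequences they yield after phasing. The dictionary keys are shorthand labels chosen here for illustration, not identifiers from the dataset.

```python
# Diploid sample counts per taxon, as stated in the text.
# The labels are illustrative shorthand, not dataset identifiers.
diploid_samples = {
    "H. m. rosina (Panama)": 4,
    "H. m. amaryllis (Peru)": 4,
    "H. c. chioneus (Panama)": 4,
    "H. t. thelxinoe (Peru)": 4,
    "silvaniform outgroup": 2,
}

# Each phased diploid genome contributes two haploid sequences.
haploid_sequences = {taxon: 2 * n for taxon, n in diploid_samples.items()}

total_diploid = sum(diploid_samples.values())
total_haploid = sum(haploid_sequences.values())

print(total_diploid, total_haploid)
```

The totals agree with the 18 resequenced genomes and 36 haploid genomes reported for this dataset.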