DNA was isolated using the Qiagen DNeasy Blood and Tissue Kit, following the spin-column protocol. Quality of the isolation was estimated using a NanoDrop 2000c Spectrophotometer, and extractions were repeated where possible if the 260:280 nm UV absorbance ratio fell outside the range of 1.4 to 2.2. For most taxa 1 to 4 legs were used for DNA extraction, but the entire prosoma was used for Padillothorax badut (specimen d548) and Helpis minitabunda (specimen NZ19-9152). For the target enrichment UCE sequencing, dual-indexed TruSeq-style libraries were prepared following methods previously used in arachnids (e.g., Starrett et al. 2017; Derkarabetian et al. 2018; Hedin et al. 2018; Kulkarni et al. 2019). Targeted enrichment was performed using either the myBaits Arachnida 1.1Kv1 kit (Arbor Biosciences; Faircloth 2017; Starrett et al. 2017) or the Spider 2Kv1 kit (Arbor Biosciences; Kulkarni et al. 2019), following the myBaits v4.01 protocol (https://arborbiosci.com/wp-content/uploads/2018/04/myBaits-Manual-v4.pdf). Libraries were sequenced on partial lanes of Illumina NovaSeq 6000 S4 runs with 150 bp paired-end reads. To the resulting set of reads we added those from two amycoid taxa, Breda and Colonus, obtained by Maddison et al. (2020), to serve as outgroups. Raw demultiplexed reads were processed with Phyluce version 1.6 (Faircloth 2016): quality control and adapter removal were conducted with the Illumiprocessor wrapper (Faircloth 2013), and assemblies were created with SPAdes version 3.14.1 (Nurk et al. 2013), using the meta option, at default settings.
From among the contigs thus assembled, those matching particular UCE probes were pulled out using the Phyluce pipeline at default settings. Some taxa were captured using the arachnid probeset (outgroups Attulus, Breda, Colonus, Salticus), and others using the spider probeset (remaining outgroups, and all baviines). Because each of the arachnid and spider probesets includes loci not included by the other, a blended probeset file was needed to best pull out UCE contigs. Kulkarni et al.’s (2019) spider probeset includes (i) some of Starrett et al.’s (2017) arachnid probes directly, (ii) others for the same loci but modified to target spiders better, and (iii) others for new loci. Because Kulkarni et al. do not identify probes of the second category as such, we sought to identify which spider probes are orthologous to arachnid probes. We then deleted from the probeset file those arachnid probes matching spider probes, as including duplicate homologs reduces data recovery (contigs matching two probes are removed by Phyluce as problematic). To determine homology, contigs from 18 diverse species in the Salticinae, 12 captured with arachnid probes and 6 with spider probes, were each matched against both arachnid and spider probesets. Any instance of a contig matching both a spider probe and an arachnid probe, as assessed using a script examining .lastz files, was taken as indicating homology between the probes. Arachnid probes that showed no such hint of homology to spider probes were then added to Kulkarni et al.’s (2019) spider probeset to generate the blended probeset (see Suppl. material 1). The spider probeset includes 15015 probes and probe parts; the arachnid probeset, 14799; the blended probeset, 25689.
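The homology screen described above can be illustrated with a minimal Python sketch. It is not the script used in the study: it assumes the .lastz matches have already been parsed into simple (contig, probeset, probe) records, and the function names, record layout, and probe identifiers are all hypothetical. An arachnid probe is flagged when any contig matches both it and a spider probe; only the unflagged arachnid probes are appended to the spider probeset.

```python
from collections import defaultdict

def find_homologous_arachnid_probes(matches):
    """Return arachnid probe IDs that share a contig with a spider probe.

    `matches` is an iterable of (contig_id, probeset, probe_id) tuples,
    where probeset is "arachnid" or "spider". This record layout is an
    assumption for illustration, not the actual .lastz format.
    """
    hits_by_contig = defaultdict(lambda: {"arachnid": set(), "spider": set()})
    for contig_id, probeset, probe_id in matches:
        hits_by_contig[contig_id][probeset].add(probe_id)

    homologous = set()
    for hits in hits_by_contig.values():
        # A contig hit by both probesets suggests the probes are homologous.
        if hits["arachnid"] and hits["spider"]:
            homologous |= hits["arachnid"]
    return homologous

def blend_probesets(spider_probes, arachnid_probes, homologous):
    """Spider probes plus the arachnid probes with no spider homolog."""
    kept = [p for p in arachnid_probes if p not in homologous]
    return list(spider_probes) + kept

# Toy data with hypothetical identifiers (species-qualified contig names).
example_matches = [
    ("sp1:contig1", "arachnid", "A1"),
    ("sp1:contig1", "spider", "S1"),
    ("sp1:contig2", "arachnid", "A2"),
    ("sp2:contig9", "spider", "S2"),
]
example_homologous = find_homologous_arachnid_probes(example_matches)
example_blended = blend_probesets(["S1", "S2"], ["A1", "A2"], example_homologous)
```

In this toy case, A1 is dropped because it shares a contig with S1, while A2 is retained. Contig names are qualified by species here so that identically named contigs from different assemblies are not conflated.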
The efficacy of the blended probeset can be seen in the numbers of loci recovered in the baviine dataset reported here: the arachnid probeset pulled out on average 134 loci from spider-enriched taxa and 411 from arachnid-enriched taxa; the spider probeset pulled out on average 1118 and 113 respectively; the blended probeset pulled out on average 1123 and 415. Nonetheless, many of the UCE loci recovered from the arachnid-enriched taxa were found only among those taxa; this explains why many were subsequently deleted when a filter for occupancy among ingroups (see below) was applied.
Recovered UCE loci were aligned with MAFFT (Katoh and Standley 2013) and trimmed with Gblocks (Castresana 2000; Talavera and Castresana 2007), using --b1 0.5, --b2 0.5, --b3 10, --b4 4 settings in the Phyluce pipeline. Among the loci recovered, those with fewer than 6 taxa total or fewer than 3 ingroups were deleted. As in the analysis of Maddison et al. (2020), loci were also deleted over concerns about paralogy if their gene tree showed a very long branch, at least 5× longer than the second longest branch.
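The two locus filters described above can be expressed as a short Python sketch. This is an illustration under stated assumptions rather than the code used in the study: taxon names are plain strings, the gene tree is reduced to a list of its branch lengths, and the function names and default parameters are hypothetical labels for the thresholds given in the text.

```python
def passes_occupancy(taxa_in_locus, ingroup_taxa, min_taxa=6, min_ingroup=3):
    """Keep a locus only if it has at least 6 taxa and at least 3 ingroups."""
    taxa = set(taxa_in_locus)
    n_ingroup = len(taxa & set(ingroup_taxa))
    return len(taxa) >= min_taxa and n_ingroup >= min_ingroup

def has_suspect_long_branch(branch_lengths, ratio=5.0):
    """Flag a gene tree whose longest branch is at least `ratio` times
    the second longest branch (treated here as a sign of possible paralogy)."""
    lengths = sorted(branch_lengths, reverse=True)
    if len(lengths) < 2:
        return False
    longest, second = lengths[0], lengths[1]
    # Require a positive longest branch so an all-zero tree is not flagged.
    return longest > 0 and longest >= ratio * second

def keep_locus(taxa_in_locus, ingroup_taxa, branch_lengths):
    """Apply both filters: sufficient occupancy and no very long branch."""
    return (passes_occupancy(taxa_in_locus, ingroup_taxa)
            and not has_suspect_long_branch(branch_lengths))
```

For example, a locus with six taxa of which three are ingroups is retained if its branch lengths are [0.3, 0.1, 0.05], but is removed if they are [1.0, 0.1, 0.05].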