We benchmarked SpeedSeq’s processing time using the NA12878 genome from the Illumina Platinum Genomes dataset (European Nucleotide Archive: ERP001960), which comprises 50× WGS datasets for each of the 17 members of the three-generation CEPH 1463 pedigree (Supplementary Fig. 4).
Whole-genome sequencing data from five matched tumor-normal pairs and their orthogonally validated somatic mutations were obtained from The Cancer Genome Atlas (TCGA). These included three colorectal tumors (TCGA-A6-6141, TCGA-CA-6718, TCGA-D5-6540), one ovarian tumor (TCGA-13-0751), and one breast tumor (TCGA-B6-A0I6). Raw FASTQ reads were down-sampled to 50× coverage in the tumor sample and 30× coverage in the normal sample. Samples were processed with SpeedSeq for alignment, somatic mutation calling, and structural variant detection using default parameters and then loaded into GEMINI for variant interpretation. We also analyzed WGS data from a tumor-normal pair (63× tumor, 49× normal coverage) of a patient with an invasive breast carcinoma (TCGA-E2-A14P) containing a previously reported gene fusion between TBL1XR1 and PIK3CA (ref. 20).
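The down-sampling step can be illustrated with a minimal Python sketch. The text does not state which tool or sampling scheme was used, so this is an assumption for illustration only: it keeps each read pair independently with probability target/current coverage, using a seeded random number generator for reproducibility. The function names `keep_fraction` and `downsample_pairs` and the 80× starting coverage in the usage lines are hypothetical.

```python
import random
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def keep_fraction(current_coverage: float, target_coverage: float) -> float:
    """Fraction of read pairs to retain so that expected coverage hits the target.

    Hypothetical helper: coverage scales linearly with the number of reads,
    so the retained fraction is simply target / current.
    """
    if current_coverage <= 0 or target_coverage <= 0:
        raise ValueError("coverage values must be positive")
    # Never up-sample: a sample already at or below the target is kept whole.
    return min(1.0, target_coverage / current_coverage)


def downsample_pairs(pairs: Iterable[T], fraction: float, seed: int = 0) -> Iterator[T]:
    """Yield each read pair independently with probability `fraction`.

    Mates are treated as one unit so that pairing is preserved. This is an
    assumed sampling scheme, not necessarily the one used in the study.
    """
    rng = random.Random(seed)  # fixed seed makes the subsample reproducible
    for pair in pairs:
        if rng.random() < fraction:
            yield pair


# Usage with hypothetical numbers: an 80x tumor library reduced to 50x.
tumor_fraction = keep_fraction(80.0, 50.0)
toy_pairs = [(f"read{i}/1", f"read{i}/2") for i in range(1000)]
tumor_subset = list(downsample_pairs(toy_pairs, tumor_fraction, seed=42))
```

Because each pair is retained independently, the resulting coverage matches the target only in expectation. For libraries with hundreds of millions of read pairs the relative deviation is negligible.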