Thousands of specimens are available from the TCGA; we arbitrarily selected the first 10 ovarian serous carcinoma (OVCA) and the first 20 clear-cell renal cell carcinoma (KIRC) sample IDs, sorted alphabetically, as of May 2013. The SNP array data for the ovarian serous carcinomas and the clear-cell renal cell carcinomas were obtained on 22 January 2010 and 17 November 2011, respectively. Exome sequencing data, previously aligned to the human genome version hg19, were obtained in BAM format in May 2013.
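The selection rule described above (sort the available sample IDs alphabetically, then take the first N) can be sketched as follows. This is only an illustration: the function name `select_first_ids` and the IDs in the example are hypothetical placeholders, not the identifiers actually used in the study.

```python
def select_first_ids(sample_ids, n):
    """Return the first n unique sample IDs in alphabetical order."""
    return sorted(set(sample_ids))[:n]


# Placeholder IDs for illustration only (not real study samples).
available = ["SAMPLE-C", "SAMPLE-A", "SAMPLE-D", "SAMPLE-B", "SAMPLE-A"]
selected = select_first_ids(available, 2)
print(selected)  # ['SAMPLE-A', 'SAMPLE-B']
```

Deduplicating before sorting is an assumption made here so that a repeated ID cannot be counted twice.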
The SNP array files were preprocessed using the aroma.affymetrix package [15] as described [16], and copy number variations were determined using ASCAT version 2.1 [3]; sex chromosomes were excluded from the analysis.
The Sequenza results were obtained using version 2.1.0 with default parameters; the input was generated by the Python script sequenza-utils.py version 2.1.0 with a default bin size of 50 bases for exome sequencing data or 200 bases for whole-genome sequencing data. The absCN-seq results were obtained using version 1.0 with default parameters; the input consisted of the same genomic segments used by Sequenza together with high-quality somatic mutation calls detected by VarScan2, as described in the software documentation. The ABSOLUTE results were obtained using software version 1.0.6 with default parameters, except that the platform was specified as ‘Illumina_WES’; the input was the same set of genomic segments used with Sequenza and absCN-seq.
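To make the notion of a fixed bin size concrete, the sketch below averages per-base read depth over consecutive, non-overlapping windows of 50 bases. It is not the sequenza-utils.py implementation; the function name `mean_depth_per_bin`, the 1-based coordinates, and the choice to average only over positions that are present in the input are all assumptions made for illustration.

```python
from collections import defaultdict


def mean_depth_per_bin(positions, depths, bin_size=50):
    """Average depth in consecutive, non-overlapping windows.

    positions: 1-based coordinates on a single chromosome (assumed).
    depths: read depth observed at each position.
    Returns a dict mapping each window start to its mean depth.
    """
    totals = defaultdict(lambda: [0, 0])  # window start -> [depth sum, count]
    for pos, depth in zip(positions, depths):
        start = ((pos - 1) // bin_size) * bin_size + 1
        totals[start][0] += depth
        totals[start][1] += 1
    return {start: s / n for start, (s, n) in sorted(totals.items())}


binned = mean_depth_per_bin([1, 50, 51, 120], [10, 20, 30, 40], bin_size=50)
print(binned)  # {1: 15.0, 51: 30.0, 101: 40.0}
```

Passing `bin_size=200` would give the coarser windows mentioned for whole-genome data.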
Exome sequencing data from 31 of the NCI-60 tumor cell lines, aligned to the genome version hg19, were downloaded in BAM format in May 2014 [17].
Whole-genome sequencing data for two cell lines, HCC1143 and HCC1954, for their matched normal blood samples, and for simulated admixtures at tumor cellularities of 20%, 40%, 60%, and 80% were obtained in March 2014 from the TCGA4 benchmark cohort (https://cghub.ucsc.edu/datasets/benchmark_download.html). These data were aligned to the hg19 genome at 30× coverage and provided in BAM format.
All BAM files were processed to remove PCR duplicates and low-quality mappings with Picard, and then converted to pileup format using SAMtools [12].
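The preprocessing steps above could be scripted roughly as follows. This is a minimal sketch rather than the exact commands used: the file names, the `picard.jar` path, and the mapping-quality threshold of 20 are assumptions (the text does not state a cutoff), and the mapping-quality filter is applied here with `samtools view`. The helper `build_pipeline` is hypothetical and only assembles the command lines; it does not run them.

```python
def build_pipeline(bam, reference, prefix, min_mapq=20, picard_jar="picard.jar"):
    """Assemble commands for duplicate removal, MAPQ filtering, and pileup.

    min_mapq=20 is an assumed threshold, not a value taken from the text.
    """
    dedup = f"{prefix}.dedup.bam"
    filtered = f"{prefix}.filtered.bam"
    return [
        # Picard: drop PCR duplicates instead of only flagging them.
        ["java", "-jar", picard_jar, "MarkDuplicates",
         f"I={bam}", f"O={dedup}", f"M={prefix}.dup_metrics.txt",
         "REMOVE_DUPLICATES=true"],
        # SAMtools: keep only alignments with mapping quality >= min_mapq.
        ["samtools", "view", "-b", "-q", str(min_mapq), "-o", filtered, dedup],
        # SAMtools: write pileup to standard output (redirect it to a file).
        ["samtools", "mpileup", "-f", reference, filtered],
    ]


commands = build_pipeline("tumor.bam", "hg19.fa", "tumor")
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)`; for the last command, the pileup text arrives on standard output and should be redirected to a file by the caller.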