All sequencing and clinical data for the TNBC patients (n = 180) were obtained from The Cancer Genome Atlas (TCGA), hosted by the Genomic Data Commons (GDC). Patients were identified as triple-negative breast cancer patients based on the analysis performed by Lehmann et al.21. RNA-Seq and miRNA-Seq data were downloaded as raw counts. CNV data from Affymetrix SNP 6.0 arrays had been processed by the GDC into output files containing segment mean values (transformed copy number values for each segmented genomic region), and these values were used for this work. DNA methylation levels (as beta values) from Illumina Infinium HumanMethylation450 arrays were also obtained where available. (More information on GDC output files and pipelines can be found on the GDC website: https://docs.gdc.cancer.gov/.) Clinical metadata (Supplementary Table S1) were extracted from the clinical XML files provided by the GDC.
Purity estimates for the 180 TNBC tumors were obtained from Aran et al.55 and Li et al.30. Patients were kept in the study if either their consensus purity estimate (CPE, from Aran et al.) or their Clonal Heterogeneity Analysis Tool (CHAT, from Li et al.) purity measurement was 60% or higher. The purity scores are presented in Supplementary Table S1.
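The purity criterion can be illustrated with a short sketch. This is not the authors' code. It assumes that purity values are stored as fractions between 0 and 1, that a missing estimate is represented as `None`, and that a patient passes if either available estimate reaches the cutoff. The function name, field names, and patient records are hypothetical.

```python
from typing import Optional

# 60% purity cutoff stated in the text, expressed as a fraction (assumption).
PURITY_THRESHOLD = 0.60


def passes_purity_filter(
    cpe: Optional[float],
    chat: Optional[float],
    threshold: float = PURITY_THRESHOLD,
) -> bool:
    """Return True if at least one available purity estimate meets the cutoff."""
    return any(
        value is not None and value >= threshold
        for value in (cpe, chat)
    )


# Hypothetical records for illustration only; these are not real TCGA samples.
patients = [
    {"barcode": "PATIENT-A", "cpe": 0.72, "chat": 0.55},  # CPE passes
    {"barcode": "PATIENT-B", "cpe": 0.41, "chat": None},  # CPE too low, no CHAT
    {"barcode": "PATIENT-C", "cpe": None, "chat": 0.60},  # CHAT exactly at cutoff
    {"barcode": "PATIENT-D", "cpe": None, "chat": None},  # no estimate available
]

kept = [p["barcode"] for p in patients if passes_purity_filter(p["cpe"], p["chat"])]
print(kept)
```

In this sketch, a patient with no purity estimate from either source is excluded, which is one possible reading of the criterion.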