TCGA BRCA RNA-seq RSEM gene-level counts for 1211 cases, comprising 1097 primary solid tumor tissues and 114 solid tissue normals, aligned to the hg19 reference genome, were downloaded from the GDC legacy archive. The 192 TNBC samples identified in the previous section were extracted and normalized (TCGAanalyze_Normalization) using the R/Bioconductor package TCGAbiolinks (ver. 2.9.5)62. Next, we performed sample normalization, adjusting for GC content and applying upper-quantile between-lane normalization, following the EDASeq protocol63. For the 122 prospective CPTAC BRCA samples64, we obtained median-normalized gene expression data (log2 FPKM) from LinkedOmics (http://linkedomics.org/data_download/CPTAC-BRCA/). RNA expression data for the 28 TNBC tumors identified in the “genomic-guided identification of TNBC specimens” section were normalized and subtyped as detailed in the “TNBC subtyping” Methods section below. MET500 RNA-seq data16 (FPKM; n = 868) were retrieved from https://xenabrowser.net/. We identified 92 unique breast tumors within the MET500 dataset, of which 40 were classified as TNBC and underwent TNBC subtyping. Complete normalized expression data for 1981 METABRIC BRCA samples profiled on Illumina HT-12 arrays were obtained through Synapse (https://www.synapse.org/#!Synapse:syn1757063).
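As an illustration of the between-lane step only, the idea behind upper-quantile scaling can be sketched as follows. This is a minimal Python sketch, not the EDASeq implementation used in the analysis: the function names, the exclusion of zero counts, the linear-interpolation quantile, and rescaling to the mean upper quartile are all assumptions made for the example, and the GC-content adjustment is not shown.

```python
def upper_quartile(values):
    """75th percentile (linear interpolation) of the nonzero values.

    Excluding zeros is an assumption of this sketch.
    """
    nonzero = sorted(v for v in values if v > 0)
    if not nonzero:
        raise ValueError("sample has no nonzero counts")
    pos = 0.75 * (len(nonzero) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(nonzero) - 1)
    return nonzero[lo] + (pos - lo) * (nonzero[hi] - nonzero[lo])


def upper_quartile_normalize(counts):
    """Scale each sample so that all samples share the same upper quartile.

    counts: dict mapping a sample name to a list of gene-level counts,
    with genes in the same order for every sample (hypothetical layout).
    """
    uq = {sample: upper_quartile(v) for sample, v in counts.items()}
    # Rescale to the mean upper quartile so values stay on a count-like scale.
    target = sum(uq.values()) / len(uq)
    return {
        sample: [c * target / uq[sample] for c in v]
        for sample, v in counts.items()
    }


if __name__ == "__main__":
    # Toy data: sample B was sequenced at twice the depth of sample A.
    demo = {"A": [0, 10, 20, 30, 40], "B": [0, 20, 40, 60, 80]}
    for sample, values in upper_quartile_normalize(demo).items():
        print(sample, values)
```

In the toy data the two samples differ only by sequencing depth, so after scaling their profiles coincide; real samples would retain gene-level differences.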