To identify genes that are consistently differentially expressed in TNBC relative to other types of breast cancer, we searched the Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/, accessed on 2 January 2023), the publicly available transcriptomics data repository of the National Center for Biotechnology Information (NCBI), for datasets of patients with breast cancer. For consistency, we selected only datasets that can be analyzed with GEO2R, a tool built into NCBI GEO that performs differential gene expression analysis on microarray data. GEO2R uses the R programming language and the limma statistical package to perform statistical calculations, such as empirical Bayes statistics, that identify genes differentially expressed between patient groups.
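As a rough illustration of what a two-group differential expression comparison computes, the following minimal Python sketch ranks genes by a plain Welch t-statistic and a log2 fold change. It is a simplified stand-in, not the limma empirical Bayes moderated statistic that GEO2R applies, and the gene names, sample identifiers, expression values, and function names are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two lists of log2 expression values.

    Assumes each group has at least two samples and that the pooled
    standard error is non-zero.
    """
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def rank_genes(expression, tnbc_samples, other_samples):
    """Return (gene, log2 fold change, t) tuples sorted by |t|, largest first."""
    results = []
    for gene, values in expression.items():
        tnbc = [values[s] for s in tnbc_samples]
        other = [values[s] for s in other_samples]
        # On log2-scale data the fold change is a difference of group means.
        log_fc = mean(tnbc) - mean(other)
        results.append((gene, log_fc, welch_t(tnbc, other)))
    return sorted(results, key=lambda r: abs(r[2]), reverse=True)


# Hypothetical toy data: log2 expression per gene and sample.
expression = {
    "GENE_A": {"s1": 8.0, "s2": 8.4, "s3": 8.2, "s4": 5.0, "s5": 5.3, "s6": 5.1},
    "GENE_B": {"s1": 6.0, "s2": 6.2, "s3": 5.9, "s4": 6.1, "s5": 6.0, "s6": 6.05},
}
ranked = rank_genes(expression, ["s1", "s2", "s3"], ["s4", "s5", "s6"])
for gene, log_fc, t in ranked:
    print(f"{gene}: log2FC={log_fc:.2f}, t={t:.2f}")
```

limma additionally shrinks each gene's variance estimate toward a value pooled across all genes, which stabilizes the statistic when groups are small; that step is deliberately omitted here.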
The inclusion criteria for the datasets were: human sample sources; expression profiling by microarray as the data type; and a breast cancer cohort that included patients with TNBC. A total of nine datasets (n = 1027; TNBC n = 207) were used for analysis (Table 1). The patients in each dataset were divided into two groups: a TNBC group and a non-TNBC group. Figure 1 shows a simplified flowchart of the re-analysis process.
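The selection and grouping steps can be sketched in Python as follows. This is an illustrative sketch only: the `Dataset` record, the accession placeholders, the metadata strings, and the subtype labels are assumptions made for the example and are not taken from the datasets in Table 1.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    accession: str   # hypothetical placeholder, not a real GEO accession
    organism: str
    data_type: str
    subtypes: list   # one subtype label per patient sample


def meets_inclusion_criteria(ds):
    """Human samples, microarray expression profiling, TNBC patients included."""
    has_tnbc = "TNBC" in ds.subtypes
    # Assumption: a two-group comparison also needs at least one non-TNBC sample.
    has_non_tnbc = any(s != "TNBC" for s in ds.subtypes)
    return (
        ds.organism == "Homo sapiens"
        and ds.data_type == "Expression profiling by array"
        and has_tnbc
        and has_non_tnbc
    )


def split_groups(ds):
    """Return sample indices for the TNBC group and the non-TNBC group."""
    tnbc = [i for i, s in enumerate(ds.subtypes) if s == "TNBC"]
    non_tnbc = [i for i, s in enumerate(ds.subtypes) if s != "TNBC"]
    return tnbc, non_tnbc


# Hypothetical candidate datasets.
datasets = [
    Dataset("DS_1", "Homo sapiens", "Expression profiling by array",
            ["TNBC", "Luminal A", "TNBC", "HER2+"]),
    Dataset("DS_2", "Mus musculus", "Expression profiling by array",
            ["TNBC", "Luminal A"]),
    Dataset("DS_3", "Homo sapiens", "Expression profiling by array",
            ["Luminal A", "Luminal B"]),
]

selected = [ds for ds in datasets if meets_inclusion_criteria(ds)]
groups = {ds.accession: split_groups(ds) for ds in selected}
print(groups)
```

In this toy input only `DS_1` passes all three criteria; `DS_2` is excluded for its non-human source and `DS_3` for containing no TNBC samples.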