In the current study, DNA methylation profiles (Illumina Human Methylation 450K BeadChip array) and gene expression data (Illumina HiSeq RNA-Seq V2) were downloaded from The Cancer Genome Atlas (TCGA) database. The methylation dataset comprised 276 KIRP and 45 control specimens, and the gene expression dataset comprised 289 cases and 32 controls. Both datasets include clinical data: survival time, status, gender, age, and clinical stage. The clinical information for the methylation dataset is shown in Table 1. An additional DNA methylation dataset was retrieved from the Gene Expression Omnibus (GEO) database (GSE126441, n = 14) and used to validate the methylation levels of the signature genes. To improve data quality, we preprocessed both datasets by removing methylation sites for which 70% of the methylation values were NA and genes with missing expression values in >30% of patients. Genes with RPKM expression values of 0 in all samples were also excluded [30]. The workflow used to select the DNA methylation site signature is shown in Figure 1.
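The filtering rules described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy site and gene identifiers, and the use of `None` for NA values are all assumptions made for illustration. The text does not state whether the 70% cut-off is inclusive, so the sketch assumes that a site is dropped when 70% or more of its values are missing.

```python
from typing import Dict, List, Optional

Row = List[Optional[float]]  # one value per sample; None stands for NA


def missing_fraction(values: Row) -> float:
    """Return the fraction of entries that are missing (None)."""
    if not values:
        return 1.0
    return sum(v is None for v in values) / len(values)


def filter_methylation_sites(beta: Dict[str, Row],
                             max_missing: float = 0.70) -> Dict[str, Row]:
    """Keep sites whose missing fraction is below max_missing.

    Assumption: the threshold is inclusive, so a site with exactly
    70% NA values is removed.
    """
    return {site: vals for site, vals in beta.items()
            if missing_fraction(vals) < max_missing}


def filter_expression_genes(expr: Dict[str, Row],
                            max_missing: float = 0.30) -> Dict[str, Row]:
    """Drop genes missing in >30% of samples or with expression 0 in all samples."""
    kept = {}
    for gene, vals in expr.items():
        if missing_fraction(vals) > max_missing:
            continue  # too many missing values
        observed = [v for v in vals if v is not None]
        if all(v == 0 for v in observed):
            continue  # expression value of 0 in every sample
        kept[gene] = vals
    return kept


# Hypothetical toy data: 10 samples per site or gene.
beta = {
    "cg_a": [0.1, 0.2, None, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.3],  # 10% NA
    "cg_b": [None] * 7 + [0.2, 0.3, 0.4],                         # 70% NA
    "cg_c": [None] * 6 + [0.2, 0.3, 0.4, 0.5],                    # 60% NA
}
expr = {
    "gene_a": [1.0, 2.0, 0.0, 3.0, 4.0, 5.0, 1.5, 2.5, 0.5, 1.0],  # complete
    "gene_b": [None] * 4 + [1.0] * 6,                               # 40% NA
    "gene_c": [0.0] * 10,                                           # all zero
    "gene_d": [None] * 3 + [2.0] * 7,                               # 30% NA
}

kept_sites = sorted(filter_methylation_sites(beta))
kept_genes = sorted(filter_expression_genes(expr))
print(kept_sites)
print(kept_genes)
```

In this toy example, `cg_b` is removed by the methylation filter, while `gene_b` (too many missing values) and `gene_c` (zero in all samples) are removed by the expression filter. On real matrices the same rules would more commonly be applied with a data-frame library, but the thresholds work the same way.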