The ALS patient datasets were obtained from the Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/) and the ArrayExpress database (https://www.ebi.ac.uk/arrayexpress/experiments/E-TABM-940/). The inclusion criteria for candidate datasets were: ALS patients, human gene expression profiles, and availability of follow-up (survival) information and related clinical data. Two datasets, GSE112676 and GSE112680, whose gene expression data were generated on two microarray platforms (Illumina HumanHT-12 V3.0 and HumanHT-12 V4.0 expression BeadChip arrays), were included in our study. GSE112676 (n = 233 ALS and 508 controls) and GSE112680 (n = 164 ALS, 75 mimics, and 137 controls) together contained whole-blood gene expression data from 397 ALS patients, as described in previous studies (van Rheenen et al., 2018; Swindell et al., 2019). The clinical features of the 397 ALS patients are presented in Table 1. The cases were randomly assigned to a training cohort and a validation cohort at a ratio of 6:4 for bioinformatics analysis. The E-TABM-940 dataset (n = 57 ALS and 23 controls) from the ArrayExpress database was used as the external validation cohort (Lincecum et al., 2010). The study design is summarized in the flow chart in Figure 1.
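The 6:4 random assignment can be illustrated with a minimal sketch. This is not the authors' code: the function name `split_cohort`, the placeholder sample IDs, and the fixed seed are assumptions made for illustration, and the source does not state whether the split was stratified or which software performed it.

```python
import random


def split_cohort(sample_ids, train_fraction=0.6, seed=0):
    """Randomly partition sample IDs into training and validation lists."""
    ids = list(sample_ids)
    # Assumption: a fixed seed is used so the split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(ids)
    # Number of training samples, rounded to the nearest integer.
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Hypothetical placeholder IDs standing in for the 397 ALS cases.
sample_ids = [f"ALS_{i:03d}" for i in range(1, 398)]
train_ids, validation_ids = split_cohort(sample_ids)
print(len(train_ids), len(validation_ids))
```

With 397 IDs and this rounding rule, the sketch yields 238 training and 159 validation cases. The cohort sizes actually used in the study are not reported in this section and may differ.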