We tested medulloblastoma samples with the Illumina HumanMethylation450K DNA methylation array (Illumina, San Diego, CA, USA). The 450K DNA methylation array profiles that we used to determine human medulloblastoma molecular subgroup status are available from the Gene Expression Omnibus under accession number GSE93646.
To identify methylation-dependent subgroups, we did unsupervised class discovery by non-negative matrix factorisation (NMF) metagene and k-means clustering, testing all combinations of 3–10 metagenes and clusters for reproducibility using bootstrapped resampling methods (250 iterations) as described previously.7 This analysis identified metagenes (each a single score that reflects the methylation status of several CpG loci) representing the main biological effects present in the genome-wide dataset. We assessed cluster stability using the cophenetic index, a shorthand measure of the robustness of sample clustering as determined by consensus NMF (appendix p 3). We visualised clusters with t-distributed stochastic neighbour embedding (t-SNE).22 We designated samples classified with less than 80% confidence (by resampling procedures) as non-classifiable (NC; appendix pp 2–3).
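The 80% confidence rule can be illustrated with a minimal Python sketch. It assumes that confidence is the fraction of resampling iterations in which a sample received its most frequent subgroup label; the exact procedure is given in the appendix and might differ. The function name `assign_with_confidence` and the labels "A" and "B" are hypothetical.

```python
from collections import Counter


def assign_with_confidence(bootstrap_labels, threshold=0.80, nc_label="NC"):
    """Return (label, confidence) for a single sample.

    bootstrap_labels: the subgroup labels the sample received across
    resampling iterations (hypothetical input format).
    Assumption: confidence = share of iterations with the modal label.
    """
    if not bootstrap_labels:
        # No resampling evidence at all: treat as non-classifiable
        return nc_label, 0.0
    label, count = Counter(bootstrap_labels).most_common(1)[0]
    confidence = count / len(bootstrap_labels)
    if confidence < threshold:
        # Below the cutoff: the sample is reported as non-classifiable
        return nc_label, confidence
    return label, confidence


# Illustrative labels over 250 iterations (not data from the study)
stable = ["A"] * 220 + ["B"] * 30      # modal label in 88% of iterations
unstable = ["A"] * 190 + ["B"] * 60    # modal label in 76% of iterations
print(assign_with_confidence(stable))
print(assign_with_confidence(unstable))
```

In this sketch a sample sitting exactly at the threshold is still classified, because the text specifies "less than 80%" as the criterion for NC.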
We projected metagenes derived from our discovery cohort onto the validation cohort. Additionally, we combined the discovery and validation cohorts to do equivalent consensus clustering.
We assessed established medulloblastoma clinical, pathological, and molecular features as described previously.7 Briefly, we defined histopathological variants according to the WHO 2016 guidelines.13 We assigned metastatic status (M+) based on Chang's criteria (appendix p 3). Tumours were designated as R+ if their residuum after surgical excision exceeded 1·5 cm². Pathology was centrally reviewed by three experienced neuropathologists for 380 (89%) of 428 samples, and clinical data were collated from contributing centres and reviewed centrally (appendix p 3). We assessed MYC and MYCN status by fluorescence in situ hybridisation or copy-number estimates from methylation array. We assessed TP53, CTNNB1, and TERT mutation status by Sanger sequencing. We identified subgroup-specific differentially methylated CpG loci or methylated regions (DMRs) using limma or DMRcate23,24 (appendix p 3). RNA-seq expression data were generated for discovery cohort samples for which mRNA of sufficient quantity and quality was available. We identified subgroup-specific differentially expressed genes using DESeq2,25 and these genes were included in ontology enrichment analyses (appendix p 4). We identified GFI1 mutations from RNA-seq data (appendix p 4).
MBSHH mutation data were obtained from a previous study.26 Although 450K methylation data for MBSHH subgroup assignment were not available for this sequencing cohort, the tight age cutoff that we defined between the molecularly determined MBSHH-Infant and MBSHH-Child subgroups enabled us to infer subgroups for these samples (appendix p 4).26 We tested recurrent MBSHH mutations (TP53, SUFU, PTCH1, SMO, and TERT) and gene amplifications (MYCN and GLI2), identified by whole genome sequencing, for association with the age-defined MBSHH-Child or MBSHH-Infant subgroups using Fisher's exact test (appendix p 4).
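As an illustration of the association test, the following Python sketch computes a two-sided Fisher's exact p value for a 2 × 2 table of mutation status by age-defined subgroup. It is a generic implementation that sums the probabilities of all tables no more likely than the observed one, and is not necessarily the software or the sidedness used in the study. The function name and the counts in the example are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the table [[a, b], [c, d]].

    Hypothetical layout: rows = subgroup (Infant, Child),
    columns = gene status (mutated, wild type).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column margins held fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is as likely as, or less likely than, the observed
    # one; the small tolerance guards against floating-point ties
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-7)
    )


# Hypothetical counts, not data from the study:
# 2 of 20 infant tumours and 12 of 25 childhood tumours carry the mutation
p_value = fisher_exact_two_sided(2, 18, 12, 13)
print(f"p = {p_value:.4f}")
```

Exact integer binomial coefficients are used here so that the calculation stays accurate for the small cell counts typical of recurrent mutation data.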