We profiled medulloblastoma samples with the Illumina HumanMethylation450K DNA methylation array (Illumina, San Diego, CA, USA). The 450K DNA methylation array profiles that we used to determine human medulloblastoma molecular subgroup status are available from the Gene Expression Omnibus (accession number GSE93646).
To identify methylation-dependent subgroups, we did unsupervised class discovery by NMF-metagene and k-means clustering, testing all combinations of 3–10 metagenes and clusters for reproducibility using bootstrapped resampling methods (250 iterations) as described previously.7 This analysis identified metagenes (each a single score that reflects the methylation status of several CpG loci) representing the main biological effects present in the genome-wide dataset. We assessed cluster stability using the cophenetic index, a shorthand measure of the robustness of sample clustering as determined by consensus non-negative matrix factorisation (appendix p 3). We visualised clusters with t-SNE.22 We designated samples classified with less than 80% confidence (by resampling procedures) as non-classifiable (NC; appendix pp 2–3).
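The 80% confidence rule can be illustrated with a minimal sketch. It assumes that confidence is the fraction of resampling iterations in which a sample receives its most frequent subgroup label; the exact procedure is described in the appendix and may differ. The function name `call_subgroup` and the labels used below are hypothetical.

```python
from collections import Counter


def call_subgroup(bootstrap_labels, min_confidence=0.80):
    """Return (call, confidence) for one sample.

    bootstrap_labels: the subgroup label assigned to the sample in each
    resampling iteration (hypothetical input format).
    Assumption: confidence = share of iterations giving the modal label.
    """
    if not bootstrap_labels:
        raise ValueError("at least one resampling label is required")
    counts = Counter(bootstrap_labels)
    label, n_modal = counts.most_common(1)[0]
    confidence = n_modal / len(bootstrap_labels)
    # Samples below the threshold are reported as non-classifiable (NC).
    if confidence < min_confidence:
        return "NC", confidence
    return label, confidence
```

For example, a sample labelled "A" in 9 of 10 iterations would be called "A" with confidence 0·9 under this assumption, whereas a sample labelled "A" in 7 of 10 iterations would be reported as NC.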
We projected metagenes derived from our discovery cohort onto the validation cohort. Additionally, we combined the discovery and validation cohorts to do equivalent consensus clustering.
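The projection procedure is not specified here. One common approach, shown only as an assumption, applies the Moore-Penrose pseudoinverse of the discovery metagene loadings to the validation data. The symbols are illustrative: $V_{\mathrm{disc}}$ and $V_{\mathrm{val}}$ are the CpG-by-sample methylation matrices of the two cohorts, $W$ holds the CpG loadings of the discovery metagenes, and $H$ holds the per-sample metagene scores.

```latex
V_{\mathrm{disc}} \approx W H_{\mathrm{disc}}, \qquad
H_{\mathrm{val}} = W^{+} V_{\mathrm{val}}, \qquad
W^{+} = (W^{\top} W)^{-1} W^{\top}
```

The closed form for $W^{+}$ holds when $W$ has full column rank. Scores obtained this way are not guaranteed to be non-negative, so a non-negative least-squares fit is a possible alternative.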
We assessed established medulloblastoma clinical, pathological, and molecular features as described previously.7 Briefly, we defined histopathological variants according to the WHO 2016 guidelines.13 We assigned metastatic status (M+) based on Chang's criteria (appendix p 3). Tumours were designated as R+ if their residuum after surgical excision exceeded 1·5 cm². Pathology was centrally reviewed by three experienced neuropathologists for 380 (89%) of 428 samples, and clinical data were collated from contributing centres and reviewed centrally (appendix p 3). We assessed MYC and MYCN status by fluorescence in situ hybridisation or copy-number estimates from methylation array. We assessed TP53, CTNNB1, and TERT mutation status by Sanger sequencing. We identified subgroup-specific differentially methylated CpG loci or methylated regions (DMRs) using limma or DMRcate23,24 (appendix p 3). RNA-seq expression data were generated for discovery cohort samples for which mRNA of sufficient quantity and quality was available. We identified subgroup-specific differentially expressed genes using DESeq2,25 and these genes were included in ontology enrichment analyses (appendix p 4). We identified GFI1 mutations from RNA-seq data (appendix p 4).
MBSHH mutation data were obtained from a previous study.26 Although 450K methylation data for MBSHH subgroup assignment were not available for this sample cohort, the sharp age cutoff that we identified between the molecularly determined MBSHH-Infant and MBSHH-Child subgroups enabled us to infer subgroups for this sequencing cohort (appendix p 4).26 We tested recurrent MBSHH mutations (TP53, SUFU, PTCH1, SMO, and TERT) and gene amplifications (MYCN and GLI2), identified by whole genome sequencing, for association with the age-defined MBSHH-Child or MBSHH-Infant subgroups using Fisher's exact test (appendix p 4).
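A test of this kind can be sketched on a 2×2 table of mutation status by age-defined subgroup. The sketch assumes the two-sided form of Fisher's exact test, which sums the probabilities of all tables no more likely than the observed one; the source does not state which form was used. The function name is hypothetical, and the counts in the usage note are invented for illustration and are not study data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be the two age-defined subgroups and columns the
    mutated / wild-type counts (hypothetical layout).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over tables that are as likely as, or less likely than, the
    # observed table; the small tolerance guards against float ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)
```

With invented counts of 3 mutated and 0 wild-type tumours in one subgroup and 0 mutated and 3 wild-type in the other, `fisher_exact_two_sided(3, 0, 0, 3)` gives a p value of 0·1.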