Inflammatory bowel disease subtypes. As base sample, we used publicly available summary statistics from a case/control inflammatory bowel disease GWAS [40]. The SNP effect sizes from this GWAS were used to calculate pathway and genome-wide PRS for each individual in the target sample, which comprised UK Biobank participants diagnosed with Crohn’s disease or ulcerative colitis. The target sample phenotype was encoded as individuals with Crohn’s disease vs individuals with ulcerative colitis.
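As a rough illustration of how base-sample effect sizes translate into a score, the sketch below computes a PRS as the sum of effect size times allele dosage, either over all shared SNPs (genome-wide) or over a pathway's SNPs only. It is a simplified sketch, not the pipeline used here: it assumes no clumping, p-value thresholding or standardisation, and the function name, SNP identifiers and pathway set are hypothetical.

```python
def polygenic_score(dosages, effect_sizes, snp_subset=None):
    """Sum of effect size x effect-allele dosage for one individual.

    dosages      -- dict: SNP id -> effect-allele dosage (0 to 2) in the target sample
    effect_sizes -- dict: SNP id -> effect size from the base GWAS
    snp_subset   -- optional set of SNP ids (e.g. SNPs mapped to one pathway);
                    None means all shared SNPs (genome-wide score)

    Assumption: alleles are already aligned between base and target samples.
    """
    snps = set(dosages) & set(effect_sizes)
    if snp_subset is not None:
        snps &= set(snp_subset)
    return sum(effect_sizes[snp] * dosages[snp] for snp in snps)


# Hypothetical toy data, for illustration only.
toy_dosages = {"rs1": 2, "rs2": 1, "rs3": 0}
toy_effects = {"rs1": 0.1, "rs2": -0.2, "rs3": 0.5}
toy_pathway = {"rs1", "rs3"}

genome_wide_prs = polygenic_score(toy_dosages, toy_effects)
pathway_prs = polygenic_score(toy_dosages, toy_effects, toy_pathway)
```

Restricting the sum to a SNP subset is the only difference between the pathway and genome-wide scores in this sketch.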
Bipolar disorder subtypes. We obtained access to individual-level genotype data from 55 bipolar disorder cohorts collected by the PGC Bipolar Disorder Working group (Table K in S1 Tables). Quality control, imputation and harmonisation were performed on these data as previously described [41]. Of the 55 cohorts, we selected 34 as base sample and meta-analysed the per-cohort case/control GWAS results using the software METAL (2011-03-25) [58] with the sample-size weighted fixed-effects algorithm. We used the remaining 21 cohorts as target sample and calculated pathway and genome-wide PRS for each individual with bipolar disorder. The target sample phenotype was encoded as individuals with bipolar disorder I vs bipolar disorder II.
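For intuition, a sample-size weighted meta-analysis combines per-cohort z-scores for a SNP using weights proportional to the square root of each cohort's sample size. The sketch below shows that calculation for a single SNP. It is a minimal illustration and not METAL itself: the function names are hypothetical, z-scores are assumed to be aligned to the same effect allele, and any use of effective sample sizes for case/control cohorts is left to the caller.

```python
import math


def sample_size_weighted_z(z_scores, sample_sizes):
    """Combine per-cohort z-scores for one SNP with sqrt(N) weights.

    Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2)), with w_i = sqrt(N_i).
    Assumption: all z-scores refer to the same effect allele.
    """
    if not z_scores or len(z_scores) != len(sample_sizes):
        raise ValueError("need one sample size per z-score, and at least one cohort")
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


# Two hypothetical cohorts of equal size with the same signal.
combined_z = sample_size_weighted_z([2.0, 2.0], [100, 100])
combined_p = two_sided_p(combined_z)
```

With equal sample sizes the combined z-score is the per-cohort z-score multiplied by the square root of the number of cohorts, which is why pooling concordant cohorts strengthens the association signal.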
Pseudo subtypes of paired major diseases. We obtained previously published GWAS summary statistics for four major diseases: type 2 diabetes, coronary artery disease, obesity (defined as body mass index > 30) and hypercholesterolemia (defined as low-density lipoproteins > 4.9 mmol/L), and performed a meta-analysis for each pair of traits. Meta-analyses were performed using METAL [58] with the sample-size weighted fixed-effects algorithm. To mimic a composite-phenotype GWAS, only variants present in both sets of GWAS summary statistics were retained. The resulting meta-analysis summary statistics were used as base sample. As target sample, we generated composite phenotypes by combining UK Biobank cases of the two paired phenotypes. To calculate the PRS, target sample phenotypes were encoded to mimic sub-phenotypes of a given disease. For example, for the phenotype coronary artery disease-obesity, samples with coronary artery disease (and not obesity) were coded as 0 and those with obesity (and not coronary artery disease) were coded as 1 (Tables L-N in S1 Tables).
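The two data-preparation steps described above, restricting to variants shared by both GWAS and coding the pseudo-subtype phenotype, can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names and toy inputs are hypothetical, and individuals with both or neither disease are assumed to be left out of the target sample, which the text implies but does not state explicitly.

```python
from typing import Optional


def shared_variants(sumstats_first: dict, sumstats_second: dict) -> set:
    """Variant ids present in both sets of GWAS summary statistics."""
    return set(sumstats_first) & set(sumstats_second)


def encode_pseudo_subtype(has_first: bool, has_second: bool) -> Optional[int]:
    """Code the first disease only as 0 and the second disease only as 1.

    Assumption: individuals with both or neither disease return None
    and are dropped from the target sample.
    """
    if has_first and not has_second:
        return 0
    if has_second and not has_first:
        return 1
    return None


# Hypothetical example for the pair coronary artery disease (first) and obesity (second).
kept = shared_variants({"rs1": 0.1, "rs2": 0.3}, {"rs2": -0.2, "rs3": 0.4})
cad_only = encode_pseudo_subtype(True, False)
obesity_only = encode_pseudo_subtype(False, True)
both = encode_pseudo_subtype(True, True)
```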
Comorbid subtypes of major diseases. For the analysis of subtypes defined by the presence/absence of comorbid diseases, we used type 2 diabetes, coronary artery disease, obesity, hypertension and hypercholesterolemia, as these diseases are highly comorbid with one another (Tables L-N in S1 Tables). As base sample, we used publicly available GWAS summary statistics for one of the diseases (e.g. type 2 diabetes). As target sample phenotypes, we defined subtypes of that disease by the presence/absence of each of the other disorders (e.g. type 2 diabetes with obesity vs type 2 diabetes without obesity).
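A comorbid subtype of this kind can be derived from each participant's set of diagnoses, as in the minimal sketch below. The function name and diagnosis labels are hypothetical, and coding the comorbid group as 1 and the non-comorbid group as 0 is an assumption, since the text does not state which group receives which code.

```python
def comorbid_subtype(diagnoses, primary, comorbid):
    """Subtype code among cases of the primary disease.

    Returns 1 if the comorbid disease is also present, 0 if it is absent,
    and None for individuals without the primary disease (not in the target sample).
    Assumption: the 1/0 direction is illustrative only.
    """
    if primary not in diagnoses:
        return None
    return int(comorbid in diagnoses)


# Hypothetical participants, for illustration only.
t2d_with_obesity = comorbid_subtype({"type 2 diabetes", "obesity"}, "type 2 diabetes", "obesity")
t2d_without_obesity = comorbid_subtype({"type 2 diabetes", "hypertension"}, "type 2 diabetes", "obesity")
not_a_case = comorbid_subtype({"obesity"}, "type 2 diabetes", "obesity")
```

Unlike the pseudo-subtype coding, everyone in the target sample here is a case of the primary disease, so the contrast is within a single disease rather than between two.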