Genotyping procedures can be found in the primary reports for each cohort (summarized in Supplementary Table 3). Individual genotype data for all PGC29 samples, GERA, and iPSYCH were processed using the PGC “ricopili” pipeline (URLs) for standardized quality control, imputation, and analysis19. The cohorts from deCODE, Generation Scotland, UK Biobank, and 23andMeD were processed by the collaborating research teams using comparable procedures. SNPs and insertion-deletion polymorphisms were imputed using the 1000 Genomes Project multi-ancestry reference panel (URLs)86. More detailed information on sample QC is provided in the Supplementary Note.
Linkage disequilibrium (LD) score regression (LDSC)22,24 was used to estimate SNP-based heritability ($h^2_{SNP}$) from GWA summary statistics. Estimates of $h^2_{SNP}$ on the liability scale depend on the assumed lifetime prevalence of MDD in the population (K). We assumed K=0.15 but also evaluated a range of estimates of K, including 95% confidence intervals, to explore sensitivity (Supplementary Fig. 1). LDSC bivariate genetic correlations attributable to genome-wide SNPs (rg) were estimated across all MDD and major depression cohorts and between the full meta-analyzed cohort and other traits and disorders.
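The dependence of the liability-scale estimate on K can be illustrated with the standard observed-to-liability transformation for case-control data. This is a minimal sketch, not the study's own code. The function name, the observed-scale heritability, and the sample case fraction are hypothetical values chosen for illustration only.

```python
from statistics import NormalDist


def observed_to_liability(h2_obs: float, K: float, P: float) -> float:
    """Convert observed-scale SNP heritability to the liability scale.

    h2_obs: observed-scale estimate from a case-control study
    K: assumed lifetime prevalence in the population
    P: proportion of cases in the sample
    """
    if not (0.0 < K < 1.0 and 0.0 < P < 1.0):
        raise ValueError("K and P must lie strictly between 0 and 1")
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - K)  # liability threshold implied by K
    z = std_normal.pdf(threshold)            # normal density at that threshold
    return h2_obs * (K * (1.0 - K)) ** 2 / (z ** 2 * P * (1.0 - P))


# Hypothetical inputs, for illustration only (not values from the study).
H2_OBS = 0.10
CASE_FRACTION = 0.30

# Sensitivity of the liability-scale estimate to the assumed prevalence K.
for K in (0.10, 0.15, 0.20):
    h2_liab = observed_to_liability(H2_OBS, K, CASE_FRACTION)
    print(f"K={K:.2f}  h2_liability={h2_liab:.3f}")
```

With the observed-scale estimate and case fraction held fixed, the liability-scale value rises as the assumed K increases over this range, which is why a range of K is worth reporting.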
LDSC was also used to partition $h^2_{SNP}$ by genomic features24,46. We tested for enrichment of $h^2_{SNP}$ based on genomic annotations, partitioning $h^2_{SNP}$ proportional to the bp length represented by each annotation. We used the “baseline model”, which consists of 53 functional categories. The categories are fully described elsewhere46 and included conserved regions47, UCSC gene models (exons, introns, promoters, UTRs), and functional genomic annotations constructed using data from ENCODE87 and the Roadmap Epigenomics Consortium88. We complemented these annotations by adding introgressed regions from the Neanderthal genome in European populations89 and open chromatin regions from the brain dorsolateral prefrontal cortex. The open chromatin regions were obtained from an ATAC-seq experiment performed in 288 samples (N=135 controls, N=137 schizophrenia, N=10 bipolar, and N=6 affective disorder)90. Peaks called with MACS91 (1% FDR) were retained if their coordinates overlapped in at least two samples. The peaks were re-centered and set to a fixed width of 300bp using the DiffBind R package92. To prevent upward bias in heritability enrichment estimation, we added two categories created by expanding both the Neanderthal introgressed regions and open chromatin regions by 250bp on each side.
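The interval handling and the enrichment ratio described above can be sketched as follows. This is an illustrative approximation only: the names `Interval`, `recenter`, `expand`, and `enrichment` are hypothetical, and re-centering on the interval midpoint is an assumption made here for simplicity, not a statement of how the R package chooses the center. Enrichment is taken as the share of heritability assigned to an annotation divided by the share of the genome (or of SNPs) that the annotation covers.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def recenter(iv: Interval, width: int = 300) -> Interval:
    """Return a fixed-width interval centered on the midpoint of iv (assumed center)."""
    mid = (iv.start + iv.end) // 2
    start = max(0, mid - width // 2)
    return Interval(iv.chrom, start, start + width)


def expand(iv: Interval, flank: int = 250) -> Interval:
    """Extend an interval by `flank` bp on each side, clipping at position 0."""
    return Interval(iv.chrom, max(0, iv.start - flank), iv.end + flank)


def enrichment(prop_h2: float, prop_size: float) -> float:
    """Share of heritability divided by share of the genome covered."""
    if prop_size <= 0.0:
        raise ValueError("prop_size must be positive")
    return prop_h2 / prop_size


# Hypothetical peak, for illustration only.
peak = Interval("chr1", 1000, 1500)
core = recenter(peak)      # 300 bp window around the midpoint
flanked = expand(core)     # the same window plus 250 bp on each side
print(core, flanked)
```

A value above 1 from `enrichment` means the annotation carries more heritability than its size alone would predict. The flanking category gives the model a place to assign signal from just outside the annotation, so that it does not inflate the annotation's own estimate.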
We used LDSC to estimate rg between major depression and a range of other disorders, diseases, and human traits22. The intent of these comparisons was to evaluate the extent of shared common variant genetic architectures in order to suggest hypotheses about the fundamental genetic basis of major depression (given its extensive comorbidity with psychiatric and medical conditions and its association with anthropometric and other risk factors). Subject overlap in itself does not bias rg. These rg are mostly based on studies of independent subjects, and the estimates should be unbiased by confounding of genetic and non-genetic effects (except if there is genotype by environment correlation). When GWA studies include overlapping samples, rg remains unbiased but the intercept of the LDSC regression is an estimate of the correlation between association statistics attributable to sample overlap. These calculations were done using the internal PGC GWA library and with LD-Hub (URLs)60.
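The two quantities discussed above can be written out in a short sketch: rg as the genetic covariance scaled by the two SNP heritabilities, and the expected cross-trait intercept when samples overlap. The function names and all numeric inputs are hypothetical. The intercept expression follows the usual cross-trait LD score regression model, in which overlap contributes a term that does not depend on the LD score and therefore shifts the intercept, not the slope.

```python
import math


def genetic_correlation(cov_g: float, h2_trait1: float, h2_trait2: float) -> float:
    """rg = genetic covariance / sqrt(h2 of trait 1 * h2 of trait 2)."""
    if h2_trait1 <= 0.0 or h2_trait2 <= 0.0:
        raise ValueError("heritability estimates must be positive")
    return cov_g / math.sqrt(h2_trait1 * h2_trait2)


def expected_overlap_intercept(rho_pheno: float, n_overlap: int, n1: int, n2: int) -> float:
    """Expected cross-trait intercept: rho * Ns / sqrt(N1 * N2).

    rho_pheno: phenotypic correlation among the overlapping subjects
    n_overlap: number of subjects present in both studies
    n1, n2: total sample sizes of the two studies
    """
    return rho_pheno * n_overlap / math.sqrt(n1 * n2)


# Hypothetical values, for illustration only.
print(genetic_correlation(0.06, 0.09, 0.16))
print(expected_overlap_intercept(0.5, 1000, 10_000, 40_000))
print(expected_overlap_intercept(0.5, 0, 10_000, 40_000))  # no shared subjects
```

With no shared subjects the expected intercept is zero. Because rg is estimated from the slope, it is not affected by the overlap term.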