We performed co-localization analysis on QTLs in the eQTL Catalogue against GWAS summary statistics from 14 studies downloaded from the IEU OpenGWAS database in VCF format [98,99]. Our analysis included summary statistics for inflammatory bowel disease (IBD) and its two subtypes (Crohn’s disease (CD) and ulcerative colitis (UC)) [100], rheumatoid arthritis (RA) [101], systemic lupus erythematosus (SLE) [102], type 2 diabetes (T2D) [103], coronary artery disease (CAD) [104], LDL-cholesterol [105], four blood cell type traits (lymphocyte count (LC), monocyte count (MC), platelet count (PLT), mean platelet volume (MPV)) [34] and two anthropometric traits (height, body mass index (BMI)) from the UK Biobank [105]. The variant coordinates of the GWAS summary statistics were lifted to the GRCh38 reference genome using CrossMap [57]. We used v.3.1 of the coloc R package [106]. All analysis steps are implemented in v.21.01.1 of the eQTL Catalogue/co-localization workflow (see URLs).
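To make the quantity reported by the co-localization test concrete, the following is a minimal Python sketch of the approximate Bayes factor calculation on which the coloc method is based. It is an illustration only, not the coloc R package code or the workflow used in this study. The function names are hypothetical, and the prior effect-size standard deviation (0.15) and the prior probabilities (p1 = p2 = 1e-4, p12 = 1e-5) are assumed values chosen for the example.

```python
import math


def log_sum(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_diff(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    ratio = math.exp(b - a)
    if ratio >= 1.0:
        return float("-inf")
    return a + math.log1p(-ratio)


def wakefield_labf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor for one variant (prior_sd is an assumed value)."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z ** 2)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [PP0, PP1, PP2, PP3, PP4] for two traits.

    labf1 and labf2 hold per-variant log Bayes factors for the same
    variants in the same order. The priors are assumptions for this sketch.
    """
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = log_sum(labf1), log_sum(labf2), log_sum(both)
    log_h = [
        0.0,                                                   # H0: no association
        math.log(p1) + s1,                                     # H1: trait 1 only
        math.log(p2) + s2,                                     # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_diff(s1 + s2, s12),  # H3: distinct variants
        math.log(p12) + s12,                                   # H4: shared variant
    ]
    total = log_sum(log_h)
    return [math.exp(h - total) for h in log_h]


# Toy example: both traits have a strong signal at the third of five variants.
eqtl = [wakefield_labf(b, 0.05) for b in (0.0, 0.0, 0.3, 0.0, 0.0)]
gwas = [wakefield_labf(b, 0.05) for b in (0.0, 0.0, 0.3, 0.0, 0.0)]
pp = coloc_posteriors(eqtl, gwas)
print("PP4 =", pp[4])
```

The last element of the returned list corresponds to PP4, the posterior probability that both traits share a single causal variant in the tested region.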
We used our uniformly processed GTEx summary statistics together with all the other summary statistics from eQTL Catalogue release 3.1. For all eQTL and GWAS dataset pairs, we performed co-localization in a ±200,000-bp window around each of the 62,837 fine-mapped eQTL credible set lead variants (see Statistical fine mapping above). This ensured that co-localization was also performed separately for multiple independent eQTLs of the same gene and that co-localization results were obtained in datasets in which no significant eQTL was detected for a particular gene. However, as we did not use masking or conditional analysis, many secondary eQTL co-localizations could still have been missed [18,107]. Inspired by the study by Barbeira et al. [3], we summarized strong co-localizations (posterior probability of a shared causal variant, PP4 ≥ 0.8) at the level of approximately independent LD blocks [35]. Positions of approximately independent LD blocks were obtained from Berisa and Pickrell [35] and converted to GRCh38 coordinates using CrossMap [57]. If the co-localization cis window overlapped two or more LD blocks, then the co-localizing QTL was assigned to the LD block where the QTL lead variant was located. We defined an LD block to harbor a novel co-localization signal if there was no co-localization detected within that LD block in any of the GTEx tissues. We further excluded datasets with small sample sizes (n < 150) due to their low power to detect co-localizations.
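The window definition, LD block assignment and novelty rule described above can be sketched in Python as follows. This is a hedged illustration rather than the workflow's actual code. The record fields (`chrom`, `lead_pos`, `pp4`, `study`, `n_samples`), the function names and the half-open block intervals are assumptions made for the example, as is the choice to apply the sample size filter to every dataset before blocks are compared.

```python
from bisect import bisect_right

WINDOW = 200_000        # half-width of the cis window around the lead variant
PP4_THRESHOLD = 0.8     # threshold for a strong co-localization
MIN_SAMPLE_SIZE = 150   # datasets with fewer samples are excluded


def cis_window(lead_pos, half_width=WINDOW):
    """Return (start, end) of the window, clipped at position 1.

    Clipping at the chromosome end is omitted in this sketch.
    """
    return max(1, lead_pos - half_width), lead_pos + half_width


def assign_ld_block(chrom, lead_pos, blocks_by_chrom):
    """Return the ID of the LD block containing the lead variant, or None.

    blocks_by_chrom maps a chromosome name to a list of (start, end, block_id)
    tuples sorted by start and treated as half-open intervals [start, end).
    """
    blocks = blocks_by_chrom.get(chrom, [])
    starts = [block[0] for block in blocks]
    i = bisect_right(starts, lead_pos) - 1
    if i >= 0 and blocks[i][0] <= lead_pos < blocks[i][1]:
        return blocks[i][2]
    return None


def novel_blocks(colocs, blocks_by_chrom):
    """LD blocks with a strong co-localization outside GTEx and none in GTEx."""
    gtex, other = set(), set()
    for c in colocs:
        # Assumption: the PP4 and sample size filters apply to all datasets.
        if c["pp4"] < PP4_THRESHOLD or c["n_samples"] < MIN_SAMPLE_SIZE:
            continue
        block = assign_ld_block(c["chrom"], c["lead_pos"], blocks_by_chrom)
        if block is None:
            continue
        (gtex if c["study"] == "GTEx" else other).add(block)
    return other - gtex
```

Assigning by the lead variant position alone means that each co-localizing QTL contributes to exactly one block, even when its window spans a block boundary.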
As transcript usage, exon expression and txrevise contained many more redundant phenotypes (for example, multiple exons of the same gene), we limited co-localization analysis for those molecular traits to the significant lead QTL variants in each dataset only (false discovery rate (FDR) < 0.01), using the same ±200,000-bp cis window as above. To make the co-localization signals comparable across quantification methods, we also performed co-localization analysis for gene expression using significant lead QTL variants, as we did for the other three quantification methods. We only included QTL and complex trait pairs with strong evidence of co-localization (PP4 ≥ 0.8) in our analysis and summarized the results at the level of independent LD blocks, as described above. The number of LD blocks for which we detected at least one co-localizing QTL with each quantification method was visualized using the UpSetR v.1.4.0 R package [108].
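The counts behind such a plot can be illustrated with a short Python sketch that tallies, for every combination of quantification methods, the number of LD blocks detected by exactly that combination. It is an assumption-laden illustration of UpSet-style exclusive intersections, not the UpSetR code used for the figure; the function name and the input layout are hypothetical.

```python
from collections import Counter


def upset_counts(blocks_by_method):
    """Count LD blocks per exact combination of quantification methods.

    blocks_by_method maps a method name to the set of LD block IDs in which
    that method detected at least one co-localizing QTL. The result maps a
    frozenset of method names to the number of blocks found by exactly
    those methods and no others.
    """
    membership = {}
    for method, blocks in blocks_by_method.items():
        for block in blocks:
            membership.setdefault(block, set()).add(method)
    return Counter(frozenset(methods) for methods in membership.values())


# Toy input with made-up block IDs.
example = {
    "gene expression": {"b1", "b2"},
    "exon expression": {"b2", "b3"},
    "transcript usage": {"b2"},
    "txrevise": set(),
}
for combo, n in upset_counts(example).items():
    print(sorted(combo), n)
```

Because each block is counted under a single combination, the counts sum to the total number of LD blocks with at least one co-localization.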
Kerimov N., Hayhurst J.D., Peikova K., Manning J.R., Walter P., Kolberg L., Samoviča M., Sakthivel M.P., Kuzmin I., Trevanion S.J., Burdett T., Jupp S., Parkinson H., Papatheodorou I., Yates A.D., Zerbino D.R., & Alasoo K. (2021). A compendium of uniformly processed human gene expression and splicing quantitative trait loci. Nature Genetics, 53(9), 1290-1299.