Credible Set Analysis of Height GWAS

For Fig. 4d, e and the credible set analysis we used autosomal markers only, and filtered markers in each data source such that MAF > 0.001 (defined in the GWAS population), and Info score > 0.3 in the UK Biobank imputed data. There were 16,443,622 such markers in UK Biobank imputed data, 703,946 in the UK Biobank genotyped data, and 2,546,872 in GIANT.
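The marker filter described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Marker` record, its field names, and the example markers are hypothetical, and it assumes that markers from sources without an Info score (for example, directly genotyped markers) carry `info=None` and are exempt from the Info filter.

```python
from dataclasses import dataclass
from typing import Optional

AUTOSOMES = {str(c) for c in range(1, 23)}  # chromosomes 1..22


@dataclass
class Marker:
    # Hypothetical record layout, for illustration only.
    marker_id: str
    chrom: str
    maf: float                    # minor allele frequency in the GWAS population
    info: Optional[float] = None  # imputation Info score; None if not imputed


def passes_filters(m: Marker, maf_min: float = 0.001, info_min: float = 0.3) -> bool:
    """Keep autosomal markers with MAF > maf_min and, when present, Info > info_min."""
    if m.chrom not in AUTOSOMES:
        return False
    if not m.maf > maf_min:
        return False
    if m.info is not None and not m.info > info_min:
        return False
    return True


# Made-up markers, not taken from any study.
markers = [
    Marker("m1", "1", 0.25, 0.98),    # kept
    Marker("m2", "X", 0.30, 0.99),    # dropped: not autosomal
    Marker("m3", "7", 0.0005, 0.95),  # dropped: MAF too low
    Marker("m4", "12", 0.10, 0.20),   # dropped: Info too low
    Marker("m5", "22", 0.05),         # kept: no Info score to test
]
kept = [m.marker_id for m in markers if passes_filters(m)]
print(kept)
```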
For a given phenotype, the 95% credible set in a region of association is the smallest set of markers that together have 95% posterior probability of containing the marker causally associated with the phenotype. We found credible sets for standing height using the method described previously^{33} and summarize the results in Extended Data Fig. 6. It is important to note that this approach is based on a model in which there is exactly one causal marker in the region and genotypes for that marker are available in the data. Our results should therefore be considered as indicative of a more detailed analysis where, for example, the regions are first analysed to distinguish independent association signals.
In our analysis, we first defined a set of 575 non-overlapping regions associated with standing height using a procedure based on that used previously^{15} (see Supplementary Information). For each study, we carried out two separate analyses to find credible sets in these regions: (A) using all the markers in each study (768,502 in UK Biobank imputed data; 106,263 in GIANT); and (B) using only those markers in both studies (105,421).
For each marker in each study, we computed a Bayes factor in favour of association with standing height using the effect sizes and standard errors, and 0.2² as the prior^{33} on the variance of the effect sizes. To ensure the effect sizes were on the same scale in both studies we scaled UK Biobank effect sizes and standard errors by the standard deviation of the residuals of the measured phenotype (standing height) after regressing out the covariates used in the GWAS. We then confirmed that the effect size estimates for overlapping markers were comparable between the two studies.
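As an illustration of how a Bayes factor can be obtained from an effect size, its standard error, and a prior variance, the sketch below uses Wakefield's approximate Bayes factor. That choice is an assumption: the text only cites the earlier method and does not spell out the formula. The `standardise` helper also assumes that "scaled by" means division by the residual standard deviation, and all numeric inputs are made up.

```python
import math

PRIOR_SD = 0.2  # prior variance on effect sizes is 0.2**2, as stated in the text


def standardise(beta, se, resid_sd):
    """Rescale an effect size and its standard error to residual-SD units.

    Assumption: scaling means dividing both quantities by resid_sd.
    """
    return beta / resid_sd, se / resid_sd


def log_abf(beta, se, prior_sd=PRIOR_SD):
    """Natural log of Wakefield's approximate Bayes factor in favour of association.

    ABF = sqrt(V / (V + W)) * exp(z**2 * W / (2 * (V + W)))
    with V = se**2, W = prior_sd**2 and z = beta / se.
    Working on the log scale avoids overflow for very large z.
    """
    v = se ** 2
    w = prior_sd ** 2
    z = beta / se
    return 0.5 * math.log(v / (v + w)) + 0.5 * z * z * w / (v + w)


# Hypothetical per-allele effect in cm, its standard error, and a residual SD in cm.
beta, se = standardise(0.30, 0.05, 6.0)
print(log_abf(beta, se))
```

Keeping the Bayes factors on the log scale is a design choice for numerical stability; strongly associated markers can otherwise exceed floating-point range.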
If there is exactly one causal marker in the region and genotypes for that marker are available in the data, then the posterior probability that a marker i drives the association signal in the region r is given by:

\pi_{ir} = \frac{\mathrm{BF}_{ir}}{\sum_{k} \mathrm{BF}_{kr}}

where BF_ir is the Bayes factor for marker i in region r^{33}, and the sum in the denominator runs over all markers k in that region. The 95% credible set for a region is found by going down the list of markers ordered from highest to lowest posterior probability and stopping once the cumulative posterior probability reaches or exceeds 0.95.
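The posterior formula and the credible set rule can be sketched in a few lines. This is a minimal illustration under the single-causal-marker model described above; the function names and the marker labels are hypothetical, and the Bayes factors are made up. It takes log Bayes factors as input and subtracts the maximum before exponentiating, which leaves the normalised posteriors unchanged.

```python
import math


def posteriors(log_bfs):
    """Map {marker: log BF} to {marker: posterior} for a single region."""
    top = max(log_bfs.values())
    weights = {m: math.exp(v - top) for m, v in log_bfs.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


def credible_set(post, level=0.95):
    """Add markers in order of decreasing posterior until the cumulative sum reaches level."""
    chosen, cumulative = [], 0.0
    for marker, p in sorted(post.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(marker)
        cumulative += p
        if cumulative >= level:
            break
    return chosen


# Made-up Bayes factors 90, 6, 3 and 1 give posteriors 0.90, 0.06, 0.03 and 0.01.
log_bfs = {"rsA": math.log(90), "rsB": math.log(6), "rsC": math.log(3), "rsD": math.log(1)}
post = posteriors(log_bfs)
print(credible_set(post))  # ['rsA', 'rsB']
```

Here the top marker alone has posterior 0.90, so a second marker is needed to reach 0.95.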
We assessed the sensitivity of our results to the choice of prior by conducting the same analyses using a much smaller prior (0.02²) and a much larger prior (20²). We found that overall the choice of prior had little effect on the results. Specifically, for the values we report in the main text, the median credible set sizes were unaffected in all analyses. For the larger prior, the number of single-marker credible sets was unaffected except for analysis B in UK Biobank (from 123 to 122), and the median proportion of markers in the credible set was unaffected in all analyses. For the smaller prior, the number of single-marker credible sets changed only for analysis A, going from 78 to 75 in GIANT and from 85 to 86 in UK Biobank, and the median proportion of markers in the credible set increased slightly in all analyses (maximum increase from 0.047 to 0.051).
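A prior sensitivity check of this kind can be sketched by recomputing the credible set size for one region under each of the three prior standard deviations. As before, the use of Wakefield's approximate Bayes factor is an assumption rather than something the text confirms, and the (effect size, standard error) pairs in `region` are made up for illustration.

```python
import math


def log_abf(beta, se, prior_sd):
    """Log of Wakefield's approximate Bayes factor (assumed form, see lead-in)."""
    v, w = se ** 2, prior_sd ** 2
    return 0.5 * math.log(v / (v + w)) + 0.5 * (beta / se) ** 2 * w / (v + w)


def credible_set_size(stats, prior_sd, level=0.95):
    """Number of markers in the credible set for one region."""
    log_bfs = [log_abf(b, s, prior_sd) for b, s in stats]
    top = max(log_bfs)
    weights = sorted((math.exp(x - top) for x in log_bfs), reverse=True)
    total = sum(weights)
    cumulative = 0.0
    for n, w in enumerate(weights, start=1):
        cumulative += w / total
        if cumulative >= level:
            return n
    return len(weights)  # guard against rounding when level is very close to 1


# Hypothetical (effect size, standard error) pairs for the markers in one region.
region = [(0.060, 0.010), (0.055, 0.010), (0.030, 0.012), (0.005, 0.010)]

# Prior SDs 0.02, 0.2 and 20 correspond to prior variances 0.02², 0.2² and 20².
sizes = {sd: credible_set_size(region, sd) for sd in (0.02, 0.2, 20.0)}
print(sizes)
```

Repeating this over all regions and comparing summary statistics, such as the median set size, would mirror the comparison reported above.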

Bycroft C., Freeman C., Petkova D., Band G., Elliott L.T., Sharp K., Motyer A., Vukcevic D., Delaneau O., O’Connell J., Cortes A., Welsh S., Young A., Effingham M., McVean G., Leslie S., Allen N., Donnelly P., & Marchini J. (2018). The UK Biobank resource with deep phenotyping and genomic data. Nature, 562(7726), 203-209.

Publication 2018

Other organizations : Wellcome Centre for Human Genetics, University of Oxford, University of Melbourne, University of Geneva, Illumina (United Kingdom), UK Biobank
