Two-sample univariable MR was performed for each protein separately
using summary statistics in the inverse-variance weighted method adapted to
account for correlated variants (refs. 65, 66). For each of $G$ genetic
variants ($g = 1, \ldots, G$) having per-allele estimate of the association
with the protein $\beta_{Xg}$ and standard error $\sigma_{Xg}$, and per-allele
estimate of the association with the outcome (here, AD or CHD) $\beta_{Yg}$
and standard error $\sigma_{Yg}$, the IV estimate $\hat{\theta}_{XY}$ is
obtained from generalized weighted linear regression of the genetic
associations with the outcome ($\beta_Y$) on the genetic associations with
the protein ($\beta_X$), weighting for the precisions of the genetic
associations with the outcome and accounting for correlations between the
variants, according to the regression model:
$$\beta_Y = \theta_{XY}\,\beta_X + \varepsilon, \qquad \varepsilon \sim N(0, \Omega)$$
where $\beta_Y$ and $\beta_X$ are vectors of the univariable (marginal)
genetic associations, and the weighting matrix $\Omega$ has terms
$\Omega_{g_1 g_2} = \sigma_{Y g_1}\,\sigma_{Y g_2}\,\rho_{g_1 g_2}$, and
$\rho_{g_1 g_2}$ is the correlation between the $g_1$th and $g_2$th variants.
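The IV estimate from this method is:
$$\hat{\theta}_{XY} = (\beta_X^{T}\,\Omega^{-1}\,\beta_X)^{-1}\,\beta_X^{T}\,\Omega^{-1}\,\beta_Y$$
and the standard error is:
$$\mathrm{se}(\hat{\theta}_{XY}) = \sqrt{(\beta_X^{T}\,\Omega^{-1}\,\beta_X)^{-1}}$$
where $T$ is a matrix transpose. This is the estimate and standard error from
the regression model fixing the residual standard error to 1 (equivalent to a
fixed-effects model in a meta-analysis).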
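
As a minimal sketch of the univariable estimator described above, the following Python/NumPy function builds the weighting matrix Ω from the outcome standard errors and the variant correlation matrix, then evaluates the generalized least squares estimate and its standard error with the residual standard error fixed to 1. The function name `ivw_correlated`, its argument names, and the toy numbers are illustrative assumptions and are not taken from the original analysis code.

```python
import numpy as np


def ivw_correlated(beta_x, beta_y, se_y, rho):
    """Inverse-variance weighted MR estimate allowing for correlated variants.

    beta_x : length-G per-allele associations with the protein (risk factor)
    beta_y : length-G per-allele associations with the outcome
    se_y   : length-G standard errors of beta_y
    rho    : G x G correlation matrix between the variants
    Returns (theta_hat, se_theta_hat).
    """
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    rho = np.asarray(rho, dtype=float)

    # Omega[g1, g2] = se_y[g1] * se_y[g2] * rho[g1, g2]
    omega = np.outer(se_y, se_y) * rho

    # Omega^{-1} beta_X, obtained by solving a linear system instead of
    # forming the inverse explicitly.
    omega_inv_bx = np.linalg.solve(omega, beta_x)

    # beta_X^T Omega^{-1} beta_X (a scalar in the univariable case)
    precision = beta_x @ omega_inv_bx

    # theta_hat = (beta_X^T Omega^{-1} beta_X)^{-1} beta_X^T Omega^{-1} beta_Y
    # (Omega is symmetric, so omega_inv_bx @ beta_y equals beta_X^T Omega^{-1} beta_Y)
    theta_hat = (omega_inv_bx @ beta_y) / precision

    # Standard error with the residual standard error fixed to 1
    se_theta_hat = np.sqrt(1.0 / precision)
    return theta_hat, se_theta_hat


if __name__ == "__main__":
    # Toy, made-up summary statistics for three variants (illustration only)
    bx = [0.10, 0.20, 0.30]
    by = [0.05, 0.10, 0.15]
    sy = [0.01, 0.02, 0.03]
    ld = np.array([[1.0, 0.3, 0.0],
                   [0.3, 1.0, 0.1],
                   [0.0, 0.1, 1.0]])
    print(ivw_correlated(bx, by, sy, ld))
```

When `rho` is the identity matrix (uncorrelated variants), Ω is diagonal and the expression reduces to the standard inverse-variance weighted estimate.
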
Genetic variants in univariable MR need to satisfy three key assumptions
to be valid instruments: (1) the variant is associated with the risk factor of
interest (that is, the protein level), (2) the variant is not associated with
any confounder of the risk factor-outcome association, and (3) the variant is
conditionally independent of the outcome given the risk factor and
confounders.
To account for potential effects of functional pleiotropy (ref. 67), we
performed multivariable MR using the weighted regression-based method proposed
by Burgess et al. (ref. 68). For each of $K$ risk factors in the model
($k = 1, \ldots, K$), the weighted regression-based method is performed by
multivariable generalized weighted linear regression of the association
estimates $\beta_Y$ on the association estimates with each risk factor
$\beta_{Xk}$ in a single regression model:
$$\beta_Y = \theta_1\,\beta_{X1} + \theta_2\,\beta_{X2} + \cdots + \theta_K\,\beta_{XK} + \varepsilon, \qquad \varepsilon \sim N(0, \Omega)$$
where $\beta_{X1}$ is the vector of the univariable genetic associations with
risk factor 1, and so on. This regression
model is implemented by first pre-multiplying the association vectors by the
Cholesky decomposition of the weighting matrix, and then applying standard
linear regression to the transformed vectors. Estimates and standard errors are
obtained fixing the residual standard error to be 1 as above.
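
The Cholesky-based implementation described above can be sketched as follows. This is an illustrative reading rather than the authors' code: it assumes that "pre-multiplying by the Cholesky decomposition" means pre-multiplying by the inverse of the lower-triangular factor L of the weighting matrix (Ω = L Lᵀ), which is the transformation that gives the errors unit variance and zero correlation, so that ordinary least squares on the transformed vectors matches the generalized weighted regression. The function name `mvmr_cholesky` and its argument names are hypothetical.

```python
import numpy as np


def mvmr_cholesky(beta_x, beta_y, se_y, rho):
    """Multivariable MR by weighted regression with correlated variants.

    beta_x : G x K matrix; column k holds the associations with risk factor k
    beta_y : length-G associations with the outcome
    se_y   : length-G standard errors of beta_y
    rho    : G x G correlation matrix between the variants
    Returns (theta_hat, se_theta_hat), each of length K.
    """
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    rho = np.asarray(rho, dtype=float)

    # Weighting matrix with terms se_y[g1] * se_y[g2] * rho[g1, g2]
    omega = np.outer(se_y, se_y) * rho

    # Lower-triangular factor: omega = chol @ chol.T
    chol = np.linalg.cholesky(omega)

    # Transform both sides by chol^{-1} (solve rather than invert)
    x_star = np.linalg.solve(chol, beta_x)
    y_star = np.linalg.solve(chol, beta_y)

    # Standard linear regression without intercept on the transformed vectors
    xtx = x_star.T @ x_star
    theta_hat = np.linalg.solve(xtx, x_star.T @ y_star)

    # Standard errors with the residual standard error fixed to 1
    se_theta_hat = np.sqrt(np.diag(np.linalg.inv(xtx)))
    return theta_hat, se_theta_hat
```

Under this reading, the transformed cross-product `x_star.T @ x_star` equals $\beta_X^{T}\Omega^{-1}\beta_X$, so with a single risk factor (K = 1) the result coincides with the univariable estimate and standard error.
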
The multivariable MR analysis allows the estimation of the causal effect
of a protein on disease outcome accounting for the fact that genetic variants
may be associated with multiple proteins in the region. Causal estimates from
multivariable MR represent direct causal effects, that is, the effect of
intervening on one risk factor in the model while keeping the others constant.
Sun B.B., Maranville J.C., Peters J.E., Stacey D., Staley J.R., Blackshaw J., Burgess S., Jiang T., Paige E., Surendran P., Oliver-Williams C., Kamat M.A., Prins B.P., Wilcox S.K., Zimmerman E.S., Chi A., Bansal N., Spain S.L., Wood A.M., Morrell N.W., Bradley J.R., Janjic N., Roberts D.J., Ouwehand W.H., Todd J.A., Soranzo N., Suhre K., Paul D.S., Fox C.S., Plenge R.M., Danesh J., Runz H., & Butterworth A.S. (2018). Genomic atlas of the human plasma proteome. Nature, 558(7708), 73-79.