With a single genetic variant j, the causal effect of the exposure on the outcome can be estimated using the ratio method (or Wald method17) as the coefficient from the regression of the outcome on the genetic variant (denoted by $\hat{\Gamma}_j$) divided by the coefficient from the regression of the exposure on the variant (denoted by $\hat{\gamma}_j$).18 The reduced-form equation relating the outcome to genetic variant j can be written as:
$$Y_i = \Gamma_j G_{ij} + \epsilon^{Y}_{ij} = (\alpha_j + \beta \gamma_j) G_{ij} + \epsilon^{Y}_{ij}.$$
If the genetic variant is a valid IV, then $\alpha_j = 0$ and the ratio method estimand (the quantity that is being estimated, denoted by $\beta_j$) is $\Gamma_j / \gamma_j = \beta \gamma_j / \gamma_j = \beta$.
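The ratio method can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the sample size, allele frequency, effect sizes and the helper names `ols_slope` and `ratio_estimate` are hypothetical and do not come from the text. It generates a valid instrument ($\alpha_j = 0$) and a confounder of the exposure and outcome, then divides the variant-outcome coefficient by the variant-exposure coefficient.

```python
import random


def ols_slope(x, y):
    """Slope from a simple linear regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def ratio_estimate(g, x, y):
    """Ratio (Wald) estimate: Gamma_hat / gamma_hat for one variant."""
    gamma_hat = ols_slope(g, x)  # variant-exposure association
    Gamma_hat = ols_slope(g, y)  # variant-outcome association
    return Gamma_hat / gamma_hat


# Assumed (hypothetical) simulation settings.
rng = random.Random(1)
n = 20000
beta_true = 0.3   # causal effect of exposure on outcome
gamma_true = 0.5  # effect of the variant on the exposure

g, x, y = [], [], []
for _ in range(n):
    gi = sum(rng.random() < 0.3 for _ in range(2))  # allele count 0, 1 or 2
    u = rng.gauss(0, 1)                             # unmeasured confounder
    xi = gamma_true * gi + u + rng.gauss(0, 1)
    yi = beta_true * xi + u + rng.gauss(0, 1)       # no direct effect of g
    g.append(gi)
    x.append(xi)
    y.append(yi)

beta_ratio = ratio_estimate(g, x, y)
beta_naive = ols_slope(x, y)  # confounded regression of outcome on exposure

print(f"ratio estimate: {beta_ratio:.3f}")
print(f"naive estimate: {beta_naive:.3f}")
```

In this setup the naive regression of the outcome on the exposure is distorted by the confounder, whereas the ratio estimate should lie close to the assumed causal effect of 0.3, up to sampling error.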
With multiple genetic variants, the causal effect of the exposure on the outcome can be estimated using the TSLS method.19 The TSLS estimate is a weighted average of the ratio estimates calculated using each genetic variant in turn.20 If the genetic variants are uncorrelated (in linkage equilibrium), then the causal effect can be estimated from summarized data on the genetic associations with the exposure and with the outcome as:21
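$$\frac{\sum_{j=1}^{J} \hat{\gamma}_j^{2} \sigma_{Yj}^{-2} \hat{\beta}_j}{\sum_{j=1}^{J} \hat{\gamma}_j^{2} \sigma_{Yj}^{-2}},$$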
where $\hat{\beta}_j = \hat{\Gamma}_j / \hat{\gamma}_j$ is the ratio method estimate for variant j, and $\sigma_{Yj}$ is the standard error in the regression of the outcome on the jth genetic variant, assumed to be known. This same weighted average formula is used in a fixed-effect meta-analysis, where the IV-specific causal estimates $\hat{\beta}_j$ are the study-specific estimates, and the weights are the inverse-variance weights.22 This summarized estimate, which we refer to as an inverse-variance weighted (IVW) estimate, will differ slightly from the TSLS estimate in finite samples, as the sample correlation between independent genetic variants will not exactly equal zero,23 but the two estimates will be equal asymptotically (that is, they both tend towards the same quantity as their sample sizes increase towards infinity). However, an advantage of the IVW estimate is that it can be calculated from summarized data, whereas the TSLS estimate requires individual-level data. We assume for the remainder of the manuscript that the genetic variants are uncorrelated in their distributions (that is, knowledge of one does not help to predict the value of any other), as typically in Mendelian randomization one variant is taken from each gene region. Distantly located variants are usually uncorrelated; correlations between variants that are physically close can be found using an online tool such as http://www.broadinstitute.org/mpg/snap/ldsearchpw.php.
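As a minimal sketch of how the IVW estimate might be computed from summarized data, the function below takes per-variant associations with the exposure, associations with the outcome, and outcome standard errors, and forms the weighted average of ratio estimates with weights $\hat{\gamma}_j^2 \sigma_{Yj}^{-2}$. The function name `ivw_estimate` and the numerical inputs are hypothetical, chosen only for illustration, and the variants are assumed to be uncorrelated.

```python
def ivw_estimate(gamma_hat, Gamma_hat, se_outcome):
    """Inverse-variance weighted average of per-variant ratio estimates.

    gamma_hat  : variant-exposure association estimates
    Gamma_hat  : variant-outcome association estimates
    se_outcome : standard errors of the variant-outcome associations
    """
    if not (len(gamma_hat) == len(Gamma_hat) == len(se_outcome)):
        raise ValueError("all inputs must have one entry per variant")
    # Weight for variant j: gamma_hat_j^2 / sigma_Yj^2.
    weights = [g ** 2 / s ** 2 for g, s in zip(gamma_hat, se_outcome)]
    # Ratio estimate for variant j: Gamma_hat_j / gamma_hat_j.
    ratios = [G / g for G, g in zip(Gamma_hat, gamma_hat)]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


# Hypothetical summarized data for three uncorrelated variants.
gamma_hat = [0.10, 0.20, 0.15]
Gamma_hat = [0.030, 0.060, 0.045]
se_outcome = [0.010, 0.020, 0.015]

beta_ivw = ivw_estimate(gamma_hat, Gamma_hat, se_outcome)
print(f"IVW estimate: {beta_ivw:.3f}")
```

Because every variant in this toy input has the same ratio estimate, the weighted average coincides with that common value; with heterogeneous ratio estimates, variants with larger $\hat{\gamma}_j^2 \sigma_{Yj}^{-2}$ dominate the average.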
If genetic variant j is not a valid IV, in particular because it has a direct effect on the outcome ($\alpha_j \neq 0$), then we have $\beta_j = \beta + \alpha_j / \gamma_j$. The ratio estimate based on genetic variant j in an infinite sample will equal the true causal effect $\beta$ plus an error term $\alpha_j / \gamma_j$. In the same way, the TSLS and IVW estimates will tend towards:
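$$\beta + \frac{\sum_{j=1}^{J} \gamma_j \sigma_{Yj}^{-2} \alpha_j}{\sum_{j=1}^{J} \gamma_j^{2} \sigma_{Yj}^{-2}} = \beta + \mathrm{Bias}(\alpha, \gamma).$$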
This implies that the TSLS estimate is consistent when the assumption IV3 is true and all the $\alpha_j$ parameters are zero. It is also consistent if the pleiotropic effects happen to cancel out, such that the bias term is equal to zero.24 Although this will not be universally plausible, we explore the condition that the correlation between the genetic associations with the exposure (the $\gamma_j$ parameters) and the direct effects of the genetic variants on the outcome (the $\alpha_j$ parameters) is zero. We refer to the condition that the distributions of these parameters are independent as InSIDE (Instrument Strength Independent of Direct Effect). It can be viewed as a weaker version of the exclusion restriction assumption. This relaxation of the IV assumptions was recently investigated by Kolesár et al.,25 although their work differs from ours and is not presented within the context of Mendelian randomization.
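The asymptotic bias term can be evaluated directly for chosen parameter values. The sketch below is an illustration only: the function name `ivw_bias` and all numerical values are hypothetical. It computes $\sum_j \gamma_j \sigma_{Yj}^{-2} \alpha_j / \sum_j \gamma_j^2 \sigma_{Yj}^{-2}$ for three cases: no direct effects, direct effects that happen to cancel, and direct effects that all act in the same direction.

```python
def ivw_bias(alpha, gamma, se_outcome):
    """Asymptotic bias of the TSLS/IVW estimate under direct effects alpha.

    alpha      : direct effects of the variants on the outcome
    gamma      : variant-exposure associations
    se_outcome : standard errors of the variant-outcome associations
    """
    numerator = sum(g * a / s ** 2 for a, g, s in zip(alpha, gamma, se_outcome))
    denominator = sum(g ** 2 / s ** 2 for g, s in zip(gamma, se_outcome))
    return numerator / denominator


# Hypothetical parameter values, equal standard errors for simplicity.
se = [1.0, 1.0]

# Case 1: all variants are valid IVs (every alpha_j is zero).
bias_valid = ivw_bias([0.0, 0.0], [0.1, 0.2], se)

# Case 2: pleiotropic effects that happen to cancel in the numerator.
bias_cancel = ivw_bias([0.02, -0.02], [0.1, 0.1], se)

# Case 3: direct effects in the same direction do not cancel.
bias_directional = ivw_bias([0.01, 0.01], [0.1, 0.2], se)

print(bias_valid, bias_cancel, bias_directional)
```

In the first two cases the bias term is zero, so the estimate would tend towards $\beta$; in the third case the numerator is positive and the estimate would tend towards a value larger than $\beta$.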