If the genetic variant is a valid IV, then its direct effect on the outcome is zero ($\alpha_j = 0$) and the ratio method estimand (the quantity that is being estimated, denoted by $\beta_j$) is the causal effect $\beta$.
With multiple genetic variants, the causal effect of the exposure on the outcome can be estimated using the TSLS method.19 The TSLS estimate is a weighted average of the ratio estimates calculated using each genetic variant in turn.20 If the genetic variants are uncorrelated (in linkage equilibrium), then the causal effect can be estimated from summarized data on the genetic associations with the exposure and with the outcome as:21
$$\hat{\beta}_{IVW} = \frac{\sum_j \hat{\gamma}_j^2 \sigma_{Yj}^{-2} \hat{\beta}_j}{\sum_j \hat{\gamma}_j^2 \sigma_{Yj}^{-2}}$$
where $\hat{\beta}_j$ is the ratio method estimate for variant j, $\hat{\gamma}_j$ is the estimated association of variant j with the exposure, and $\sigma_{Yj}$ is the standard error in the regression of the outcome on the jth genetic variant, assumed to be known. This same weighted average formula is used in a fixed-effect meta-analysis, where the IV-specific causal estimates are the study-specific estimates, and the weights are the inverse-variance weights.22 This summarized estimate, which we refer to as an inverse-variance weighted (IVW) estimate, will differ slightly from the TSLS estimate in finite samples, because the sample correlation between independent genetic variants will not exactly equal zero,23 but the two estimates will be equal asymptotically (that is, they both tend towards the same quantity as the sample size increases towards infinity). However, an advantage of the IVW estimate is that it can be calculated from summarized data, whereas the TSLS estimate requires individual-level data. We assume for the remainder of the manuscript that the genetic variants are uncorrelated in their distributions (that is, knowledge of one does not help to predict the value of any other), as typically in Mendelian randomization one variant is taken from each gene region. Distantly located variants are usually uncorrelated; correlations between variants that are physically close can be found using an online tool such as [
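As an illustration of how an IVW estimate might be computed from summarized data alone, the following is a minimal sketch rather than the authors' implementation. It assumes uncorrelated variants and inverse-variance weights of the form $\hat{\gamma}_j^2 \sigma_{Yj}^{-2}$; the function name `ivw_estimate` and all numerical values are hypothetical.

```python
def ivw_estimate(gamma_hat, Gamma_hat, sigma_y):
    """IVW estimate from per-variant summary statistics.

    gamma_hat: estimated variant-exposure associations
    Gamma_hat: estimated variant-outcome associations
    sigma_y:   standard errors of the variant-outcome associations
    Assumes the variants are uncorrelated (hypothetical helper).
    """
    # Ratio estimate for each variant in turn
    ratios = [G / g for g, G in zip(gamma_hat, Gamma_hat)]
    # Inverse-variance weights: gamma_j^2 / sigma_Yj^2
    weights = [g ** 2 / s ** 2 for g, s in zip(gamma_hat, sigma_y)]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


# Hypothetical summary statistics for three variants
gamma_hat = [0.10, 0.20, 0.15]
Gamma_hat = [0.05, 0.12, 0.06]
sigma_y = [0.02, 0.03, 0.025]

ratios = [G / g for g, G in zip(gamma_hat, Gamma_hat)]
estimate = ivw_estimate(gamma_hat, Gamma_hat, sigma_y)

# Algebraically equivalent form: weighted regression of the outcome
# associations on the exposure associations through the origin
slope = (
    sum(g * G / s ** 2 for g, G, s in zip(gamma_hat, Gamma_hat, sigma_y))
    / sum(g ** 2 / s ** 2 for g, s in zip(gamma_hat, sigma_y))
)
print(estimate, slope)
```

Because the estimate is a weighted average, it always lies between the smallest and largest ratio estimates, and only per-variant summary statistics are needed as input.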
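If genetic variant j is not a valid IV, in particular because it has a direct effect on the outcome ($\alpha_j \neq 0$), then we have $\Gamma_j = \beta\gamma_j + \alpha_j$, where $\Gamma_j$ and $\gamma_j$ are the associations of variant j with the outcome and with the exposure, respectively. The ratio estimate based on genetic variant j in an infinite sample will equal the true causal effect $\beta$ plus an error term $\alpha_j/\gamma_j$. In the same way, the TSLS and IVW estimates will tend towards:
$$\beta + \frac{\sum_j \gamma_j^2 \sigma_{Yj}^{-2} \, (\alpha_j/\gamma_j)}{\sum_j \gamma_j^2 \sigma_{Yj}^{-2}} = \beta + \frac{\sum_j \gamma_j \alpha_j \sigma_{Yj}^{-2}}{\sum_j \gamma_j^2 \sigma_{Yj}^{-2}} = \beta + \mathrm{Bias}(\alpha, \gamma)$$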
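The relationship between the per-variant error terms and the limit of the pooled estimate can be checked numerically. The sketch below assumes that the limit takes the form $\beta + \sum_j \gamma_j \alpha_j \sigma_{Yj}^{-2} / \sum_j \gamma_j^2 \sigma_{Yj}^{-2}$; the function names and parameter values are hypothetical and chosen only for illustration.

```python
def ratio_estimand(beta, gamma_j, alpha_j):
    """Large-sample ratio estimate for one variant: Gamma_j / gamma_j."""
    Gamma_j = beta * gamma_j + alpha_j  # variant-outcome association
    return Gamma_j / gamma_j            # equals beta + alpha_j / gamma_j


def ivw_limit(beta, gamma, alpha, sigma_y):
    """Assumed large-sample limit of the IVW estimate: beta + bias term."""
    num = sum(g * a / s ** 2 for g, a, s in zip(gamma, alpha, sigma_y))
    den = sum(g ** 2 / s ** 2 for g, s in zip(gamma, sigma_y))
    return beta + num / den


# Hypothetical true parameters for three variants
beta = 0.5
gamma = [0.10, 0.20, 0.15]    # variant-exposure associations
alpha = [0.02, 0.01, 0.03]    # direct (pleiotropic) effects
sigma_y = [0.02, 0.03, 0.025]

# Weighted average of the per-variant estimands, using the IVW weights
weights = [g ** 2 / s ** 2 for g, s in zip(gamma, sigma_y)]
estimands = [ratio_estimand(beta, g, a) for g, a in zip(gamma, alpha)]
weighted_avg = sum(w * e for w, e in zip(weights, estimands)) / sum(weights)

limit = ivw_limit(beta, gamma, alpha, sigma_y)
print(weighted_avg, limit)
```

In this hypothetical setting all direct effects are positive, so every ratio estimand and the pooled limit exceed the true causal effect.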
This implies that the TSLS estimate is consistent when the assumption IV3 is true and all the $\alpha_j$ parameters are zero. It is also consistent if the pleiotropic effects happen to cancel out, such that the bias term is equal to zero.24 Although this will not be universally plausible, we explore the condition that the correlation between the genetic associations with the exposure (the $\gamma_j$ parameters) and the direct effects of the genetic variants on the outcome (the $\alpha_j$ parameters) is zero. We refer to the condition that the distributions of these parameters are independent as InSIDE (Instrument Strength Independent of Direct Effect). It can be viewed as a weaker version of the exclusion restriction assumption. This relaxation of the IV assumptions was recently investigated by Kolesár et al.,25 although their work differs from ours and is not presented within the context of Mendelian randomization.
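To make the cancellation condition concrete, the following sketch evaluates a bias term of the assumed form $\sum_j \gamma_j \alpha_j \sigma_{Yj}^{-2} / \sum_j \gamma_j^2 \sigma_{Yj}^{-2}$ in three hypothetical two-variant scenarios. The function name `ivw_bias` and all values are illustrative assumptions. The example shows exact cancellation in a fixed set of variants, which is a simpler situation than the distributional independence described by InSIDE.

```python
def ivw_bias(gamma, alpha, sigma_y):
    """Assumed asymptotic bias term of the IVW/TSLS estimate."""
    num = sum(g * a / s ** 2 for g, a, s in zip(gamma, alpha, sigma_y))
    den = sum(g ** 2 / s ** 2 for g, s in zip(gamma, sigma_y))
    return num / den


# Hypothetical variant-exposure associations and standard errors
gamma = [0.1, 0.2]
sigma_y = [0.02, 0.02]

# No pleiotropy: every direct effect is zero, so the bias is zero
no_pleiotropy = ivw_bias(gamma, [0.0, 0.0], sigma_y)

# Pleiotropic effects that cancel: the weighted cross-products
# gamma_j * alpha_j sum to zero even though each alpha_j is non-zero
balanced = ivw_bias(gamma, [0.02, -0.01], sigma_y)

# Effects in the same direction: no cancellation, so the bias remains
directional = ivw_bias(gamma, [0.02, 0.01], sigma_y)

print(no_pleiotropy, balanced, directional)
```

In the second scenario the estimate would remain consistent despite both variants being invalid instruments, whereas in the third the bias does not vanish.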