We tested for association between the 528,509 genotyped SNPs and the normalised expression levels of the 17,926 probes using the FASTASSOC component of MERLIN [57], [58]. The FASTASSOC option fits a simple linear regression model to estimate an additive effect for each probe and SNP combination, with SNP genotypes coded as the number of copies of the minor allele (0, 1 or 2) carried by each individual. We used the Lander-Green algorithm [58], [59], implemented in MERLIN, to estimate expected genotype scores for individuals with missing genotype data. Sex and generation were included in the model as covariates, where generation denotes either the parental or the adolescent generation. Previous unpublished analysis has shown that generation is a useful substitute for age without the burden of additional degrees of freedom. The model applies a variance component approach to account for correlations between the expression levels of individuals within each family. The model fit is evaluated using a score test, which substantially reduces computational time compared to maximum-likelihood methods, at the expense of a slight loss of power [58].
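The additive coding described above can be illustrated with a minimal sketch. It fits an ordinary least-squares line of expression on minor-allele dosage for a single probe and SNP. It is not the FASTASSOC implementation: the covariates, the variance component for family structure, and the score test are all omitted, and the function name and data are hypothetical.

```python
from statistics import mean

def additive_fit(dosage, expression):
    """Least-squares fit of expression = intercept + slope * dosage.

    dosage holds minor-allele counts (0, 1 or 2); fractional values are
    allowed, as would arise from expected genotype scores.
    Returns (intercept, slope, r2).
    """
    mx, my = mean(dosage), mean(expression)
    sxx = sum((x - mx) ** 2 for x in dosage)
    syy = sum((y - my) ** 2 for y in expression)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosage, expression))
    slope = sxy / sxx                # additive effect per minor-allele copy
    intercept = my - slope * mx
    r2 = sxy * sxy / (sxx * syy)     # proportion of variance explained
    return intercept, slope, r2

# Hypothetical data for one probe and one SNP (six unrelated individuals).
dosage = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]
intercept, slope, r2 = additive_fit(dosage, expression)
```

The slope is the estimated change in normalised expression per copy of the minor allele, which is the quantity the additive model targets.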
Conditional regression analysis was used to avoid missing secondary eQTL that are in linkage disequilibrium (LD) with other eQTL. For each probe with an identified eQTL, we corrected for the main effect of the top eSNP (the SNP with the largest R²) by regressing the expression levels on its genotypes. Residuals from this analysis were then used for a second round of eQTL mapping, allowing us to detect independent eQTL. If additional eQTL were identified in this second round of analysis, the process was repeated, correcting for the main effects of the top eSNPs from the first and second eQTL using multivariate regression.
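One round of this conditioning step can be sketched as follows. The sketch picks the top eSNP by R², removes its fitted main effect from the expression levels, and rescans the residuals. It covers only the first conditioning round; later rounds, which correct for several eSNPs jointly, would require a multiple-regression fit that is not shown. No significance test is applied, and all names and data are hypothetical.

```python
from statistics import mean

def fit(x, y):
    """Least-squares fit y = a + b*x; returns (intercept, slope, r2)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope, sxy * sxy / (sxx * syy)

def top_esnp(genotypes, y):
    """Return the name of the SNP with the largest R^2 against y."""
    return max(genotypes, key=lambda name: fit(genotypes[name], y)[2])

def condition_on(x, y):
    """Residuals of y after removing the fitted main effect of x."""
    a, b, _ = fit(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical genotypes: snpB is in LD with snpA, snpC is independent of it.
genotypes = {
    "snpA": [0, 0, 1, 1, 1, 1, 2, 2],
    "snpB": [0, 0, 1, 1, 1, 0, 2, 2],
    "snpC": [0, 1, 0, 1, 0, 1, 0, 1],
}
# Hypothetical expression driven by snpA (strong) and snpC (weak).
expression = [2.0 * a + 0.5 * c
              for a, c in zip(genotypes["snpA"], genotypes["snpC"])]

first = top_esnp(genotypes, expression)            # round 1: primary eSNP
residual = condition_on(genotypes[first], expression)
second = top_esnp(genotypes, residual)             # round 2: secondary eSNP
```

In this toy example snpB ranks second in the first round only because of its LD with snpA; after conditioning, its signal largely disappears and the independent snpC emerges.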
Associations were evaluated in two categories depending on the location of the SNP relative to the transcription start site (TSS). Cis-eQTL were defined as associations involving SNPs within 2 Mb of the TSS, on either the 3′ or 5′ side. We defined trans-associations as associations involving SNPs elsewhere in the genome. To correct for multiple testing, we used a study-wide significance level of 0.05, corrected for the number of SNP-by-probe associations tested, corresponding to a p-value threshold of 5.25×10⁻¹².
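The cis/trans rule and the multiple-testing correction can be sketched as below. The sketch assumes that the number of tests is the full product of genotyped SNPs and probes; the exact number of tests behind the reported threshold is not stated, and this product gives a value close to, but not identical to, 5.25×10⁻¹². Treating the 2 Mb boundary as inclusive and requiring the SNP and TSS to lie on the same chromosome are also assumptions, and the function name is hypothetical.

```python
CIS_WINDOW_BP = 2_000_000  # 2 Mb on either side of the TSS

def classify_association(snp_chrom, snp_pos, tss_chrom, tss_pos,
                         window=CIS_WINDOW_BP):
    """Label a SNP-probe pair 'cis' or 'trans' from base-pair positions."""
    same_chrom = snp_chrom == tss_chrom
    if same_chrom and abs(snp_pos - tss_pos) <= window:  # inclusive: assumed
        return "cis"
    return "trans"

# Bonferroni-style correction of the study-wide significance level.
n_snps = 528_509
n_probes = 17_926
alpha = 0.05
n_tests = n_snps * n_probes    # assumed: every SNP tested against every probe
threshold = alpha / n_tests

# Hypothetical positions for illustration.
near = classify_association("1", 10_500_000, "1", 12_000_000)
far = classify_association("1", 10_500_000, "1", 15_000_000)
other = classify_association("2", 10_500_000, "1", 12_000_000)
```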
We tested for the effects of population structure and cryptic relatedness between individuals by applying the genomic control method [60] to the results of the association analysis. We derived a genomic control coefficient of 1.002, indicating negligible population stratification.
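A genomic control coefficient of this kind is commonly computed as the median observed 1-df chi-square statistic divided by its expected median under the null. The sketch below follows that common definition, starting from two-sided p-values. Whether the analysis used p-values or test statistics directly is not stated, so the input format, the function name and the example data are assumptions.

```python
from statistics import NormalDist, median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df

def genomic_control_lambda(p_values):
    """Median 1-df chi-square statistic divided by its null expectation.

    Each two-sided p-value in (0, 1] is converted back to a chi-square
    statistic via the squared normal quantile at p / 2.
    """
    std_normal = NormalDist()
    chi2 = [std_normal.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN

# Hypothetical p-values spread uniformly over (0, 1), as expected under the
# null with no stratification; the coefficient should then be close to 1.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_control_lambda(null_p)
```

Values close to 1, such as the 1.002 reported here, indicate that the test statistics are not systematically inflated.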