Family-Based Association Testing for Multiple Markers

We first consider a sample of $n$ trios (one offspring with genotype information available on both parents) and review the single-variant setting. The general FBAT statistic is a covariance between the offspring genotype and trait. Let $X_i$ and $Y_i$ denote the genotype for the variant and the trait, respectively, for the $i$-th offspring. In the general case, $Y_i$ can be either measured or dichotomous, and we can use an offset $\mu$ to center the trait appropriately [17]. For family samples with dichotomous traits, such as affected trios or discordant sibpairs, $\mu$ is often taken to be zero; with measured outcomes, the mean of the outcome is usually chosen as the offset. For the additive model, $X_i$ is the number of copies of the minor allele at the locus of interest. We define
$$U = \sum_{i=1}^{n} (Y_i - \mu)\,\bigl(X_i - E(X_i \mid P_i)\bigr). \qquad (1)$$
$E(X_i \mid P_i)$ in (1) is computed using Mendel's laws under the null hypothesis of no association, conditional on the trait as well as the parental genotypes (denoted $P_i$ for the $i$-th family). Under the same conditional distribution, we can compute $\mathrm{Var}(U)$; the large-sample FBAT statistic is defined as
$$Z = \frac{U}{\sqrt{\mathrm{Var}(U)}}, \qquad (2)$$
where $\mathrm{Var}(U) = \sum_{i=1}^{n} (Y_i - \mu)^2\,\mathrm{Var}(X_i \mid P_i)$. Under the null hypothesis of no association, $Z$ is approximately N(0,1). The formula extends easily to families in which multiple offspring are sampled, for testing the null hypothesis of no association and no linkage.
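As a minimal sketch of the single-variant computation for trios under the additive model, the Python below accumulates the covariance-type statistic and its conditional variance. It assumes genotypes coded as minor-allele counts (0, 1, 2), complete parental genotypes, and independent transmissions from the two parents, so that each heterozygous parent contributes a variance of 1/4. The function names and the tuple layout are illustrative assumptions, not taken from the paper or from the FBAT software.

```python
import math


def mendelian_moments(father, mother):
    """Mean and variance of the offspring minor-allele count under Mendel's laws.

    A parent carrying g minor alleles transmits one with probability g / 2,
    and the two transmissions are treated as independent.
    """
    p_f, p_m = father / 2.0, mother / 2.0
    mean = p_f + p_m
    var = p_f * (1.0 - p_f) + p_m * (1.0 - p_m)
    return mean, var


def fbat_single_marker(trios, offset=0.0):
    """Return (U, Var(U), Z) for one marker.

    trios: iterable of (offspring, father, mother, trait) tuples (assumed layout).
    offset: the trait-centering constant (often 0 for affected trios).
    """
    u = 0.0
    var_u = 0.0
    for x, g_f, g_m, y in trios:
        mean, var = mendelian_moments(g_f, g_m)
        t = y - offset
        u += t * (x - mean)        # family contribution to U
        var_u += t * t * var       # family contribution to Var(U)
    if var_u == 0.0:
        # No heterozygous parent anywhere: the marker is uninformative.
        return u, var_u, float("nan")
    return u, var_u, u / math.sqrt(var_u)


# Four affected trios (trait = 1, offset = 0); the last has two homozygous
# parents and therefore contributes nothing to U or Var(U).
example = [(2, 1, 1, 1), (1, 1, 0, 1), (1, 1, 1, 1), (0, 0, 0, 1)]
u, var_u, z = fbat_single_marker(example)
```

Families without a heterozygous parent drop out automatically because both their deviation from the Mendelian mean and their conditional variance are zero.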
The FBAT Multi-Marker test is a multivariate extension of the univariate FBAT test, designed to test simultaneously a set of markers in a defined region, such as a gene. It belongs to the general class of 'gene-based tests', since a set of $M$ univariate tests in a gene is replaced by a single multivariate test. Let $U_j$ and $Z_j$ denote the statistics in equations (1) and (2), defined for the $j$-th marker. Assuming samples large enough to contain sufficient heterozygous parents, each $Z_j$ is approximately N(0,1), but the $M$ markers may be correlated because of linkage disequilibrium (LD) in the region. Provided we have an estimate of the correlation matrix, we can obtain an $M$-degree-of-freedom test of the null hypothesis of no association between any of the $M$ variants and the disease, versus the alternative that at least one marker is in LD with a disease locus.
Rakovski et al [15] estimate the correlation matrix empirically as follows. Let $U = (U_1, \ldots, U_M)^\top$ be the vector of FBAT statistics, which forms the basis of the multi-marker test. Let $V_E$, the empirical variance estimator, be the $M \times M$ matrix with elements $\sum_i U_{ij} U_{ik}$, and let $V_D$ be the diagonal matrix with elements equal to the $\mathrm{Var}(U_j)$'s, where $U_j = \sum_i U_{ij}$ and $U_{ij}$ is the contribution of the $i$-th family to $U_j$. The corresponding adjusted variance matrix is defined by
$$V_A = V_D^{1/2}\,\mathrm{diag}(V_E)^{-1/2}\,V_E\,\mathrm{diag}(V_E)^{-1/2}\,V_D^{1/2}.$$
Note that $V_E$ is a variance-covariance matrix with all elements estimated empirically. However, the diagonal elements of $\mathrm{Var}(U)$ can be calculated directly, provided there is no linkage between any marker and the true disease locus. $V_A$ is an 'adjusted' variance-covariance matrix that keeps the empirical correlations but replaces the empirical variances with the exact ones. The multi-marker test is then defined as
$$T = U^\top V_A^{-}\,U,$$
where $V_A^{-}$ denotes an inverse of $V_A$ (a generalized inverse if $V_A$ is singular).
In large samples, $T$ is approximately $\chi^2$ distributed with degrees of freedom equal to the rank of $V_A$. This asymptotic distribution relies on the asymptotic normality of each marker test $Z_j$, and may not be valid in the rare-variant setting.
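The construction of the adjusted variance matrix and the quadratic-form statistic can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: each family's contribution to each marker statistic is supplied as a row of numbers, the exact variances are supplied separately, every marker has at least one non-zero contribution, and the adjusted matrix is nonsingular so that an ordinary linear solve can stand in for a generalized inverse. All names are hypothetical, and a production analysis would use a linear-algebra library.

```python
import math


def adjusted_variance(contrib, exact_var):
    """Return (U, V_E, V_A).

    contrib[i][j]: contribution of family i to the statistic of marker j.
    exact_var[j]:  variance of marker j's statistic computed from Mendel's laws.
    """
    m = len(exact_var)
    u = [sum(row[j] for row in contrib) for j in range(m)]
    v_e = [[sum(row[j] * row[k] for row in contrib) for k in range(m)]
           for j in range(m)]
    # Keep the empirical correlations, swap in the exact variances.
    v_a = [[v_e[j][k] * math.sqrt(exact_var[j] * exact_var[k]
                                  / (v_e[j][j] * v_e[k][k]))
            for k in range(m)] for j in range(m)]
    return u, v_e, v_a


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: a generalized inverse is needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - rest) / aug[i][i]
    return x


def multimarker_statistic(contrib, exact_var):
    """Quadratic form U' V_A^{-1} U (full-rank case only)."""
    u, _, v_a = adjusted_variance(contrib, exact_var)
    w = solve(v_a, u)
    return sum(u_j * w_j for u_j, w_j in zip(u, w))


# Toy input: three families, two markers (values chosen for easy checking).
toy_contrib = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
t_stat = multimarker_statistic(toy_contrib, exact_var=[2.0, 2.0])
```

In the full-rank case the statistic would be referred to a chi-square distribution with as many degrees of freedom as markers; when markers are collinear, the solver above raises an error rather than silently reducing the rank.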
Several papers have noted that tests of multiple markers can be greatly improved by taking optimal linear combinations of the individual tests [8], [16], [18], [19]. A major issue is determining the weights, since the optimal weights depend on the unknown effect of each marker. Xu et al [16] proposed a method, FBAT-LC, that handles this problem by using the portion of the family data that is not used in constructing the FBAT statistics, e.g. the noninformative families [13], [20]. The approach is designed for measured outcomes, or at least for cases where both affected and unaffected offspring are sampled. It can be extended in principle to the setting where we have only affected trios [21], but this is beyond the scope of this paper. A further caveat of the FBAT-LC approach is that estimation of the weights can be invalidated by population substructure.
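For orientation only, a generic linear-combination statistic can be sketched as below. The text does not give the FBAT-LC formula or how the weights are estimated, so this is the standard standardized weighted sum for fixed weights, not a reproduction of the published method. The weights here are hypothetical inputs, assumed to come from data independent of the marker statistics, and the function name is an assumption.

```python
import math


def linear_combination_z(u, v, weights):
    """Standardized weighted sum of marker statistics.

    u:       list of per-marker statistics.
    v:       their variance-covariance matrix (list of lists).
    weights: fixed weights, assumed independent of u (hypothetical input).
    """
    m = len(u)
    numerator = sum(weights[j] * u[j] for j in range(m))
    variance = sum(weights[j] * v[j][k] * weights[k]
                   for j in range(m) for k in range(m))
    return numerator / math.sqrt(variance)


# Equal weights versus all weight on the first marker, using toy values.
z_equal = linear_combination_z([2.0, 2.0], [[2.0, 1.0], [1.0, 2.0]], [1.0, 1.0])
z_first = linear_combination_z([2.0, 2.0], [[2.0, 1.0], [1.0, 2.0]], [1.0, 0.0])
```

With fixed weights the result is a single one-degree-of-freedom statistic, which is the source of the potential power gain over an M-degree-of-freedom test when the weights are well chosen.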

De G., Yip W.K., Ionita-Laza I, & Laird N. (2013). Rare Variant Analysis for Family-Based Design. PLoS ONE, 8(1), e48495.

Publication 2013

Alleles Freedom association Gene Gene tests Genotypes Heterozygote Parents Trios Vector

Other organizations : Harvard University, Columbia University

Variable analysis

independent variables

Genotype for the variant

dependent variables

Trait
Offspring genotype

control variables

Parental genotypes
