The above decomposition (Equation 2) suggests the following simple strategy for statistical learning of causal networks. First, by multiple testing of the hypothesis that each partial correlation vanishes, we determine the network topology, i.e. we identify those edges for which the corresponding partial correlation is non-vanishing. Second, by subsequent multiple testing of the hypothesis that the log ratio of standardized partial variances equals zero, we establish a partial ordering of the nodes, which in turn imposes a partial directionality upon the edges.
In more detail, we propose the following five-step algorithm:
1. First, it is essential to determine an accurate and positive definite estimate R of the correlation matrix. Only if the sample size is large, with many more observations than variables (n >> p), will the usual empirical correlation estimate be suitable. In all other instances, the use of a regularized estimator (e.g., the Stein-type shrinkage estimator of [20]) is vital in order to improve efficiency and to guarantee positive definiteness. In addition, if the samples are longitudinal it may be necessary to adjust for autocorrelation [27].
2. From the estimated correlations we compute the partial variances and partial correlations (see Table 1), and from those in turn plug-in estimates of the two factors of Equation 2 for all possible edges. Note that in this calculation each variable assumes in turn the role of the response Y. An efficient way to calculate the various variance-ratio factors is to take the square root of the diagonal of the inverse of the estimated correlation matrix and to compute the corresponding pairwise ratios.
3. Subsequently, we infer the partial correlation graph following the algorithm described in [19]. Essentially, we perform multiple testing of all partial correlation coefficients. Note that for high dimensions (large p) the null distribution of the partial correlations across edges can be determined from the data, which in turn allows the adaptive computation of corresponding false discovery rates [28].
4. In a similar fashion we then conduct multiple testing of all log ratios of standardized partial variances. As each such ratio is a ratio of two variances with the same degrees of freedom, its logarithm is approximately normally distributed [29], with an unknown variance parameter θ. Thus, the observed log ratios z across all edges follow a mixture distribution f(z) = η0 f0(z) + (1 − η0) fA(z), where f0 denotes the null density with variance θ.
Assuming that most z belong to the null model, i.e. that most edges are undirected, it is possible to infer non-parametrically the alternative distribution fA(z), the proportion η0, as well as the variance parameter θ; for an algorithm see [28]. From the resulting densities and distribution functions, local and tail-area-based false discovery rates for the test of a vanishing log ratio are computed. Note that in this procedure we include all edges, regardless of the value of the corresponding partial correlation or the outcome of the first test.
5. Finally, a partially directed network is constructed as follows. All edges in the partial correlation graph with a significantly non-zero log ratio are directed such that the arrow points from the node with the larger standardized partial variance (the more "exogenous" variable) to the node with the smaller standardized partial variance (the more "endogenous" variable). The remaining edges, with log ratio approximately zero, stay undirected. The subgraph consisting of all directed edges constitutes the inferred causal network. Note that this network does not necessarily include all nodes contained in the GGM network.
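Steps 1 and 2 above can be sketched in Python as follows. This is a minimal illustration, not the estimator of [20]: the shrinkage target (the identity matrix), the shrinkage intensity `lam`, and the function names are assumptions made for the example.

```python
import numpy as np

def shrinkage_correlation(X, lam=0.2):
    """Regularized correlation estimate: shrink the empirical correlation
    matrix toward the identity.  For 0 < lam <= 1 the result is positive
    definite even when p > n.  (Illustrative stand-in for a Stein-type
    shrinkage estimator; lam would normally be chosen from the data.)"""
    R = np.corrcoef(X, rowvar=False)
    p = R.shape[0]
    return (1.0 - lam) * R + lam * np.eye(p)

def partial_correlations_and_spv(R):
    """Given a positive definite correlation matrix R, compute the matrix
    of partial correlations and the standardized partial variances (SPVs).
    Both follow from the inverse P = R^{-1}:
      pcor[i, j] = -P[i, j] / sqrt(P[i, i] * P[j, j])
      spv[k]     = 1 / P[k, k]
    Pairwise ratios of sqrt(diag(P)) then give the square-root SPV ratio
    factors for all edges, as described in step 2."""
    P = np.linalg.inv(R)
    d = np.sqrt(np.diag(P))
    pcor = -P / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    spv = 1.0 / np.diag(P)
    return pcor, spv
```

A typical call on an n x p data matrix would be `R = shrinkage_correlation(X)` followed by `pcor, spv = partial_correlations_and_spv(R)`.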
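The edge selection and orientation logic of steps 3 to 5 can likewise be sketched as follows. For brevity this toy version replaces the FDR-based multiple testing of [28] with two fixed thresholds (both thresholds are illustrative assumptions); only the orientation rule, arrow from larger to smaller standardized partial variance, is taken directly from the text.

```python
import numpy as np

def partially_directed_network(pcor, spv, pcor_thresh=0.2, logratio_thresh=0.5):
    """Toy stand-in for steps 3-5.  An edge (i, j) is kept if its partial
    correlation exceeds pcor_thresh in magnitude; it is then directed if
    the log ratio of standardized partial variances exceeds logratio_thresh
    in magnitude, with the arrow pointing from the node with the larger
    SPV (more 'exogenous') to the node with the smaller SPV (more
    'endogenous').  Returns (undirected_edges, directed_edges)."""
    p = pcor.shape[0]
    undirected, directed = [], []
    for i in range(p):
        for j in range(i + 1, p):
            if abs(pcor[i, j]) < pcor_thresh:
                continue  # edge absent from the partial correlation graph
            z = np.log(spv[i] / spv[j])  # log SPV ratio for this edge
            if abs(z) < logratio_thresh:
                undirected.append((i, j))  # log ratio near zero: undirected
            elif z > 0:
                directed.append((i, j))    # spv[i] > spv[j]: arrow i -> j
            else:
                directed.append((j, i))
    return undirected, directed
```

The subgraph built from the returned directed edges then corresponds to the inferred causal network of step 5.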