Gaussian graphical model (GGM; Lauritzen, 1996), a network in
which edges connecting symptoms represent estimates of partial correlations.
In the GGM, edges can be understood as conditional dependence relations
among symptoms: If two symptoms are connected in the resulting graph, they
are dependent after controlling for all other symptoms. If no edge emerges,
the two symptoms are conditionally independent given all others. GGMs are typically estimated using
the graphical lasso, a method that employs regularization to avoid
estimating spurious edges (Friedman et al., 2008). This method
maximizes a penalized log-likelihood, a log-likelihood function plus a
penalty term that depends on network density (the number and the strength of
edges). A tuning parameter (λ1) regulates the weight
of the density penalty. Larger values of λ1 yield sparser
networks (i.e., with fewer and weaker edges), whereas smaller values yield
denser networks. Because it is unknown whether the true network is sparse or
dense, the value of λ1 is typically selected empirically, using
k-fold cross-validation (i.e., training and validating the
model on different parts of the data and choosing the value of λ1 that results in the best prediction) or information criteria, such as the
extended Bayesian information criterion (Epskamp & Fried, 2017). Using
the graphical lasso to estimate a GGM improves network estimates and leads
to a sparse network that describes the data parsimoniously. The method has
been used and explained in numerous recent articles, and an accessible
tutorial article on GGM estimation and regularization is available elsewhere
(Fried, 2017).
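As a purely illustrative sketch of the role of λ1, the following Python code fits a GGM at two penalty levels using scikit-learn's graphical lasso implementation, a stand-in for the R tooling typically used in this literature; the data and penalty values are invented for the example:

```python
# Illustrative sketch only: the effect of the sparsity penalty (lambda_1)
# on a GGM, using scikit-learn's graphical lasso as a stand-in.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))  # toy data: 200 "subjects", 6 "symptoms"

def n_edges(precision, tol=1e-6):
    # Count nonzero off-diagonal entries of the precision matrix (= edges).
    off = precision - np.diag(np.diag(precision))
    return int(np.sum(np.abs(off) > tol)) // 2

sparse_fit = GraphicalLasso(alpha=0.5).fit(X)   # large penalty: sparser network
dense_fit = GraphicalLasso(alpha=0.01).fit(X)   # small penalty: denser network
assert n_edges(sparse_fit.precision_) <= n_edges(dense_fit.precision_)
```

This only illustrates the sparsity behavior of the penalty; it is not the exact estimator used in the studies cited above.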
In our case, we aimed to accurately estimate the GGMs in four groups of
individuals. If the true networks in these samples were the same, the most
accurate network would be obtained by estimating a single GGM using
graphical lasso on the full data set. However, this strategy would ignore
differences across groups. Conversely, estimating four individual networks
would allow detecting such differences but would result in poorer estimates
if the networks were the same (because of lower power in each data set
compared with the full data). The FGL (Danaher et al., 2014) is a recent
extension of graphical lasso that allows estimating multiple GGMs jointly.
Like the graphical lasso, FGL includes a penalty on density, regulated by
the tuning parameter λ1. Unlike the graphical lasso, the FGL also
includes a penalty on differences among corresponding edge weights in
networks computed in different samples, regulated by a tuning parameter
λ2. Large values of λ2 yield very similar networks
in which edges are estimated by exploiting all samples together; small
values allow network estimates to differ; and a λ2 of zero means
that networks are estimated independently. Because it is unknown whether the
true networks are similar or different, a principled way of choosing both
λ1 and λ2 is through k-fold
cross-validation. Overall, FGL improves network estimates by exploiting
similarities among groups. If this does not improve model fit, the
k-fold cross-validation procedure selects a value of
the λ2 parameter equal to or very close to zero, in which case
separate GGMs are estimated via the graphical lasso. As a result of this
strategy, the FGL neither masks differences nor inflates similarities across
groups. The FGL has been used successfully to compute gene expression
networks in cancer and healthy samples (Danaher et al., 2014 (link)), to estimate
networks of situational experience in different counries (
2017
networks in patients and healthy individuals (
Panfilis, 2017
2017
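The two FGL penalty terms described above can be written out directly. The sketch below is our own illustrative code (not the solver of Danaher et al., which additionally maximizes the penalized log-likelihood); it merely evaluates the λ1 lasso term and the λ2 fused term for a pair of toy precision matrices:

```python
# Illustrative sketch of the FGL penalty structure (not Danaher et al.'s solver).
import numpy as np

def fgl_penalty(thetas, lam1, lam2):
    # lam1 * sum of |off-diagonal entries| across groups (sparsity term)
    lasso = sum(np.sum(np.abs(t - np.diag(np.diag(t)))) for t in thetas)
    # lam2 * sum of |differences between corresponding entries| across group pairs
    fused = 0.0
    for i in range(len(thetas)):
        for j in range(i + 1, len(thetas)):
            fused += np.sum(np.abs(thetas[i] - thetas[j]))
    return lam1 * lasso + lam2 * fused

A = np.array([[1.0, 0.3], [0.3, 1.0]])  # toy precision matrix, group 1
B = np.array([[1.0, 0.1], [0.1, 1.0]])  # toy precision matrix, group 2
# With lam2 = 0 the groups are penalized independently (separate graphical
# lassos); a larger lam2 adds a cost proportional to how much they differ.
assert fgl_penalty([A, B], 0.1, 0.0) < fgl_penalty([A, B], 0.1, 1.0)
```

Setting lam2 to zero makes the fused term vanish, which is why cross-validation choosing λ2 ≈ 0 reduces FGL to independent graphical lassos.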
In this article, we estimated networks in the four samples using FGL and
selected optimal values of λ1 and λ2 parameters via
k-fold cross-validation, as implemented in the R
package EstimateGroupNetwork (Costantini & Epskamp, 2017 ).
Because FGL generally yields better network estimates (Danaher et al., 2014), we report
this joint estimation as the main model in the article. However, because
networks in the literature have typically been estimated using the graphical
lasso, we also report the results of
estimating networks individually. Additionally, we report the results of a
different method for selecting the tuning parameters for FGL via information
criteria instead of cross-validation. Both methods led to nearly identical
results to those reported here.
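For the single-network case, the k-fold selection logic described above is available off the shelf; the hedged sketch below uses scikit-learn's GraphicalLassoCV only to illustrate cross-validated tuning-parameter selection (the EstimateGroupNetwork package performs the analogous joint search over λ1 and λ2 for FGL in R):

```python
# Hedged illustration of k-fold selection of the sparsity penalty for a
# single GGM; data are invented. This is not the R workflow used in the
# article, only an analogue of its cross-validation step.
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 5))      # toy data: 300 "subjects", 5 "symptoms"
model = GraphicalLassoCV(cv=5).fit(X)  # 5-fold cross-validation over a penalty grid
selected_penalty = model.alpha_        # the empirically chosen penalty value
```

The selected penalty maximizes out-of-fold log-likelihood, matching the "choose the value that results in the best prediction" rule described earlier.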