Study population. The Northern California Childhood Leukemia Study is a case–control study of childhood leukemia conducted in the San Francisco Bay area and California Central Valley that seeks to identify genetic and environmental risk factors for childhood leukemia. Cases 0–14 years of age were ascertained from pediatric clinical centers; controls, matched to cases on date of birth, sex, race, and Hispanic ethnicity, were selected from the California birth registry (California Department of Public Health, Sacramento, CA). Residential dust samples were collected from study homes as one strategy for assessing relevant environmental exposures. Case and control participants who were enrolled in the study from December 1999 through November 2007 were eligible for initial residential-dust collection if they were 0–7 years old and lived in the same home they had occupied at the time of diagnosis (or a similar reference date for controls). Subsequently, in 2010, participants in the initial dust collection were eligible for a second dust collection if they were still living in the same home. Among 629 participants in the initial dust collection, 225 were eligible for a second dust collection and 204 participated in the second dust collection. We successfully analyzed two dust samples for PAHs in 201 homes and successfully analyzed only the second dust sample for PAHs in three homes. For an additional 89 participants in the initial dust collection who were ineligible for the second dust collection, we also analyzed one dust sample for PAHs, as described below. We obtained written informed consent from the children’s parents and study protocols were approved by the institutional review board at the University of California, Berkeley.
Collection of residential dust. During the first round of dust sampling (2001–2007), we collected vacuum cleaner dust and administered a questionnaire during an in-home visit. During the second round of dust sampling (2010), we interviewed participants via telephone and instructed them to mail their vacuum cleaner bags (or the contents of their vacuum cleaner canisters) to the study center in prepaid parcels. The median interval between repeated sample collections was 4.8 years (range, 2.6–8.6 years). We stored dust samples away from heat (≤ 4°C) and light before chemical analysis. We previously analyzed the dust samples from the first round of dust collection for nine PAHs (Whitehead et al. 2009); however, for consistency, the dust samples from the first round of dust collection were reextracted and reanalyzed alongside the samples from the second round of dust collection according to the protocol described below.
Laboratory analysis of PAHs. We homogenized and fractionated the dust samples using a mechanical sieve shaker equipped with a 100-mesh sieve to obtain dust particles < 150 μm. Portions of fine dust (0.2 g) were spiked with an internal standard (50 ng of d12-benzo[a]pyrene), extracted via accelerated solvent extraction, purified by silica-gel column chromatography and gel permeation chromatography, concentrated to 250 μL, solvent exchanged into tetradecane, and spiked with a recovery standard (50 ng of d10-pyrene). Finally, we analyzed 12 PAHs (phenanthrene, anthracene, fluoranthene, pyrene, benzo[a]anthracene, chrysene, benzo[b]fluoranthene, benzo[k]fluoranthene, benzo[a]pyrene, indeno[1,2,3-c,d]pyrene, dibenzo[a,h]anthracene, and benzo[g,h,i]perylene) using gas chromatography–mass spectrometry in the multiple ion detection mode. The chromatographic separation used a DB-5 column (60 m, 0.25 mm i.d., 0.25 μm film) that was programmed from 150°C to 250°C at 25°C per minute, and then from 250°C to 315°C at 2.5°C per minute. We analyzed a six-point calibration curve (range, 20–62,500 ng/mL) at the beginning and the end of sample analysis and a single point standard with each sample set. The analytical protocol was validated using replicate dust samples of National Institute of Standards and Technology (NIST) Standard Reference Material (SRM) 2585 (NIST, Gaithersburg, MD). For all validation replicates, measured concentrations of each PAH were generally within 30% of the NIST certified value (maximum error of 55%), and the sum of the PAHs was within 5% of the sum of the NIST certified values.
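As a rough illustration of the oven program described above, the following sketch computes how long each temperature ramp lasts. It uses only the temperatures and ramp rates given in the text; any isothermal hold times are not stated and are therefore not included, and the function name `ramp_minutes` is hypothetical.

```python
def ramp_minutes(start_c, end_c, rate_c_per_min):
    """Duration (minutes) of a linear temperature ramp."""
    if rate_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return (end_c - start_c) / rate_c_per_min


# (start degC, end degC, rate degC/min), taken from the text above.
segments = [(150.0, 250.0, 25.0), (250.0, 315.0, 2.5)]
durations = [ramp_minutes(*segment) for segment in segments]
total_ramp_time = sum(durations)  # excludes any hold times (not reported)

print(durations, total_ramp_time)
```

Under these assumptions, the slow second ramp accounts for most of the ramp time.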
Quality control samples. We analyzed samples in batches of 12, with each batch consisting of 8 samples, 1 method blank, 1 duplicate sample pair (i.e., two 200-mg portions of fine dust taken from the same vacuum cleaner), and 1 interbatch quality control sample (i.e., a 200-mg portion of fine dust taken from the quality-control vacuum cleaner). Because we prepared and analyzed an interbatch quality control replicate alongside each successive sample batch, the interbatch quality control results illustrate the reproducibility of the dust preparation and analytical methods over the course of the study. Likewise, the duplicate samples illustrate the reproducibility of the dust preparation and analytical methods within each sample batch. For some batches, we replaced the interbatch quality control sample with the SRM 2585 dust sample. The SRM 2585 dust was vigorously homogenized, so results obtained from any 200-mg replicate should be highly reproducible. To demonstrate the optimal reproducibility of our method, we analyzed three pairs of duplicate SRM 2585 dust samples concurrently. To compare the magnitude of variability observed in the four types of quality control samples, we calculated the relative percent difference (RPD) between matched samples [for details regarding RPD calculations, see Supplemental Material, “Quality control samples” (http://dx.doi.org/10.1289/ehp.1205821)].
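The exact RPD calculation is given in the Supplemental Material and is not reproduced here. As a minimal sketch, assuming the conventional definition (the absolute difference between two paired measurements divided by their mean, expressed as a percentage), the calculation could look like the following. The function name and the example concentrations are hypothetical.

```python
def relative_percent_difference(a, b):
    """RPD between two paired measurements, as a percentage of their mean.

    Assumes the conventional definition; the study's own formula is in its
    Supplemental Material and may differ in detail.
    """
    mean = (a + b) / 2.0
    if mean == 0:
        raise ValueError("RPD is undefined when both measurements are zero")
    return 100.0 * abs(a - b) / mean


# Hypothetical duplicate pair (ng/g); values are illustrative only.
rpd = relative_percent_difference(110.0, 90.0)
print(rpd)
```

Because the measure is symmetric in its two arguments, it does not matter which member of a duplicate pair is treated as the first measurement.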
Questionnaire responses. Parents initially responded to structured in-home interviews designed to ascertain information relevant to childhood leukemia. Subsequently, households participating in the second dust collection (n = 204) completed an additional telephone questionnaire designed to ascertain information about sources of residential chemical exposures. The latter questionnaire covered topics related to sources of indoor PAHs, including cigarette smoking, appliances, cooking practices, and shoe removal habits, as well as residential characteristics such as residential construction date, type, and square footage [see Supplemental Material, “Questions used to create variables for mixed-effects models” (http://dx.doi.org/10.1289/ehp.1205821)].
Geographic information. We used a global positioning device to determine the latitude and longitude for each residence and classified each residence as belonging to one of six geographic regions (Figure 1). We estimated ambient air PAH concentrations at a census tract resolution using results from the U.S. Environmental Protection Agency (EPA) 2005 National-Scale Air Toxics Assessment (U.S. EPA 2011). The U.S. EPA assessment employed a National Emissions Inventory (U.S. EPA 2012) to estimate ambient air concentrations of 16 PAHs (including the 12 PAHs measured in this study, as well as acenaphthene, acenaphthylene, fluorene, and naphthalene) attributable to emissions from major stationary sources (e.g., power plants), area sources (e.g., commercial buildings), and mobile sources (e.g., automobiles). To distinguish between traffic emissions and emissions from other urban PAH sources, we considered ambient concentrations of PAH attributable to mobile sources and ambient concentrations of PAH attributable to area sources as two independent determinants of PAH levels in residential dust. Because the association between ambient PAH estimates and residential-dust PAH concentrations was nonlinear, we used the rank order of these census tract–level estimates for all regression analyses.
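The rank transformation of the census tract–level estimates can be sketched as follows. This is an illustration only: the text does not say how tied estimates were handled, so the sketch assumes tied values share the average of their positions, and the function name and example values are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions.

    Tie handling is an assumption; the study does not describe it.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2.0 + 1.0
        for idx in order[pos:end + 1]:
            ranks[idx] = shared_rank
        pos = end + 1
    return ranks


# Hypothetical tract-level ambient PAH estimates (arbitrary units).
tract_estimates = [0.5, 0.1, 0.5, 2.0]
print(average_ranks(tract_estimates))
```

Replacing each estimate with its rank preserves the ordering of the tracts while discarding the spacing between estimates, which is why ranks can accommodate a monotone but nonlinear association in a regression.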
Random-effects models. To apportion the observed variance in PAH concentrations into four components describing regional variability, intraregional between-household variability, within-household variability over time, and within-sample analytical variability, we used a hierarchical random-effects model,

$$Y_{hijk} = \ln(X_{hijk}) = \mu_Y + b_h + b_{hi} + b_{hij} + e_{hijk}, \qquad [1]$$

for $h = 1, 2, \ldots, 6$ regions; $i = 1, 2, \ldots, 294$ households (i.e., 293 study residences and the interbatch quality control residence); $j$ = sampling round 1 or 2; and $k = 1, 2, \ldots, 40$ replicate samples from the same vacuum bag, where

$X_{hijk}$ = the residential-dust PAH concentration for the $i$th household in the $h$th region, from the $k$th subsample of the $j$th repeated measurement;

$Y_{hijk}$ = the natural log-transform of $X_{hijk}$;

$\mu_Y$ = the true (logged) mean residential-dust PAH concentration for the population;

$b_h = \mu_{Y_h} - \mu_Y$, and represents the random deviation of the $h$th region’s true mean (logged) residential-dust PAH concentration, $\mu_{Y_h}$, from $\mu_Y$;

$b_{hi} = \mu_{Y_{hi}} - \mu_{Y_h}$, and represents the random deviation of the $i$th household’s true mean (logged) residential-dust PAH concentration, $\mu_{Y_{hi}}$, from $\mu_{Y_h}$;

$b_{hij} = \mu_{Y_{hij}} - \mu_{Y_{hi}}$, and represents the random deviation of the $j$th measurement’s true mean (logged) residential-dust PAH concentration, $\mu_{Y_{hij}}$, from $\mu_{Y_{hi}}$;

$e_{hijk} = Y_{hijk} - \mu_{Y_{hij}}$, and represents the random deviation of the observed (logged) residential-dust PAH concentration, $Y_{hijk}$, from $\mu_{Y_{hij}}$ for the $i$th household in the $h$th region on the $j$th repeated measurement.
We assume $b_h$, $b_{hi}$, $b_{hij}$, and $e_{hijk}$ are mutually independent and normally distributed random variables, with means of zero and variances of $\sigma^2_{BR}$, $\sigma^2_{BH}$, $\sigma^2_{WH}$, and $\sigma^2_{WS}$, representing the between-region variability, the intraregional between-household variability, the within-household variability over time, and the within-sample analytical variability, respectively. Using the spatial analyst function in ArcGIS (ESRI, Redlands, CA), we estimated Moran’s $I$ statistic of spatial autocorrelation and confirmed that household-level random effects from model 1 were independent. Using PROC MIXED (version 9.1; SAS Institute Inc., Cary, NC), we fit the model described in Equation 1 and estimated variance components ($\sigma^2_{BR}$, $\sigma^2_{BH}$, $\sigma^2_{WH}$, $\sigma^2_{WS}$, $\sigma^2_{Total} = \sigma^2_{BR} + \sigma^2_{BH} + \sigma^2_{WH} + \sigma^2_{WS}$) and variance ratios [$\lambda = (\sigma^2_{WH} + \sigma^2_{WS})/(\sigma^2_{BR} + \sigma^2_{BH})$]. As previously described (Whitehead et al. 2012), for each PAH, we used the magnitude of the variance ratio to estimate the potential impact of measurement error on an odds ratio ($OR_{True} = 2.0$) for a hypothetical case–control study that employs a single dust sample to assess exposure to PAHs ($OR_{Biased} = \exp[\ln(OR_{True})/(1 + \lambda)]$).
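To make the attenuation formula concrete, the following minimal sketch computes the variance ratio λ from four variance components and the corresponding biased odds ratio for a true odds ratio of 2.0. The two formulas are those given in the text; the function names and the variance component values are hypothetical and are not study results.

```python
import math


def variance_ratio(var_br, var_bh, var_wh, var_ws):
    """lambda = (within-household + within-sample) / (between-region + between-household)."""
    between = var_br + var_bh
    if between <= 0:
        raise ValueError("between-region plus between-household variance must be positive")
    return (var_wh + var_ws) / between


def attenuated_odds_ratio(or_true, lam):
    """OR_Biased = exp[ln(OR_True) / (1 + lambda)]."""
    if or_true <= 0 or lam < 0:
        raise ValueError("or_true must be positive and lam non-negative")
    return math.exp(math.log(or_true) / (1.0 + lam))


# Hypothetical variance components on the log scale (illustrative only).
lam = variance_ratio(var_br=0.1, var_bh=0.4, var_wh=0.3, var_ws=0.2)
or_biased = attenuated_odds_ratio(2.0, lam)
print(lam, round(or_biased, 3))
```

With these illustrative inputs the within-household and analytical variance equals the between-household and regional variance (λ = 1), so the log odds ratio is halved and the expected odds ratio falls from 2.0 to about 1.41. As λ approaches 0, the biased odds ratio approaches the true value.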
To assess the impact of unequal within-household variance in case and control homes on variance ratio estimates, we used a second random-effects model (model 2) to apportion variance into three components for between-household variability (in all homes), within-household variability in case homes, and within-household variability in control homes, as described in detail in Supplemental Material, “Random-effects Model 2” (http://dx.doi.org/10.1289/ehp.1205821).
Mixed-effects models. Complete model specifications for the mixed-effects models are provided in Supplemental Material, pp. 3–6 (http://dx.doi.org/10.1289/ehp.1205821). In brief, we used mixed-effects models to identify sources of variability for each hierarchical level. In addition to the model 1 random effects, we included two fixed effects for neighborhood-level covariates in model 3: the rank order of estimated ambient concentrations of PAH attributable to emissions from area sources, and the rank order of estimated ambient concentrations of PAH attributable to emissions from mobile sources for the census tracts in the study. Likewise, in addition to the model 1 random effects, we included seven fixed effects for residential covariates in model 4: regular smoking inside or outside of the residence, residence construction date, residence is apartment or condominium, regular shoe removal by residents in home, < 25% of residence is carpeted, residence square footage is < 1,750 ft², and residence has at least two forms of combustion-based heating (i.e., gas or kerosene heat, fireplace, wood-burning stove, or steam radiator). Similarly, in addition to the model 1 random effects, we included two fixed effects for temporal covariates in model 5: the date of dust collection and the sequence of the laboratory analysis. In the fully saturated model 6, we included the random effects from model 1 as well as neighborhood, residential, and temporal covariates from models 3–5.
We fit each of the above mixed-effects models (models 3–6) for 451 observations with covariate data (i.e., 405 samples collected from 204 homes during repeat sampling rounds and 46 duplicate samples) and excluded the 139 observations without covariate data (i.e., 40 interbatch quality control replicates and 89 samples with 10 duplicates collected during round 1). For comparison, we re-ran the random-effects model (model 1) using this set of 451 observations. In a stratified analysis we fit model 6 using the case and control data separately to evaluate whether the fixed-effects estimates differed by case–control status.
Time trends in PAH concentrations may have differed by region. Model 7 includes a unique fixed effect for the time trend in PAH for each region in addition to the random and fixed effects in model 6.
To evaluate the influence of the time interval between repeat dust collections on within-household variability, model 8 includes four random effects representing the between-household variability (in all homes) and the within-household variability for households with various time intervals between sample collections (i.e., < 4 years, 4–6 years, ≥ 6 years) in addition to the fixed effects used in model 6.
Data imputation. We determined method reporting limits (MRL) for each PAH on a batch-by-batch basis according to the contamination measured in the method blank (i.e., MRL = 3 × mass of PAH in the method blank). We replaced each value below the MRL (Table 1) with five imputations randomly selected from a log-normal distribution describing the PAH concentrations. The imputation procedure was restricted so that all replacement values were below the MRL. Additionally, some participants were unable or unwilling to complete all aspects of the questionnaires, and we also replaced missing covariate data using multiple imputation (e.g., for each of nine respondents who did not know their residence’s date of construction, we imputed five replacement dates). Ultimately, we created five complete data sets with a different imputed value for each missing value, performed regression analyses separately on each data set, and combined the results to produce confidence intervals that reflected the uncertainty created by the missing values. Moreover, we were unable to pinpoint six residences using the global positioning system, so we approximated their location using postal codes and replaced each missing census tract–level ambient PAH estimate with the corresponding county-level ambient PAH estimate.
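One way to draw replacement values that are guaranteed to fall below the MRL is inverse-CDF sampling from a truncated log-normal distribution, sketched below. This is an illustration under stated assumptions rather than the study's actual procedure: the text does not say how the log-normal parameters were estimated or which sampling algorithm was used, so `mu_log` and `sigma_log` (mean and standard deviation of the logged concentrations) are assumed inputs, the MRL is assumed to be expressed in the same units as the concentrations, and all function names and example numbers are hypothetical.

```python
import math
import random
from statistics import NormalDist


def method_reporting_limit(blank_mass):
    """MRL = 3 x mass of PAH in the batch's method blank, as stated in the text."""
    return 3.0 * blank_mass


def impute_below_mrl(mrl, mu_log, sigma_log, n_imputations=5, rng=None):
    """Draw values from a log-normal distribution truncated above at the MRL.

    mu_log and sigma_log describe ln(concentration); they are assumed inputs
    because the study does not report how they were estimated.
    """
    if mrl <= 0 or sigma_log <= 0:
        raise ValueError("mrl and sigma_log must be positive")
    rng = rng or random.Random()
    dist = NormalDist(mu_log, sigma_log)
    p_max = dist.cdf(math.log(mrl))  # probability mass below the MRL
    if p_max <= 0.0:
        raise ValueError("the distribution has no usable mass below the MRL")
    draws = []
    while len(draws) < n_imputations:
        u = p_max * rng.random()  # uniform on [0, p_max)
        if u <= 0.0:
            continue  # inv_cdf requires a probability strictly above 0
        value = math.exp(dist.inv_cdf(u))
        draws.append(min(value, mrl))  # guard against floating-point overshoot
    return draws


# Hypothetical blank mass and distribution parameters (illustrative only).
mrl = method_reporting_limit(2.0)
imputed = impute_below_mrl(mrl, mu_log=math.log(20.0), sigma_log=1.0,
                           rng=random.Random(0))
print(mrl, imputed)
```

Sampling the uniform variate only up to the cumulative probability at the MRL avoids rejection sampling, which can be slow when little of the distribution lies below the limit.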
To evaluate the impact of the multiple imputation procedure on estimates of variance components and fixed effects, for each PAH we fit model 1 using only the observations above the limit of detection; also, we fit model 6 using a) only the observations above the limit of detection, b) only the observations with complete covariate data, and c) only the observations above the limit of detection with complete covariate data. For most PAHs, the estimated variance components were similar in the limited and full analyses; however, for phenanthrene (and to a lesser extent anthracene and fluoranthene), the within-sample analytical variability was smaller in the limited analysis (data not shown). The fixed effects produced in the limited and full analyses were qualitatively similar [i.e., each fixed effect that was significant (p-value < 0.05) in the full analyses retained the same direction in the limited analyses with only minor changes in magnitude and significance observed (data not shown)].
Whitehead T.P., Metayer C., Petreas M., Does M., Buffler P.A., & Rappaport S.M. (2013). Polycyclic Aromatic Hydrocarbons in Residential Dust: Sources of Variability. Environmental Health Perspectives, 121(5), 543–550.