Study population. We conducted a population-based case–control study of NHL in four National Cancer Institute–Surveillance Epidemiology and End Results Program (NCI-SEER) study sites (http://seer.cancer.gov/). The study design has been previously described (Colt et al. 2004; Wheeler et al. 2011). Briefly, the study was conducted in Iowa, Los Angeles County, California, and the metropolitan areas of Detroit, Michigan (Macomb, Oakland, and Wayne counties) and Seattle, Washington (King and Snohomish counties). Eligible cases were 20–74 years of age, diagnosed with a first primary NHL between July 1998 and June 2000, and uninfected with HIV. In Seattle and Iowa, all consecutive cases were chosen. In Detroit and Los Angeles, all African-American cases and a random sample of white (regardless of Hispanic ethnicity) cases were eligible for study, allowing for oversampling of African-American cases. Of the 2,248 potentially eligible cases, 320 (14%) died before they could be interviewed, 127 (6%) were not located, 16 (1%) had moved away, and 57 (3%) had physician refusals. Of the 1,728 remaining cases, 1,321 (76%) participated. Controls were selected from Center for Medicare and Medicaid Services files (http://dnav.cms.gov/) if they were ≥ 65 years of age, or from the general population using random digit dialing if they were < 65 years of age, and were frequency matched to cases by sex, age (within 5-year groups), race, and study site. Of the 2,409 potentially eligible controls, 2,046 could be located and contacted, and 1,057 (52%) of these subjects participated. The study was approved by the human subjects review boards at all participating institutions. Written informed consent was obtained from each participant.
Computer-assisted personal interviews were conducted in the home of each participant. Interviewers asked about demographics including race and education, age of the home, housing type, the presence of oriental rugs, pesticide use in the home and garden, residential and occupational histories, and other factors.
Dust samples and laboratory analysis. As described in detail previously (Colt et al. 2004, 2005), dust was collected between February 1999 and May 2001 from vacuum cleaners of participants who gave permission (93% of cases, 95% of controls) and who had used their vacuum cleaner within the past year and owned at least half their carpets or rugs for ≥ 5 years [695 cases (57%), 521 controls (52%)]. Dust samples from 682 cases (98%) and 513 controls (98%) were successfully analyzed between September 1999 and September 2001.
Exposure to a mixture of 27 chemicals measured in house dust [5 PCBs, 7 polycyclic aromatic hydrocarbons (PAHs), and 15 pesticides] was of interest. The PCBs were congeners 105, 138, 153, 170, and 180. The PAHs were benz(a)anthracene, benzo(a)pyrene, benzo(b)fluoranthene, benzo(k)fluoranthene, chrysene, dibenz(ah)anthracene, and indeno(1,2,3-cd)pyrene. The pesticides were α-chlordane, γ-chlordane, carbaryl, chlorpyrifos, cis-permethrin, trans-permethrin, 2,4-dichlorophenoxyacetic acid (2,4-D), DDE, dichlorodiphenyltrichloroethane (DDT), diazinon, dicamba, methoxychlor, o-phenylphenol, pentachlorophenol, and propoxur. Extraction and analysis were performed on 2-g aliquots of dust samples using gas chromatography/mass spectrometry (GC/MS) in selected ion monitoring mode. Concentrations were quantified using the internal standard method. Usual detection limits were 20.8 ng/g of dust for α-chlordane, γ-chlordane, DDE, DDT, propoxur, o-phenylphenol, PAHs, and PCBs; 42–84 ng/g for chlorpyrifos, diazinon, cis-permethrin, dicamba, pentachlorophenol, and 2,4-D; and 121–123 ng/g for carbaryl and trans-permethrin. Changes in analytic procedures during the study resulted in increased detection limits for methoxychlor (from 20.7 to 62.5 ng/g). A small proportion of samples weighing < 2 g had detection limits that were higher than the usual detection limits.
The laboratory measurements for the 27 analytes contained various types of "missing data," primarily when the concentration was below the minimum detection level. To a lesser extent, missing data occurred when there was co-elution between the target chemical and interfering compounds. Chemical concentrations were assumed to follow a log-normal distribution, and data were imputed using a "fill-in" approach to create 10 complete data sets for each of the 27 analytes. Details about the imputation of analyte values have been published previously (Colt et al. 2004; Lubin et al. 2004).
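As a rough illustration of the idea behind filling in values below a detection limit under a log-normal assumption, the following minimal sketch draws each non-detect from the part of the distribution lying below the limit. It is not the published procedure: the function name `impute_below_dl`, the toy measurements, and the parameter values `mu` and `sigma` are hypothetical, and the log-scale mean and SD are simply assumed to be known here rather than estimated from the censored data.

```python
import math
import random
from statistics import NormalDist


def impute_below_dl(values, detection_limit, mu, sigma, rng):
    """Replace None (non-detects) with draws from a log-normal truncated at the DL.

    mu and sigma are the mean and SD of log-concentration; in this sketch they
    are assumed to be known (hypothetical values), not estimated from the data.
    """
    std_normal = NormalDist()
    # Probability mass of the log-normal distribution lying below the detection limit.
    p_below = std_normal.cdf((math.log(detection_limit) - mu) / sigma)
    filled = []
    for v in values:
        if v is not None:
            filled.append(v)  # measured values are kept unchanged
            continue
        # Inverse-CDF sampling restricted to probabilities in (0, p_below].
        u = max(rng.uniform(0.0, p_below), 1e-12)
        filled.append(math.exp(mu + sigma * std_normal.inv_cdf(u)))
    return filled


rng = random.Random(2015)
measured = [35.2, None, 112.0, None, 64.8, None]  # ng/g; None = below the DL
# Ten completed data sets, mirroring the number used in the text.
imputed_sets = [
    impute_below_dl(measured, detection_limit=20.8, mu=3.5, sigma=1.0, rng=rng)
    for _ in range(10)
]
```

Each completed data set keeps the measured concentrations and differs only in the values drawn for the non-detects, which is what allows sensitivity to the imputation to be checked across data sets.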
A total of 1,180 subjects with complete dust analysis results and covariate values were included in this analysis. The sample included 508 (43%) controls and 672 (57%) cases, and was predominantly white (88%) with an average age of 60 years (SD = 11.2). Of these 1,180 subjects, 202 (17%) were from the Detroit study site, 340 (29%) from Iowa, 292 (25%) from Los Angeles, and 346 (29%) from Seattle.
Statistical analysis. In previous analyses of individual chemicals in the study population overall, we evaluated NHL risk by comparing tertiles or other groupings of levels above the detection limit to those with no detectable level of the chemical (Colt et al. 2005, 2006; Hartge et al. 2005). Study site–specific risk estimates were not presented in these publications. Here, we used a weighted quantile sum (WQS) approach in conjunction with nonlinear logistic regression to evaluate the effect of several chemical exposures together on the risk of NHL. Exposure to a mixture of 27 chemicals measured in house dust was evaluated overall and in study site–specific models. All models were adjusted for sex, age at diagnosis (cases)/selection date (controls), race, and level of education. Age was treated as continuous, race was dichotomized as white or non-white, and education was treated as ordinal (grouped as < 12, 12–15, and ≥ 16 years). In the overall model, we also adjusted for study site.
The WQS method (Carrico et al. 2014) is constrained to have associations in the same direction for chemical exposures and risk, and is designed for variable selection over prediction. WQS regression estimates a weighted linear index in which the weights are empirically determined through the use of bootstrap sampling. The approach considers data with $c$ correlated components scored as ordinal variables into quantiles (here, quartiles) that are reasonable to combine (i.e., all chemicals) into an index and potentially have a common adverse outcome. The weights are constrained to sum to 1 and be between 0 and 1, thereby reducing dimensionality and addressing issues associated with collinearity. For this analysis, the $c = 27$ chemical concentrations were scored into quartiles based on the case and control data combined and denoted by $q_i$, where $q_i = 0, 1, 2,$ or $3$ for $i = 1$ to $c$. A total of $B = 100$ bootstrap samples (of the same size as the total sample, $n = 1{,}180$) were generated from the full data set and used to estimate the unknown weights, $w$, that maximized the likelihood for $b = 1$ to $B$ for the following model:

$$g(\mu) = \beta_0 + \beta_1\left(\sum_{i=1}^{c} w_i q_i\right) + z'\varphi, \qquad [1]$$

subject to the constraints $\left.\sum_{i=1}^{c} w_i\right|_b = 1$ and $0 \le w_i \le 1$ for $i = 1$ to $c$. In the above equation, $w_i$ represents the weight for the $i$th chemical component $q_i$, and the term $\sum_{i=1}^{c} w_i q_i$ represents a weighted index for the set of $c$ chemicals of interest. Furthermore, $z$ denotes a vector of covariates determined prior to estimation of the weights, $\varphi$ are the coefficients for the covariates in $z$, and $g(\cdot)$ is any monotonic and differentiable link function that relates the mean, $\mu$, to the predictor variables on the right-hand side of the equation. Because the outcome variable of interest in this analysis is binary (case status), a logit link was assumed for $g$.
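The quartile scoring and the weighted index can be sketched as follows. This is a minimal illustration only: the function names, the three toy chemicals, and the weights are hypothetical, the weights are supplied by hand rather than estimated by constrained maximum likelihood, and the cut points use the default method of Python's `statistics.quantiles`, which may differ from the quantile definition used in the study.

```python
from bisect import bisect_left
from statistics import quantiles


def quartile_scores(concentrations):
    """Score each value 0-3 by quartile of the pooled (case + control) data."""
    cuts = quantiles(concentrations, n=4)  # three cut points
    # The score is the number of cut points lying strictly below the value.
    return [bisect_left(cuts, x) for x in concentrations]


def weighted_index(scores_by_chemical, weights):
    """Return sum_i w_i * q_i for each subject."""
    if any(w < 0 or w > 1 for w in weights) or abs(sum(weights) - 1) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    n_subjects = len(scores_by_chemical[0])
    return [
        sum(w * q[s] for w, q in zip(weights, scores_by_chemical))
        for s in range(n_subjects)
    ]


# Toy data: 3 hypothetical chemicals measured on 8 subjects (ng/g).
chem = [
    [5, 12, 30, 44, 51, 67, 80, 95],
    [9, 1, 7, 3, 8, 2, 6, 4],
    [100, 200, 150, 400, 50, 300, 250, 350],
]
q = [quartile_scores(c) for c in chem]
index = weighted_index(q, weights=[0.5, 0.3, 0.2])  # hypothetical weights
```

Because the weights are non-negative and sum to 1, the index for every subject stays on the same 0 to 3 scale as the individual quartile scores.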
For each bootstrap sample, the p-value of $\beta_1$, the parameter estimate for the weighted index, was used to evaluate the statistical significance of the estimated vector of weights (α = 0.10). The weighted quantile score was then estimated as

$$\mathrm{WQS} = \sum_{i=1}^{c} \bar{w}_i q_i,$$

where $\bar{w}_i = \frac{1}{n_B}\sum_{b=1}^{n_B} w_{i(b)}$, $w_{i(b)}$ is the weight for the $i$th chemical estimated in the $b$th of these samples, and $n_B$ is the number of bootstrap samples in which $\beta_1$ was significant. Finally, the significance of the WQS index was determined using the original data set and the model

$$g(\mu) = \beta_0 + \beta_1\,\mathrm{WQS} + z'\varphi, \qquad [2]$$

where $\exp(\beta_1)$ is the odds ratio (OR) associated with a unit (quartile) increase in the weighted sum of exposure quartiles (WQS index).
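The averaging step that turns bootstrap weight vectors into a single WQS index can be sketched as below. The function names and the four bootstrap results are hypothetical and stand in for the output of the constrained model fits, which are not reproduced here.

```python
def average_significant_weights(bootstrap_fits, alpha=0.10):
    """Average weight vectors over bootstrap fits whose p-value for beta1 is < alpha."""
    kept = [w for w, p in bootstrap_fits if p < alpha]
    n_b = len(kept)
    if n_b == 0:
        raise ValueError("no bootstrap sample met the significance threshold")
    n_chem = len(kept[0])
    mean_w = [sum(w[i] for w in kept) / n_b for i in range(n_chem)]
    return mean_w, n_b


def wqs_index(mean_w, quartile_scores):
    """WQS for one subject: sum_i mean_w[i] * q_i."""
    return sum(w * q for w, q in zip(mean_w, quartile_scores))


# Hypothetical (weights, p-value) pairs: three chemicals, four bootstrap fits.
fits = [
    ([0.6, 0.3, 0.1], 0.02),
    ([0.4, 0.4, 0.2], 0.08),
    ([0.1, 0.1, 0.8], 0.45),  # dropped: beta1 not significant at 0.10
    ([0.5, 0.2, 0.3], 0.01),
]
mean_w, n_b = average_significant_weights(fits)
score = wqs_index(mean_w, [3, 1, 0])  # one subject's quartile scores
```

Since every retained weight vector sums to 1, the averaged weights also sum to 1, so the resulting index remains interpretable per quartile increase in the final model.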
Weights estimated from the full data set were used to create a WQS index denoted as $\mathrm{WQS}_F$. In addition to $\mathrm{WQS}_F$, four site-specific indices [denoted as $\mathrm{WQS}_D$ (Detroit), $\mathrm{WQS}_I$ (Iowa), $\mathrm{WQS}_L$ (Los Angeles), and $\mathrm{WQS}_S$ (Seattle)] were estimated using data from each site. Differences in the distributions of the chemical concentrations across sites prohibited the use of quantiles based on the full data set in the estimation of site-specific weights; therefore, we used site-specific quartiles based on the combined case–control distribution to estimate site-specific indices. The association between the WQS indices and NHL was examined by testing each index within its respective data set, with statistical significance set at α = 0.05. The primary statistical analysis was performed using one randomly selected imputation data set. A secondary analysis estimated WQS indices for all 10 imputed data sets to assess sensitivity of the results to the data imputation.
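The reason for using site-specific cut points can be illustrated with a toy example in which two sites have very different concentration ranges. The site labels "A" and "B", the values, and the function names are hypothetical, and the quartile definition is the default of Python's `statistics.quantiles`, which is an assumption rather than the study's method.

```python
from bisect import bisect_left
from collections import defaultdict
from statistics import quantiles


def pooled_scores(records):
    """Quartile scores (0-3) from cut points estimated on all sites combined."""
    cuts = quantiles([v for _, v in records], n=4)
    return [bisect_left(cuts, v) for _, v in records]


def site_specific_scores(records):
    """Quartile scores (0-3) from cut points estimated separately within each site."""
    by_site = defaultdict(list)
    for site, value in records:
        by_site[site].append(value)
    cuts = {site: quantiles(vals, n=4) for site, vals in by_site.items()}
    return [bisect_left(cuts[site], value) for site, value in records]


# Hypothetical concentrations (ng/g): site "B" is roughly ten times site "A".
low = [("A", v) for v in (10, 20, 30, 40, 50, 60, 70, 80)]
high = [("B", v) for v in (100, 200, 300, 400, 500, 600, 700, 800)]
records = low + high
pooled = pooled_scores(records)
by_site = site_specific_scores(records)
```

With pooled cut points, every subject from the low-concentration site falls in the two lowest categories, leaving no high-exposure group within that site; site-specific cut points restore the full 0 to 3 range in each site.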
We conducted further analyses of major subtypes of NHL: diffuse large B-cell lymphoma (DLBCL), follicular lymphoma, small lymphocytic lymphoma/chronic lymphocytic leukemia (SLL/CLL), marginal zone lymphomas, other lymphomas, and lymphomas where subtype was not specified/unknown [not otherwise specified (NOS)]. Our study primarily included SLL rather than CLL (Morton et al. 2008). Other lymphomas consisted of mantle cell lymphoma, lymphoplasmacytic lymphoma, Burkitt lymphoma/leukemia, mycosis fungoides/Sézary syndrome, and peripheral T-cell lymphoma. We fitted WQS regression models separately for each of these groups, using all 508 controls in each model, to determine whether the mixture effect varied by subtype.
As a comparison to the WQS regression results, we also conducted single chemical analyses (one-by-one) for all of the data (adjusted for study site) and separately within each study site using study site–specific cut points based on the distributions among cases and controls combined. Models were adjusted for sex, age, race, and level of education. ORs comparing each of the three highest quartiles to the first quartile of exposure were estimated for each individual chemical. Given the exploratory nature of these analyses, no adjustments were made for multiple comparisons.
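For orientation, the quartile comparison for a single chemical can be sketched with crude (unadjusted) odds ratios. This is a simplification: the estimates described above come from models adjusted for sex, age, race, and education (and study site in the overall analysis), which this sketch does not attempt. The function name and the case and control counts are hypothetical.

```python
def crude_quartile_odds_ratios(scores, is_case):
    """Crude ORs for quartiles 2-4 (scores 1-3) versus quartile 1 (score 0)."""
    cases = [0, 0, 0, 0]
    controls = [0, 0, 0, 0]
    for q, case in zip(scores, is_case):
        if case:
            cases[q] += 1
        else:
            controls[q] += 1
    ref_odds = cases[0] / controls[0]  # odds of being a case in the first quartile
    return {q + 1: (cases[q] / controls[q]) / ref_odds for q in (1, 2, 3)}


# Hypothetical counts of cases and controls in quartiles 1 through 4.
case_counts = [20, 25, 30, 40]
control_counts = [30, 30, 25, 25]
scores, is_case = [], []
for q in range(4):
    scores += [q] * (case_counts[q] + control_counts[q])
    is_case += [True] * case_counts[q] + [False] * control_counts[q]

odds_ratios = crude_quartile_odds_ratios(scores, is_case)
```

The returned dictionary is keyed by quartile number (2, 3, 4), each compared with the first quartile as the reference group.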