The goal of scale revision was to identify a comprehensive, reproducible, and valid set of scales measuring concerns relevant to long-term cancer survivorship, with each scale composed of a set of internally consistent items. To achieve this end, our strategy was 1) to extract scales that were based on the IOC questionnaire items by use of exploratory factor analysis (39, 40); 2) to perform split-sample cross-validation to assess reproducibility of the scales across subsamples (40); and 3) to conduct psychometric evaluation to assess the construct and concurrent validity of the proposed scales (41).
Exploratory factor analyses were conducted by use of the FACTOR procedure in SAS version 9.1 software (SAS Institute, Inc., Cary, NC). To decrease the dependence of our findings on any particular factor analytic technique, we used three methods of factor extraction (principal components, maximum likelihood, and unweighted least squares) and two methods for selecting the number of factors [the Kaiser-Guttman criterion of retaining factors with eigenvalues greater than 1 (42, 43) and the Cattell scree plot technique (44)]. We retained only those items that had factor loadings of greater than 0.50 under all approaches and that loaded on factors with a clear interpretation. After factor extraction, we conducted factor rotation, an algorithmic procedure that achieves simplified factor structure by optimizing the grouping of items with common characteristics onto common factors. Because factors were expected to be correlated, we used the oblique promax rotation procedure (45).
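The two mechanical rules in this step (the eigenvalue-greater-than-1 criterion and the requirement that an item load above 0.50 under every extraction method) can be illustrated with a minimal sketch. It is not the SAS code used in the analysis. The function names, the example eigenvalues, and the item labels are hypothetical, and the use of the absolute value of the loading is an assumption. The further requirement that factors have a clear interpretation is a matter of judgment and is not represented here.

```python
def kaiser_guttman_count(eigenvalues):
    """Number of factors retained under the eigenvalue-greater-than-1 rule."""
    return sum(1 for value in eigenvalues if value > 1.0)


def retained_items(loadings_by_method, threshold=0.50):
    """Items whose primary loading exceeds the threshold under every method.

    loadings_by_method maps an extraction method name to a dict of
    {item label: primary factor loading}. Assumption: the absolute value
    of the loading is compared with the threshold.
    """
    methods = list(loadings_by_method.values())
    if not methods:
        return []
    # Only items reported by every method can satisfy the rule.
    common = set(methods[0])
    for loadings in methods[1:]:
        common &= set(loadings)
    return sorted(
        item
        for item in common
        if all(abs(loadings[item]) > threshold for loadings in methods)
    )


# Hypothetical values, for illustration only.
example_eigenvalues = [3.2, 1.4, 0.9, 0.5]
example_loadings = {
    "principal_components": {"item_a": 0.72, "item_b": 0.55, "item_c": 0.61},
    "maximum_likelihood": {"item_a": 0.68, "item_b": 0.48, "item_c": 0.58},
    "unweighted_least_squares": {"item_a": 0.70, "item_b": 0.52, "item_c": 0.57},
}
n_factors = kaiser_guttman_count(example_eigenvalues)
kept = retained_items(example_loadings)
```

In this illustration, `item_b` would be dropped because its loading falls below 0.50 under one of the three methods, even though it exceeds the threshold under the other two.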
The reproducibility of factor structure across subsamples was assessed by use of the targeted rotation method of McCrae et al (46). This method tests the hypothesis that the factor structure represented in the first sample is replicated in the second sample by extracting the hypothesized number of factors from the second sample, performing a targeted rotation to align the axes in the second factor structure with the axes in the first factor structure (the target), and calculating coefficients of congruence that quantify the fit between the two factor structures. Congruence coefficients compare two sets of factor loadings (item–factor correlations) in terms of both the pattern and magnitude of the loadings and can range from +1 (perfect agreement) to –1 (perfect inverse agreement). The observed congruences are compared with critical values generated by use of Monte Carlo techniques to determine the statistical significance of the fit. We defined a statistically significant congruence as a congruence higher than 95% of congruences obtained by rotating the second factor structure to align with axes in randomly generated target factor structures. For this analysis, we used the SAS Interactive Matrix Language program provided as an appendix in McCrae et al (46).
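As a rough illustration of the congruence coefficient itself, the sketch below assumes the usual Tucker form, that is, the sum of cross-products of two loading vectors divided by the square root of the product of their sums of squares. It covers only this final computation and does not reproduce the targeted rotation, the Monte Carlo critical values, or the SAS program cited above. The function name and example loadings are hypothetical.

```python
from math import sqrt


def congruence_coefficient(loadings_a, loadings_b):
    """Congruence between two loading vectors (assumed Tucker form)."""
    if len(loadings_a) != len(loadings_b):
        raise ValueError("loading vectors must have the same length")
    cross_products = sum(a * b for a, b in zip(loadings_a, loadings_b))
    sum_sq_a = sum(a * a for a in loadings_a)
    sum_sq_b = sum(b * b for b in loadings_b)
    if sum_sq_a == 0 or sum_sq_b == 0:
        raise ValueError("a loading vector of all zeros has no defined congruence")
    return cross_products / sqrt(sum_sq_a * sum_sq_b)


# Hypothetical loadings of the same three items on one factor in two subsamples.
first_sample = [0.70, 0.60, 0.10]
second_sample = [0.65, 0.62, 0.15]
phi = congruence_coefficient(first_sample, second_sample)
```

Identical loading vectors give a value of +1, and a vector compared with its own negation gives -1, which matches the range described above.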
Psychometric evaluation included computation of Cronbach's coefficient alpha statistic for each scale as a measure of internal consistency reliability (47). Scales are generally considered reliable if the alpha statistic exceeds 0.70 (48). We also computed the coefficient delta or delta statistic, an index of the ability of a scale to discriminate among individuals (49). The delta statistic can range from 0, corresponding to all respondents giving the same response, to 1, corresponding to a maximally discriminating scale in which responses are uniformly distributed across the range of possible values (49, 50).
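Both statistics can be sketched in a few lines. The alpha function uses the standard formula based on item variances and the variance of the total score. The delta function assumes a common general form of the coefficient, delta = K(N^2 - sum of squared score frequencies) / ((K - 1)N^2), where K is the number of possible score values and N is the number of respondents; this form is an assumption and may differ in detail from the computation used in the analysis. Function names and data are hypothetical.

```python
from collections import Counter
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; each row holds one respondent's item responses."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def coefficient_delta(scores, n_levels):
    """Coefficient delta for scale scores with n_levels possible values.

    Returns 0 when every respondent has the same score and 1 when scores
    are spread uniformly over all possible values (assumed general form).
    """
    n = len(scores)
    sum_squared_freq = sum(count * count for count in Counter(scores).values())
    return n_levels * (n * n - sum_squared_freq) / ((n_levels - 1) * n * n)


# Hypothetical responses of three respondents to a three-item scale.
responses = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(responses)
delta_uniform = coefficient_delta([1, 2, 3, 4], n_levels=4)
```

The two boundary cases of the delta function correspond to the limits described above: identical scores for all respondents and a uniform distribution of scores.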
The validity of the scales was evaluated by use of several strategies. Face validity was evaluated by examining item content. Construct validity, including convergent and discriminant validity, was evaluated by examining the Pearson product-moment correlation coefficients (r) among the scale scores and the cross-sectional patterns of relationships between the scale scores and the sociodemographic, medical, and treatment characteristics of the sample. For the latter, scale scores were examined for differences, or lack thereof, across age, years since diagnosis, partnered status, breast-conserving surgery vs mastectomy, chemotherapy status, general health status, number of comorbidities, body mass index, adjuvant hormonal therapy use, and current antidepressant use for depression or anxiety. These analyses used correlation coefficients for continuous variables and analysis of variance (ANOVA) for categorical variables. Concurrent validity was evaluated by forming a priori hypotheses about patterns of association and correlating the scale scores with the CES-D scores and the BCPT symptom scale total and subscale scores. When evaluating the quantitative significance of correlations, we considered an |r| of less than 0.30 to indicate a negligible association, |r| between 0.30 and 0.45 to indicate a moderate association, |r| between 0.45 and 0.60 to indicate a substantial association, and |r| greater than 0.60 to indicate a strong association (51). In the validity analyses, we used a P value of less than .005 as the critical value for statistical significance to account for the large sample size and multiple comparisons. All P values and tests of statistical significance were two-sided.
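The magnitude categories for correlations can be expressed as a small lookup, sketched below. The text does not say how values falling exactly on 0.45 or 0.60 are assigned; placing both in the "substantial" category is an assumption made here for illustration, and the function name is hypothetical.

```python
def describe_correlation(r):
    """Label the magnitude of a correlation using the cutoffs in the text.

    Assumption: |r| of exactly 0.45 or exactly 0.60 is labeled "substantial".
    """
    magnitude = abs(r)
    if magnitude > 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    if magnitude < 0.30:
        return "negligible"
    if magnitude < 0.45:
        return "moderate"
    if magnitude <= 0.60:
        return "substantial"
    return "strong"


# Hypothetical correlations; the sign is ignored when labeling magnitude.
labels = [describe_correlation(r) for r in (0.10, -0.35, 0.50, -0.75)]
```

Because the label depends only on |r|, an inverse association of -0.75 is described as strong in the same way as a positive association of 0.75.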
We computed scores for both higher-order scales and subscales as the mean of non-missing items that composed the scale. Scores were considered missing if more than 50% of items were missing.
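The scoring rule can be sketched as follows, assuming that missing item responses are represented as `None` (an assumption about data representation, not something stated above). The function name is hypothetical.

```python
def scale_score(item_responses, max_missing_fraction=0.5):
    """Mean of the non-missing items, or None if too many items are missing.

    Missing responses are represented as None (an assumption). The score is
    treated as missing when more than max_missing_fraction of the items are
    missing, so a scale with exactly half of its items answered is still scored.
    """
    if not item_responses:
        return None
    observed = [value for value in item_responses if value is not None]
    missing_fraction = (len(item_responses) - len(observed)) / len(item_responses)
    if missing_fraction > max_missing_fraction:
        return None
    return sum(observed) / len(observed)


# Hypothetical responses to a four-item scale with two items unanswered.
score = scale_score([1, 2, None, None])
```

With exactly 50% of items missing, as in the example, the score is still computed from the remaining items; it becomes missing only when more than half of the items are unanswered.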