To explore the geographic patterns of our two outcomes of interest, we used the spatial scan statistic [15] to detect geographic clusters of each outcome and evaluate their statistical significance. This method imposes a very large number of overlapping circles of different locations and sizes on the map, each of which is a potential cluster, and adjusts for the multiple testing inherent in the many circles considered.
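As an illustration of how such overlapping circles can be enumerated, the following is a minimal sketch and not the SaTScan implementation. It assumes circles are centered on block group centroids given in planar (projected) coordinates, and that a window grows one neighbor at a time until a population cap is reached. The function name `candidate_windows` and the parameter `max_fraction` are hypothetical.

```python
import math

def candidate_windows(centroids, populations, max_fraction=0.5):
    """Enumerate circular candidate windows centered on each centroid.

    Assumption: planar coordinates, so Euclidean distance is adequate.
    Each window is returned as (center index, radius, member indices).
    """
    total = sum(populations)
    windows = []
    for i, (xi, yi) in enumerate(centroids):
        # Distance from the center to every centroid (including itself).
        dist = [math.hypot(x - xi, y - yi) for x, y in centroids]
        order = sorted(range(len(centroids)), key=lambda j: dist[j])
        members, pop = [], 0
        for j in order:
            pop += populations[j]
            if pop > max_fraction * total:
                break  # window would exceed the population cap
            members.append(j)
            windows.append((i, dist[j], tuple(members)))
    return windows

# Hypothetical toy data: four areas on a line, equal populations.
windows = candidate_windows([(0, 0), (1, 0), (5, 0), (6, 0)], [10, 10, 10, 10])
```

Each area contributes a family of nested circles, which is why the number of candidate clusters grows quickly and a multiple-testing adjustment is needed.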
Our cluster detection method identified clusters of both high and low rates, with a maximum scanning window size to include up to 50% of the population at risk. Secondary clusters were reported if they had no geographic overlap with more likely clusters. P-values were derived from 999 simulated Monte Carlo replications under the null hypothesis of spatial randomness of outcomes of interest.
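The Monte Carlo p-value can be sketched as a rank-based calculation. This is a minimal illustration, assuming the test statistic for the observed data is compared with the maximum statistic from each simulated data set; the function name `monte_carlo_p_value` is hypothetical.

```python
def monte_carlo_p_value(observed_stat, simulated_stats):
    """Rank-based p-value: share of data sets (observed plus simulated)
    whose statistic is at least as large as the observed one."""
    n_at_least = sum(1 for s in simulated_stats if s >= observed_stat)
    return (n_at_least + 1) / (len(simulated_stats) + 1)

# With 999 replications, the smallest attainable p-value is 1/1000.
p_min = monte_carlo_p_value(10.0, [1.0] * 999)
```

Because the observed data set is counted among the 1,000 in the denominator, 999 replications yield p-values in steps of 0.001.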
We conducted three separate cluster detection analyses for each of the two prostate cancer outcomes: higher histologic grade of tumor, and later stage at diagnosis. In the unadjusted analysis, under the null hypothesis, the expected number of more aggressive grade or late stage cases in a block group was calculated by multiplying the total case population of the block group by the statewide rate of the outcome of interest. Thus, in the unadjusted analysis, a block group would be expected to have the same rate or proportion of late stage or high grade cases in its case population as the State. In the two adjusted analyses, the expected number of aggressive grade or later stage cases was calculated from a regression model containing individual case characteristics, or from a regression model with both individual and area-level covariates. Based on the expected counts, the number of aggressive grade and later stage cases in each block group was modeled as a Poisson distribution.
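The unadjusted expected counts and the Poisson-based comparison of observed and expected counts can be sketched as follows. This is an illustration under stated assumptions, not the code used in the analysis: the log likelihood ratio shown is the standard form of the Poisson scan statistic, which assumes the expected counts sum to the total observed count, and the function names and toy numbers are hypothetical.

```python
import math

def expected_counts(case_totals, outcome_counts):
    """Unadjusted expectation: block group case total times statewide rate."""
    statewide_rate = sum(outcome_counts) / sum(case_totals)
    return [n * statewide_rate for n in case_totals]

def poisson_llr(observed, expected, total):
    """Log likelihood ratio for one candidate window.

    observed, expected: outcome counts inside the window.
    total: outcome count over the whole study area.
    Assumes 0 < expected < total.
    """
    if observed == expected:
        return 0.0

    def term(count, ratio):
        # Convention 0 * log(0) = 0, so log is never called on zero.
        return 0.0 if count == 0 else count * math.log(ratio)

    inside = term(observed, observed / expected)
    outside = term(total - observed, (total - observed) / (total - expected))
    return inside + outside

# Hypothetical toy data: three block groups.
exp = expected_counts([10, 20, 30], [5, 4, 3])
llr = poisson_llr(5, exp[0], 12)
```

For the adjusted analyses, the same likelihood ratio would apply with the expected counts taken from the regression model rather than from the statewide rate.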
For the unadjusted analyses, we also used a Bernoulli model to compare the distribution of so-called "cases" (those with aggressive grade or late stage) to "controls" (less aggressive grade or early stage) based on the point location of each residential address, rather than on rates within block groups. This allowed us to assess whether the results were sensitive to the use of the Poisson model on aggregated data, relative to the Bernoulli model on unaggregated data. No major differences in results were found, and to allow proper comparison between the adjusted and unadjusted analyses, the Poisson model results are presented for all three types of analyses.
For each cluster identified, we list the radius, the number of block groups in the cluster, the observed versus expected number of late stage or aggressive grade cases, the relative risk, and the p-value. The relative risk is the risk of the respective outcome within the cluster, compared to the population's risk. We report clusters with statistical significance p < 0.05 that do not overlap with another reported cluster with a lower p-value. Calculations were done using the freely available SaTScan v4.0 software.
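The reporting rule can be sketched as a simple filter. This is a minimal illustration with assumptions: the relative risk is taken as observed divided by expected, since the expected count already reflects the population's risk, and two clusters are treated as overlapping when they share at least one block group (overlap could also be defined on the circles themselves). The dictionary keys and cluster values are hypothetical.

```python
def relative_risk(observed, expected):
    """Risk inside the cluster relative to the population's risk."""
    return observed / expected

def select_reported(clusters, alpha=0.05):
    """Keep significant clusters that do not overlap a cluster
    already kept with a lower p-value."""
    kept = []
    for cl in sorted(clusters, key=lambda c: c["p"]):
        if cl["p"] >= alpha:
            continue
        if all(cl["block_groups"].isdisjoint(k["block_groups"]) for k in kept):
            kept.append(cl)
    return kept

# Hypothetical clusters, for illustration only.
clusters = [
    {"id": "A", "p": 0.001, "block_groups": {1, 2, 3}, "obs": 30, "exp": 20.0},
    {"id": "B", "p": 0.020, "block_groups": {3, 4}, "obs": 12, "exp": 8.0},
    {"id": "C", "p": 0.030, "block_groups": {7, 8}, "obs": 5, "exp": 10.0},
    {"id": "D", "p": 0.200, "block_groups": {9}, "obs": 4, "exp": 3.0},
]
reported = select_reported(clusters)
risks = {cl["id"]: relative_risk(cl["obs"], cl["exp"]) for cl in reported}
```

In this toy example, cluster B is dropped because it shares a block group with the more significant cluster A, and cluster D is dropped because it is not significant; cluster C is a low-rate cluster with a relative risk below 1.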