To determine whether the treatment effect was exaggerated at the CI’s site, the outcome data for participants recruited at this site were averaged for each trial, as were the data for participants from the remaining sites. The primary outcome for both trials was the Oxford Shoulder Score (OSS),17 a shoulder-specific patient-reported outcome measure with total scores ranging from 0 (worst outcome) to 48 (best outcome); the OSS at the one-year follow-up was therefore the outcome analyzed in this study. The target difference between treatment groups was set at five points for comparisons between surgical and non-surgical options, and at four points for comparisons between surgical options. Data were analyzed by forest plot. To test for the presence of an early bias in treatment effect over time, the mean outcome for each quintile of randomized patients was calculated and analyzed by forest plot. For both analyses, a fixed effects model was used and heterogeneity was assessed with the I² value. As UK FROST had three arms, separate analyses were carried out to compare all treatments. Review Manager (RevMan) 5 was used to undertake these analyses. The site comparison was also repeated for the first five sites to open versus the remaining sites.
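The fixed effects pooling and I² calculation described above can be illustrated with a minimal sketch. It assumes the standard inverse-variance method for a mean difference; it is not the RevMan implementation, and the names `Arm`, `mean_difference`, and `fixed_effect_pool`, as well as all numbers in the example, are hypothetical and do not come from either trial.

```python
import math
from dataclasses import dataclass


@dataclass
class Arm:
    """Summary OSS data for one treatment arm within one subgroup."""
    mean: float
    sd: float
    n: int


def mean_difference(a: Arm, b: Arm) -> tuple[float, float]:
    """Return the mean difference (a minus b) and its standard error."""
    md = a.mean - b.mean
    se = math.sqrt(a.sd ** 2 / a.n + b.sd ** 2 / b.n)
    return md, se


def fixed_effect_pool(estimates: list[tuple[float, float]]) -> dict:
    """Inverse-variance fixed effects pooling of (estimate, SE) pairs.

    Also returns Cochran's Q and I^2 (as a percentage, floored at 0).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_w = sum(weights)
    pooled = sum(w * md for w, (md, _) in zip(weights, estimates)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    q = sum(w * (md - pooled) ** 2 for w, (md, _) in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "se": pooled_se, "q": q, "i2": i2}


# Hypothetical example: CI's site versus the remaining sites (made-up values).
ci_site = mean_difference(Arm(41.0, 9.0, 30), Arm(38.0, 10.0, 30))
other_sites = mean_difference(Arm(39.5, 9.5, 95), Arm(39.0, 9.0, 95))
result = fixed_effect_pool([ci_site, other_sites])
low = result["pooled"] - 1.96 * result["se"]
high = result["pooled"] + 1.96 * result["se"]
print(f"Pooled MD {result['pooled']:.2f} (95% CI {low:.2f} to {high:.2f}), "
      f"I2 = {result['i2']:.0f}%")
```

In such a sketch, each subgroup (CI's site and remaining sites, or each quintile) contributes one row to the forest plot, and a high I² would suggest that the treatment effect differs between subgroups more than chance alone would explain.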
To examine the presence of selection bias, we explored whether there were differences in age or in predictors of poor outcome between the patients who were randomized and those who either did not consent or were ineligible to take part. For age, the mean and standard deviation (SD) were calculated for the trial participants, for patients who were ineligible, for patients who were eligible but did not consent, and for the latter two groups combined. For the ProFHER trial, the predictor of poor outcome was whether either tuberosity (a rounded prominence) of the humerus was involved in the fracture;15 for UK FROST it was diabetic status.16 The percentage of individuals who had tuberosity involvement or were diabetic, for the respective trials, was calculated for the same four groups: trial participants, ineligible patients, eligible but non-consenting patients, and the latter two groups combined. To assess whether these characteristics changed over time, the participants were ordered by randomization date and split into quintiles (i.e. five equal groups), and each quintile was analyzed as above. The non-consenting and ineligible patients were combined and ordered by date of eligibility so that their quintiles matched the same time periods as those of the recruited group.
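The quintile split and per-group summaries described above could be sketched as follows. This is an illustration only: the `Patient` fields and the function names are hypothetical, and the rule used to match non-participants to time periods (assigning each to the participant quintile whose date range contains their eligibility date) is an assumption about how the matching might be done.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean, stdev


@dataclass
class Patient:
    event_date: date       # randomization date, or eligibility date for non-participants
    age: float
    poor_predictor: bool   # e.g. tuberosity involvement or diabetic status


def split_quintiles(participants: list[Patient]) -> list[list[Patient]]:
    """Order participants by date and split them into five near-equal groups."""
    if len(participants) < 5:
        raise ValueError("need at least five participants to form quintiles")
    ordered = sorted(participants, key=lambda p: p.event_date)
    n = len(ordered)
    return [ordered[i * n // 5:(i + 1) * n // 5] for i in range(5)]


def assign_to_periods(others: list[Patient],
                      quintiles: list[list[Patient]]) -> list[list[Patient]]:
    """Place non-participants in the time period of the matching quintile.

    Assumption: the last randomization date in each of the first four
    quintiles is used as the upper bound of that period.
    """
    cutoffs = [q[-1].event_date for q in quintiles[:-1]]
    groups: list[list[Patient]] = [[] for _ in range(5)]
    for p in others:
        idx = sum(p.event_date > c for c in cutoffs)
        groups[idx].append(p)
    return groups


def summarise(group: list[Patient]) -> dict:
    """Return n, mean age, SD of age, and percentage with the predictor."""
    ages = [p.age for p in group]
    return {
        "n": len(group),
        "age_mean": mean(ages) if ages else None,
        "age_sd": stdev(ages) if len(ages) > 1 else None,
        "pct_predictor": (100.0 * sum(p.poor_predictor for p in group) / len(group)
                          if group else None),
    }
```

Calling `summarise` on each participant quintile and on the corresponding group returned by `assign_to_periods` would give the paired figures needed to compare recruited and non-recruited patients within each time period.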