Given the phrasing of each prognostic factor, a factor was described as protective (i.e. facilitating recovery) in only one case: regular physical activity in non-traumatic neck pain. The confidence in each association was categorized using an approach adapted from the GRADE working group [19]: high, moderate, low or very low confidence that the direction of association is robust to findings in future research. In an attempt to be conservative, high confidence was reserved for only those predictors for which consistent high-quality evidence was presented in each SR, with at least 1 high-quality SR and no conflicting SRs. Moderate confidence required consistent high-level findings from at least 1 recent medium-quality SR, with the majority of findings from other concurrent SRs (where applicable) in the same direction of effect. Low confidence was assigned to a predictor when summary findings were of low to moderate level from the majority of SRs with some conflicting results, or when only a single SR reported significant but moderate findings for that predictor. Very low confidence was assigned when none of the above conditions were met. As a result of these algorithms, each predictor received both an estimate of its association with outcome (risk of poor outcome, no effect on outcome, inconclusive effect) and a level of confidence in that association (high, moderate, low, very low). Readers will note that this made it possible to be highly confident in an inconclusive result, a conclusion that holds meaning for establishing research priorities but less so for clinical practice.
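The confidence rules above can be read as an ordered decision procedure, checked from high confidence downward. The following Python sketch is one possible encoding and is not the tool used for this overview: the `SRFinding` fields, the string labels, the reading of "majority" as more than half, and the choice to let a high-quality SR also satisfy the "at least 1 recent medium-quality SR" condition are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SRFinding:
    """One systematic review's summary finding for a predictor (hypothetical schema)."""
    sr_quality: str      # assumed labels: "high", "medium", "low"
    evidence_level: str  # assumed labels: "high", "moderate", "low"
    direction: str       # "risk", "no_effect", or "inconclusive"
    recent: bool = True  # what counts as "recent" is left to the rater


def rate_confidence(findings: List[SRFinding], direction: str) -> str:
    """Rate confidence that `direction` is the robust direction of association."""
    if not findings:
        return "very low"
    agrees = [f.direction == direction for f in findings]
    n = len(findings)

    # High: every SR agrees with high-level evidence, and at least one SR is high quality.
    if (all(agrees)
            and all(f.evidence_level == "high" for f in findings)
            and any(f.sr_quality == "high" for f in findings)):
        return "high"

    # Moderate: one recent SR of at least medium quality (assumption) reports
    # high-level findings, and most other SRs, if any, point the same way.
    for i, f in enumerate(findings):
        is_anchor = (agrees[i] and f.recent and f.evidence_level == "high"
                     and f.sr_quality in ("medium", "high"))
        if not is_anchor:
            continue
        others = agrees[:i] + agrees[i + 1:]
        if not others or sum(others) > len(others) / 2:
            return "moderate"

    # Low: most SRs agree at a low or moderate level (conflicts allowed),
    # or a single SR reports a moderate-level finding.
    weak_support = sum(1 for f, a in zip(findings, agrees)
                       if a and f.evidence_level in ("low", "moderate"))
    if n >= 2 and weak_support > n / 2:
        return "low"
    if n == 1 and agrees[0] and findings[0].evidence_level == "moderate":
        return "low"

    # Very low: none of the conditions above were met.
    return "very low"
```

Under this encoding, a single high-quality SR reporting a high-level but inconclusive finding is rated "high" for the direction `"inconclusive"`, which reproduces the case of being highly confident in an inconclusive result.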
Most SRs did not attempt to stratify the prognostic ability of a variable by outcome. This is understandable considering that there is little to no consensus on the most appropriate outcome to measure in prognostic research on neck pain [20]. Further, Walton and colleagues [6] used meta-analysis to compare the magnitude of prognostic effect between symptom-related outcomes and disability-related outcomes, and showed that the magnitude of the effect was similar in almost all cases, with older age being the only notable exception. However, two SRs did present their summarized results stratified by type of outcome [5, 16]. In most cases the magnitude of association was consistent across outcomes; where it differed, the magnitude entered into the database was the best representation of the overall reported magnitude. For example, if a predictor showed a strong association with one outcome and a limited association with another, the overall strength of association for that predictor was described in the database as moderate. This happened in only 7 of the 239 summary statements extracted; these are denoted in the supplementary tables.