Data were analyzed on a per-lesion basis and modeled as a factorial diagnostic trial with the factors “scoring system” (PI-RADS v2.0 versus PI-RADS v2.1) and “reader”. Diagnostic accuracy was assessed by the area under the ROC curve (AUC) for each combination of reader and scoring system. Because the data are measured on an ordinal scale, we applied the nonparametric ANOVA-type statistic [17] to test for differences in AUC between the two scoring systems and among the three readers, as well as for interactions between readers and scoring systems [18]. Proportions of cancerous lesions per score were compared using the two-proportion Z-test. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated after dichotomizing the PI-RADS assessment categories at a predefined cut-off value: a PI-RADS category of ≥ 3 was defined as positive. Sensitivity and specificity of the two versions were compared using McNemar’s test. PPV and NPV were compared using the test by Lange and Brunner [19].
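As an illustration of the per-lesion accuracy measures described above, the following minimal Python sketch dichotomizes ordinal scores at a cut-off of ≥ 3 and computes sensitivity, specificity, PPV and NPV, together with an empirical (Mann-Whitney) AUC in which ties count as one half. It is not the code used in this study, which was analyzed in R and SAS; the function names and the ten example lesions are hypothetical, and the ANOVA-type statistic and the Lange and Brunner test are not reproduced here.

```python
def diagnostic_metrics(scores, is_cancer, cutoff=3):
    """Sensitivity, specificity, PPV and NPV after calling score >= cutoff positive.

    Assumes every cell of the 2x2 table needed as a denominator is non-empty.
    """
    tp = fp = tn = fn = 0
    for score, cancer in zip(scores, is_cancer):
        positive = score >= cutoff
        if positive and cancer:
            tp += 1
        elif positive:
            fp += 1
        elif cancer:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def empirical_auc(scores, is_cancer):
    """Mann-Whitney estimate of the AUC for ordinal scores (ties count 0.5)."""
    pos = [s for s, c in zip(scores, is_cancer) if c]
    neg = [s for s, c in zip(scores, is_cancer) if not c]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example: ten lesions with an assessment category (1-5)
# and a histopathology reference standard.
example_scores = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5]
example_cancer = [False, False, False, False, True,
                  False, True, True, True, True]

print(diagnostic_metrics(example_scores, example_cancer))
print(empirical_auc(example_scores, example_cancer))
```

In a multi-reader setting, these functions would be applied separately to each combination of reader and scoring system.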
Results were considered statistically significant at p < 0.05. Statistical analysis was performed by F.K. and T.P. using R version 1.1.419 (www.r-project.org) and SAS version 9.4.
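The two simpler tests named in this section can be sketched as follows, using only the Python standard library. This is an illustration under stated assumptions, not the authors' analysis: the text does not say whether McNemar's test was run in its exact or chi-square form, so the exact binomial form is assumed here; the Z-test uses the pooled-variance form without continuity correction; and all function names and counts are hypothetical.

```python
import math

ALPHA = 0.05  # significance threshold stated in the text


def mcnemar_exact(b, c):
    """Two-sided exact McNemar test (assumed variant).

    b and c are the discordant pair counts, e.g. lesions positive under one
    scoring system only and lesions positive under the other only.
    """
    n = b + c
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) * 0.5 ** n
    return min(1.0, 2.0 * tail)


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided Z-test for two independent proportions (pooled variance)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical counts: 1 versus 7 discordant lesions between two versions.
p_mcnemar = mcnemar_exact(1, 7)
print(p_mcnemar, p_mcnemar < ALPHA)  # 0.0703125 False

# Hypothetical counts: 30 of 50 versus 15 of 50 cancerous lesions.
z, p_z = two_proportion_z_test(30, 50, 15, 50)
print(z, p_z, p_z < ALPHA)
```

Both functions treat observations as independent; lesions from the same patient may be correlated, which this sketch does not address.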