Our goal was to develop two equations for estimating GFR: one using serum cystatin C (hereafter referred to as the cystatin C equation) and another using both serum cystatin C and serum creatinine (hereafter referred to as the creatinine–cystatin C equation). As in our previous work, we prespecified a process for developing and validating equations (described in the Methods section in the Supplementary Appendix). In brief, we used least-squares linear regression to relate logarithm-transformed measured GFR to log serum creatinine, log serum cystatin C, age, and sex. We also used nonparametric smoothing splines to characterize the shape of the relationship of log measured GFR with log creatinine and log cystatin C and then approximated the smoothing splines by means of piecewise linear splines to represent observed nonlinearity. Other candidate variables included the other filtration marker, black race, diabetes status, and weight. The significance threshold for inclusion was P<0.01 for these additional variables and P<0.001 for pairwise interactions among variables. Models that showed improved performance with the use of prespecified criteria were evaluated in the internal validation data set for verification of the statistical significance of predictor variables and interactions. Development and internal-validation data sets were combined into one data set (hereafter referred to as the development data set) to derive final coefficients.
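The regression step described above can be sketched as follows. This is a minimal illustration, not the fitted model: the knot locations, the synthetic data, and the function names (`hinge`, `design_matrix`, `fit_log_gfr`) are all assumptions introduced here, and the smoothing-spline step used to choose the knots is omitted.

```python
import numpy as np

def hinge(x, knot):
    """Piecewise linear spline term: 0 below the knot, (x - knot) above it."""
    return np.maximum(x - knot, 0.0)

def design_matrix(log_scr, log_cys, age, female, scr_knot, cys_knot):
    """Columns: intercept, log creatinine (+ hinge), log cystatin C (+ hinge), age, sex."""
    return np.column_stack([
        np.ones_like(log_scr),
        log_scr, hinge(log_scr, scr_knot),
        log_cys, hinge(log_cys, cys_knot),
        age,
        female,
    ])

def fit_log_gfr(X, log_gfr):
    """Ordinary least-squares fit of log measured GFR on the design matrix."""
    coef, *_ = np.linalg.lstsq(X, log_gfr, rcond=None)
    return coef

# Synthetic data for illustration only (not study data).
rng = np.random.default_rng(0)
n = 500
log_scr = rng.normal(0.0, 0.5, n)
log_cys = rng.normal(0.0, 0.4, n)
age = rng.uniform(20.0, 80.0, n)
female = rng.integers(0, 2, n).astype(float)

# Hypothetical knots; in the text these come from approximating the smoothing splines.
X = design_matrix(log_scr, log_cys, age, female,
                  scr_knot=np.log(0.9), cys_knot=np.log(0.8))
log_gfr = 4.5 - 0.3 * log_scr - 0.4 * log_cys - 0.005 * age + rng.normal(0.0, 0.1, n)

coef = fit_log_gfr(X, log_gfr)
```

Each hinge column lets the slope on the log marker change at its knot while the fitted curve stays continuous, which is one common way to represent a nonlinearity with piecewise linear splines.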
In the external-validation data set, we compared the new equations with each other, with our previous creatinine equation,3 and with our prior equations involving cystatin C that were developed in populations of patients with chronic kidney disease and reexpressed for standardized cystatin C values10,11 (Table S3 in the Supplementary Appendix), as well as with the average of the CKD-EPI creatinine equation and the new cystatin C equation. We compared the performance of equations in the overall data set and in the subgroups, and final models were selected according to the ranking of the root-mean-square error overall and within subgroups, clinically significant differences, and ease of application in clinical practice.
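The ranking by root-mean-square error can be illustrated with a short sketch. It assumes the error is computed on the log scale, consistent with the log-transformed regression, although the text does not state this. The function names, the toy values, and the subgroup cutoff are hypothetical.

```python
import numpy as np

def rmse_log(measured, estimated):
    """Root-mean-square error of log(measured GFR) minus log(estimated GFR)."""
    diff = np.log(measured) - np.log(estimated)
    return float(np.sqrt(np.mean(diff ** 2)))

def rank_by_rmse(measured, estimates, mask=None):
    """Return (equation name, RMSE) pairs sorted from lowest to highest RMSE.

    `mask` optionally restricts the comparison to a subgroup.
    """
    if mask is None:
        mask = np.ones(measured.shape, dtype=bool)
    scores = {name: rmse_log(measured[mask], est[mask])
              for name, est in estimates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1])

# Toy values for illustration only (not study data).
measured = np.array([60.0, 90.0, 30.0, 120.0])
egfr_creatinine = measured * 1.3   # stand-in for a creatinine-based estimate
egfr_cystatin = measured * 1.1     # stand-in for a cystatin C-based estimate

estimates = {
    "creatinine": egfr_creatinine,
    "cystatin C": egfr_cystatin,
    # Average of the two single-marker estimates, as one of the comparators.
    "average": (egfr_creatinine + egfr_cystatin) / 2.0,
}

overall = rank_by_rmse(measured, estimates)
# Hypothetical subgroup: measured GFR below 60.
low_gfr = rank_by_rmse(measured, estimates, mask=measured < 60.0)
```

Running the same function with different masks gives the overall and within-subgroup rankings. The other selection criteria named in the text (clinically significant differences and ease of application) are judgment calls that this sketch does not attempt to encode.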