We first estimated risk-adjusted hospital mortality rates for all three procedures during 2003–04. We defined mortality as death within 30 days of operation or prior to hospital discharge. We used this definition because the 30-day cut-off is somewhat arbitrary, and a death that occurs in the hospital after major elective surgery, whenever it occurs, is almost certainly attributable to the operation itself or to complications from the surgery. We adjusted for patient age, gender, race, urgency of operation, median ZIP-code income, and coexisting medical conditions. Coexisting medical conditions were obtained from secondary diagnoses in the claims data using the methods of Elixhauser (Southern, Quan, and Ghali 2004). Using logistic regression, we estimated the expected number of deaths in each hospital and then divided the observed number of deaths by this expected number to obtain the ratio of observed to expected mortality (O/E ratio). We then multiplied the O/E ratio by the average mortality rate to obtain a risk-adjusted mortality rate for each hospital.
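The O/E calculation above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the patient-level predicted probabilities would come from the fitted logistic regression, and all values below (observed outcomes, predicted risks, the 5% average mortality rate) are hypothetical.

```python
import numpy as np

# Hypothetical data for one hospital: each patient's observed outcome
# (1 = died, 0 = survived) and their predicted probability of death from
# a logistic regression fitted across all hospitals (values illustrative).
observed = np.array([0, 0, 1, 0, 0, 0, 0, 1, 0, 0])
predicted_prob = np.array([0.02, 0.05, 0.10, 0.03, 0.04,
                           0.06, 0.02, 0.12, 0.05, 0.01])

observed_deaths = observed.sum()        # O: number of observed deaths
expected_deaths = predicted_prob.sum()  # E: sum of predicted risks

oe_ratio = observed_deaths / expected_deaths

# Scale the O/E ratio by the average mortality rate (assumed 5% here)
# to express it as a risk-adjusted mortality rate.
average_mortality = 0.05
risk_adjusted_rate = oe_ratio * average_mortality
```

With these illustrative numbers, 2 observed deaths against 0.5 expected gives an O/E ratio of 4.0 and a risk-adjusted rate of 20%.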
We next used hierarchical modeling techniques to adjust these mortality estimates for reliability (see Technical Appendix for details). Using random effects logistic regression models, we generated empirical Bayes predictions of mortality for each hospital (Morris 1983; Normand, Glickman, and Gatsonis 1997). This technique shrinks the point estimate of mortality back toward the average mortality rate, with the amount of shrinkage proportional to the reliability at each hospital. Reliability is a measure of precision and is a function of both hospital sample size (which determines “noise” variation) and the amount of true variation across hospitals (“signal”). For hospitals with low caseloads of a particular procedure, mortality rates have lower reliability and are shrunk more toward the average mortality; for hospitals with high caseloads, mortality rates are more reliable and are shrunk less. The resulting reliability-adjusted mortality is considered the best estimate of a hospital’s “true” mortality rate with each operation (Morris 1983).
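The signal-versus-noise logic can be sketched with a simple variance-ratio weight. This is an illustrative simplification under assumed variance components, not the random effects logistic models the study actually fits; the function names and all numeric inputs are hypothetical.

```python
def reliability_weight(n_cases, signal_var, noise_var_per_case):
    """Reliability = signal / (signal + noise), where the noise
    variance shrinks as caseload grows (illustrative formula only)."""
    noise_var = noise_var_per_case / n_cases
    return signal_var / (signal_var + noise_var)

def shrunk_mortality(observed_rate, avg_rate, weight):
    # Empirical-Bayes-style shrinkage: a weighted average of the
    # hospital's observed rate and the population-average rate.
    return weight * observed_rate + (1 - weight) * avg_rate

# A low-volume hospital gets a small weight (heavy shrinkage)...
w_low = reliability_weight(n_cases=10, signal_var=0.001,
                           noise_var_per_case=0.05)
# ...while a high-volume hospital gets a larger weight (light shrinkage).
w_high = reliability_weight(n_cases=500, signal_var=0.001,
                            noise_var_per_case=0.05)
```

Under these assumed variances, the 10-case hospital receives a weight of about 0.17 while the 500-case hospital receives about 0.91, so the small hospital's estimate is pulled much closer to the average.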
An underlying assumption of reliability adjustment is that hospitals provide average performance until the data are sufficiently robust to prove otherwise. For example, consider a hospital performing 10 pancreatic resections in a year with 2 deaths (observed mortality rate of 20%). Because of the small number of cases, there is considerable likelihood that this estimate of 20% is the result of chance and not truly an indication of bad performance. From the empirical Bayes perspective, the true mortality rate lies somewhere between this observed rate of 20% and the population-based rate of 5% (the average mortality rate across all hospitals). Using reliability adjustment, the observed rate of 20% is “shrunk” back toward the average rate of 5%. The degree of shrinkage is proportional to the reliability with which the mortality rate is measured: the more reliable the observed mortality rate, the more weight it is afforded. Reliability is assessed on a scale of 0 to 1, with 1 representing perfect reliability. In this case, suppose the reliability based on 10 cases is 0.15, so the remaining weight (0.85) is placed on the average mortality. Thus, the reliability-adjusted mortality for this hospital is (0.20)(0.15) + (0.05)(1−0.15) = 7.2%. To further illustrate the impact of reliability adjustment, Figure 1 shows mortality rates before and after reliability adjustment for 20 randomly selected hospitals for each of the 3 procedures in this study. After reliability adjustment, there is much less variation across hospitals, as the most extreme observations are shrunk back toward the average mortality rate.
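The worked example above reduces to a single weighted average, which can be checked directly (the three input values are taken from the example in the text):

```python
observed_rate = 0.20  # 2 deaths in 10 cases
average_rate = 0.05   # population-average mortality rate
reliability = 0.15    # weight placed on the observed rate

# Reliability-adjusted rate: weighted average of observed and average rates.
adjusted = reliability * observed_rate + (1 - reliability) * average_rate
# adjusted is 0.0725, i.e. 7.25%, reported in the text as 7.2%
```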