The GBD Cause of Death Ensemble model (CODEm) systematically tested and combined results from different statistical models according to their out-of-sample predictive validity. Results were incorporated into a weighted ensemble model as detailed in appendix 1 (section 3.1) and below. For GBD 2017, CODEm was used to estimate 192 causes of death (appendix 1 section 7). To predict the level for each cause of death, we used CODEm to systematically test a large number of functional forms and permutations of covariates [18]. Each resulting model that met the predetermined requirements for regression coefficient significance and direction was fit on 70% of the data, holding out 30% for cross-validation (appendix 1 section 3.1). Out-of-sample predictive validity of these models was assessed by use of repeated cross-validation tests on the first half of the held-out data (15% of the full dataset). Various ensemble models with different weighting parameters were created from the combination of these models, with the highest weights assigned to models with the best out-of-sample prediction error for trends and levels, as detailed in appendix 1 (section 7). Performance of these ensembles was measured as the root-mean squared error (RMSE) of the ensemble model predictions of the log of the age-specific death rates for a cause, computed on the same 15% of the data. The best-performing ensemble model was then selected and assessed against the other 15% of the data, which had been withheld from the statistical model building. CODEm was run independently by sex for each cause of death. A separate model was run for countries with 4-star or greater VR systems to avert uncertainty inflation from more heterogeneous data.
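The split-and-select procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the CODEm implementation: the function names, the random shuffling, the rank-based geometric weighting, and the parameter psi are all assumptions introduced here, since the text only says that better-performing models receive higher weights and that several weighting parameters are tried.

```python
import math
import random


def split_indices(n, seed=0):
    """Shuffle n row indices into 70% training, 15% first holdout, 15% second holdout."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(0.70 * n)
    n_test1 = round(0.15 * n)
    return idx[:n_train], idx[n_train:n_train + n_test1], idx[n_train + n_test1:]


def rmse(pred, obs):
    """Root-mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def rank_weights(errors, psi):
    """Assumed scheme: weight decays geometrically with a model's rank by error."""
    order = sorted(range(len(errors)), key=lambda i: errors[i])
    raw = [0.0] * len(errors)
    for rank, i in enumerate(order):
        raw[i] = psi ** (-rank)  # rank 0 is the model with the lowest error
    total = sum(raw)
    return [r / total for r in raw]


def ensemble_prediction(weights, preds):
    """Weighted average of component-model predictions, observation by observation."""
    n_obs = len(preds[0])
    return [sum(w * p[j] for w, p in zip(weights, preds)) for j in range(n_obs)]


def select_ensemble(preds, obs, psis=(1.0, 1.2, 1.5, 2.0)):
    """Try each hypothetical weighting parameter on the first holdout; keep the best.

    preds: one list of predictions per component model, on the first holdout set.
    obs:   observed values on the same set.
    Returns (psi, rmse_of_ensemble, weights).
    """
    errors = [rmse(p, obs) for p in preds]
    best = None
    for psi in psis:
        weights = rank_weights(errors, psi)
        score = rmse(ensemble_prediction(weights, preds), obs)
        if best is None or score < best[1]:
            best = (psi, score, weights)
    return best
```

In this sketch the ensemble chosen by select_ensemble would then be scored once more on the second 15% holdout, which played no part in fitting or selection.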
The distribution of RMSE relative to cause-specific mortality rates (CSMRs) at Level 2 of the GBD hierarchy shows that model performance was weakest for causes of death with comparatively low mortality rates (figure 2; appendix 2), while models for more common causes of death such as stroke, chronic obstructive pulmonary disease, and self-harm and interpersonal violence generally had low RMSE.
Figure 2: Out-of-sample model performance for CODEm models and age-standardised cause-specific mortality rate by Level 1 causes
Model performance was defined by the root-mean squared error of the ensemble model predictions of the log of the age-specific death rates for a cause, computed on the 15% of the data held out from the statistical model building. The figure shows the association between the root-mean squared error and the log of the CSMR, aggregated over 1980–2017. Each point represents one CODEm model for a specific age range and sex. Circles denote models run with all locations. Triangles denote models run on only data-rich locations. Colours denote the Level 1 cause categories. Open circles and triangles denote models that were run with restricted age groups of less than 30 years. CODEm=Cause of Death Ensemble model. CSMR=cause-specific mortality rate.
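The performance metric in this legend, RMSE on the log of the death rates, can be written as a short function. The sketch below is illustrative only: the function name and the example rates are assumptions, and it presumes all rates are strictly positive, because the text does not say how zero rates are handled.

```python
import math


def log_rate_rmse(predicted_rates, observed_rates):
    """RMSE between log predicted and log observed death rates (all rates > 0)."""
    squared = [
        (math.log(p) - math.log(o)) ** 2
        for p, o in zip(predicted_rates, observed_rates)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical rates: every prediction is exactly twice the observed value,
# so the error equals ln(2) whether the underlying rate is high or low.
example = log_rate_rmse([2e-5, 4e-3], [1e-5, 2e-3])
```

Because the error is taken on the log scale, it reflects relative rather than absolute deviation: a prediction off by a constant factor yields the same RMSE at any mortality level.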
Global, regional, and national age-sex-specific mortality for 282 causes of death in 195 countries and territories, 1980–2017: a systematic analysis for the Global Burden of Disease Study 2017. (2018). Lancet (London, England), 392(10159), 1736-1788.
Various functional forms and permutations of covariates used in the CODEm models
Dependent variables: Level for each cause of death
Control variables: Held-out 30% of the data for cross-validation; separate models run for countries with 4-star or greater VR systems