To demonstrate the utility of standardizing disparate data sources into a CDM, we replicated a published observational study protocol and evaluated both the quality and the time-to-execution of a standardized approach. As an exemplar, we used the Mini-Sentinel analysis of the comparative effectiveness of rivaroxaban versus warfarin on various outcomes in patients with atrial fibrillation.30 We developed a standardized analytic routine that replicated the cohort definitions within the protocol and applied it across all 6 databases to compare the impact of the inclusion criteria on the proportion of patients qualifying for the study.
Specifically, we identified all new users of each target drug (warfarin and rivaroxaban) who satisfied the following 7 criteria of the original study: (1) had at least 183 days of nonexposure before the first target drug exposure; (2) had at least 1 atrial fibrillation or atrial flutter diagnosis code within the 183-day window prior to first exposure; (3) did not have any prior diagnosis or procedure codes indicative of long-term dialysis; (4) did not have any prior diagnosis or procedure codes indicative of kidney transplant; (5) did not have any prior diagnosis or procedure codes indicative of mitral stenosis or mechanical heart valve; (6) did not have any prior procedure codes indicative of joint replacement or arthroplasty surgery; and (7) did not have prior use of any anticoagulant (warfarin, rivaroxaban, dabigatran, or apixaban). For each target drug, we created 2 cohorts: new users of the drug (defined by satisfying criterion No. 1), and the subset of those new users who satisfied the remaining 6 criteria. For each cohort, we produced a standardized descriptive summary of the population, including demographics (gender and age distribution), comorbidities (prevalence of conditions in the time window prior to cohort entry), concomitant medications (prevalence of drug exposure in the time window prior to cohort entry), and service utilization (prevalence of procedures in the time window prior to cohort entry). We measured the execution time of the standardized analytic routine when applied to each target drug across all 6 databases. Analyses were conducted on a Microsoft Server 2008 machine (Microsoft Corporation, Redmond, Washington) with 2 AMD Opteron 6172 processors (Advanced Micro Devices, Inc, Sunnyvale, California; 2.10 GHz, 24 cores) and 256 GB of RAM. Each CDM was stored in a separate database within an instance of Microsoft SQL Server 2012 (Microsoft Corporation, Redmond, Washington).
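To make the sequential application of the 7 criteria concrete, the following is a minimal Python sketch of how an attrition count could be computed for one target drug. It is an illustration only and not the analytic routine used in the study, which ran against the CDM in SQL Server. The `Patient` structure, the exclusion labels, and the `attrition` function are hypothetical names introduced here. The sketch also makes two assumptions that the protocol text does not settle: nonexposure is measured from the start of a patient's observation period, and "prior" means strictly before the date of first exposure.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

WASHOUT_DAYS = 183  # lookback length stated in the protocol

# Hypothetical labels for exclusion criteria 3 through 7.
EXCLUSIONS = [
    "long_term_dialysis",
    "kidney_transplant",
    "mitral_stenosis_or_mechanical_valve",
    "joint_replacement",
    "prior_anticoagulant",
]


@dataclass
class Patient:
    """Hypothetical per-patient record; not a CDM table layout."""
    observation_start: date
    target_exposures: list[date]
    afib_diagnoses: list[date] = field(default_factory=list)
    prior_events: dict[str, list[date]] = field(default_factory=dict)


def index_date(p: Patient):
    """Date of first exposure to the target drug, or None if never exposed."""
    return min(p.target_exposures) if p.target_exposures else None


def is_new_user(p: Patient) -> bool:
    # Criterion 1 (assumption): at least 183 observed days before first exposure.
    idx = index_date(p)
    return idx is not None and (idx - p.observation_start).days >= WASHOUT_DAYS


def has_recent_afib(p: Patient) -> bool:
    # Criterion 2 (assumption): diagnosis in [index - 183 days, index).
    idx = index_date(p)
    window_start = idx - timedelta(days=WASHOUT_DAYS)
    return any(window_start <= d < idx for d in p.afib_diagnoses)


def lacks_prior(p: Patient, key: str) -> bool:
    # Criteria 3-7 (assumption): no qualifying event strictly before index.
    idx = index_date(p)
    return not any(d < idx for d in p.prior_events.get(key, []))


def attrition(patients):
    """Apply the criteria in order; return (label, remaining count) per step."""
    steps = [("1: new user", is_new_user), ("2: recent AF", has_recent_afib)]
    steps += [(f"no {k}", lambda p, k=k: lacks_prior(p, k)) for k in EXCLUSIONS]
    remaining, table = list(patients), []
    for label, keep in steps:
        remaining = [p for p in remaining if keep(p)]
        table.append((label, len(remaining)))
    return table


# Made-up example patients, for illustration only.
idx = date(2013, 1, 1)
example_patients = [
    Patient(date(2012, 1, 1), [idx], [date(2012, 11, 1)]),   # meets all criteria
    Patient(date(2012, 12, 1), [idx], [date(2012, 12, 15)]),  # too little lookback
    Patient(date(2012, 1, 1), [idx], [date(2012, 11, 1)],
            {"long_term_dialysis": [date(2012, 6, 1)]}),      # prior dialysis
    Patient(date(2012, 1, 1), [idx], [date(2012, 1, 15)]),    # AF outside window
]

for label, count in attrition(example_patients):
    print(f"{label}: {count}")
```

The first row of the resulting table corresponds to the new-user cohort (criterion No. 1) and the last row to the fully restricted cohort; dividing each count by the first gives the proportion of new users retained at each step, which is the quantity compared across databases.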
Appendix 1 contains the standard concepts and corresponding source codes that were used to define each of the core concepts required within the prespecified protocol.