We analyzed the risk of newly incident conditions, defined as documentation of any of the above-mentioned PASC categories in the follow-up period that was not present in the baseline period. Specifically, we compared adjusted hazard ratios (aHR) and adjusted excess burdens of these events occurring 31–180 days after the index date between the SARS-CoV-2 positive and negative groups. For each potential PASC condition, the aHR was estimated with a Cox proportional hazards model, and excess burden was defined as the difference in cumulative incidence per 1,000 patients between the positive and negative groups over the follow-up period. For example, an excess burden of 40 for symptom X indicates that 40 more people per 1,000 had symptom X after COVID-19 infection than among people not infected with COVID-19. We estimated cumulative incidence with the Aalen-Johansen estimator [11], treating death as a competing risk for the target outcomes. We adjusted for a wide range of baseline covariates by stabilized inverse propensity score reweighting [12]. The standardized mean difference (SMD) was used to quantify covariate balance after reweighting. We considered a covariate balanced if its SMD was < 0.1 and required all covariates to be balanced after reweighting. Both the aHR and excess burden calculations used the same covariates for adjustment.
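The weighting and balance check described above can be illustrated with a minimal sketch. It is not the study's code: the function names, the toy cohort, and the single binary covariate `age_65_plus` are hypothetical. The propensity scores are set by hand rather than fitted, and the SMD uses a weighted population variance pooled across the two groups, which is one common convention and an assumption here.

```python
import math

def stabilized_weights(treated, propensity):
    """Stabilized inverse propensity weights.

    treated:    0/1 flags (1 = SARS-CoV-2 positive).
    propensity: estimated P(positive | covariates) for each patient.
    """
    p_treated = sum(treated) / len(treated)  # marginal probability of being positive
    return [
        p_treated / ps if t == 1 else (1 - p_treated) / (1 - ps)
        for t, ps in zip(treated, propensity)
    ]

def weighted_mean_var(values, weights):
    """Weighted mean and weighted (population) variance."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var

def weighted_smd(covariate, treated, weights):
    """Absolute standardized mean difference of one covariate between groups."""
    stats = {}
    for flag in (1, 0):
        vals = [x for x, t in zip(covariate, treated) if t == flag]
        wts = [w for w, t in zip(weights, treated) if t == flag]
        stats[flag] = weighted_mean_var(vals, wts)
    (m1, v1), (m0, v0) = stats[1], stats[0]
    pooled_sd = math.sqrt((v1 + v0) / 2)
    return abs(m1 - m0) / pooled_sd if pooled_sd > 0 else 0.0

# Hypothetical toy cohort: 4 positive and 4 negative patients.
treated     = [1, 1, 1, 1, 0, 0, 0, 0]
age_65_plus = [1, 1, 1, 0, 1, 0, 0, 0]

# Assumed propensity scores: the share of positives within each covariate stratum.
propensity = [0.75 if x == 1 else 0.25 for x in age_65_plus]

weights = stabilized_weights(treated, propensity)
smd_before = weighted_smd(age_65_plus, treated, [1.0] * len(treated))
smd_after = weighted_smd(age_65_plus, treated, weights)

# Balance criterion from the text: SMD < 0.1 after reweighting.
is_balanced = smd_after < 0.1
```

In a real analysis the propensity scores would come from a fitted model over all baseline covariates, and the balance check would be repeated for every covariate, since the text requires all of them to satisfy SMD < 0.1.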
Baseline covariates included age, gender, race, and ethnicity. The national-level area deprivation index (ADI) was used to assess patients' socioeconomic disadvantage [13]. We imputed missing ADI values with the median ADI of the corresponding site. Healthcare utilization was measured as the number of inpatient, outpatient, and emergency encounters (0, 1–2, 3–4, or 5 or more visits for each encounter type). Body mass index (BMI) was categorized according to WHO guidelines. We adopted a tailored list of the Elixhauser comorbidities and related drug categories (e.g., corticosteroid and immunosuppressant prescriptions) to capture comorbidities [14]. Patients were defined as having a comorbidity if they had at least two corresponding diagnoses documented during the baseline period.
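Three of the covariate rules above are simple enough to sketch in code: per-site median imputation of ADI, binning of encounter counts, and the two-diagnosis comorbidity rule. The sketch below is illustrative only. The record layout, field names, and function names are hypothetical, and it assumes that every documented diagnosis record counts toward the threshold of two, which the text does not specify.

```python
from collections import Counter
from statistics import median

def impute_adi_by_site(records):
    """Fill missing ADI values (None) with the median observed ADI of the same site.

    records: list of dicts with keys 'site' and 'adi' (hypothetical layout).
    A site with no observed ADI at all keeps its missing values as None.
    """
    observed = {}
    for r in records:
        if r["adi"] is not None:
            observed.setdefault(r["site"], []).append(r["adi"])
    site_median = {site: median(vals) for site, vals in observed.items()}
    return [
        {**r, "adi": site_median.get(r["site"]) if r["adi"] is None else r["adi"]}
        for r in records
    ]

def utilization_category(n_visits):
    """Bin an encounter count into the categories 0, 1-2, 3-4, 5 or more."""
    if n_visits < 0:
        raise ValueError("visit count cannot be negative")
    if n_visits == 0:
        return "0"
    if n_visits <= 2:
        return "1-2"
    if n_visits <= 4:
        return "3-4"
    return "5+"

def comorbidities(baseline_diagnoses, min_count=2):
    """Return comorbidity categories with at least `min_count` baseline diagnoses.

    baseline_diagnoses: one category label per documented diagnosis record.
    """
    counts = Counter(baseline_diagnoses)
    return {name for name, n in counts.items() if n >= min_count}

# Hypothetical example data.
records = [
    {"site": "A", "adi": 10},
    {"site": "A", "adi": 30},
    {"site": "A", "adi": None},
]
imputed = impute_adi_by_site(records)
flags = comorbidities(["diabetes", "asthma", "diabetes"])
```

The binning would be applied separately to the inpatient, outpatient, and emergency encounter counts, giving one categorical covariate per encounter type.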
We reported a PASC condition if it met all of the following criteria: an adjusted hazard ratio > 1; a P-value < 3.6 × 10^-4 (Bonferroni-corrected for multiple testing); and at least 100 patients with the condition. We reported adjusted hazard ratios with 95% confidence intervals. We used Python 3.9 with the lifelines-0.26 package [66] for survival analysis and scikit-learn-0.23 [18] for machine learning models. Code is available at