We have previously found that, for some algorithms that establish a diagnosis from EMR data, the positive predictive value improves when multiple instances of disease-associated ICD9 codes are required [44]. For example, to be considered a case for tuberculosis, a patient must have at least two ICD9 codes in the ranges 010–018 (tuberculosis infections of different sites), 137 (late effects of tuberculosis) or V12.01 (personal history of tuberculosis). Accordingly, for the present study, a person was counted as a “case” for a given phenotype only if relevant ICD9 codes were recorded on at least two distinct days. Controls are patients without any ICD9 codes in the corresponding control range; thus, patients with a single ICD9 case code are excluded from the analysis as neither cases nor controls. Each SNP-phenotype association test was run independently with PLINK [43], using logistic regression adjusted for age, gender, site (e.g., Vanderbilt, Marshfield Clinic), and the first three principal components as calculated by EIGENSTRAT from the ancestry informative markers described above [41]. Analysis was performed assuming an additive genetic model. These data were aggregated and analyzed using Perl scripts and the R statistical package.
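The case, control and exclusion rule described above can be illustrated with a short sketch. This is not the study's code (the authors report using Perl scripts and R); it assumes ICD9 codes are stored as zero-padded strings such as `011.9`, that each record carries a service date, and that the control range equals the case range unless stated otherwise. The names `classify`, `TB_CASE_PREFIXES` and the record layout are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical prefix list for the tuberculosis example (illustrative only,
# not the study's actual code tables).
TB_CASE_PREFIXES = ("010", "011", "012", "013", "014", "015",
                    "016", "017", "018", "137", "V12.01")


def classify(records, case_prefixes, control_prefixes=None, min_days=2):
    """Label each patient 'case', 'control', or 'excluded'.

    records: iterable of (patient_id, icd9_code, service_date) tuples.
    control_prefixes: codes that disqualify a control; assumed here to
    default to the case prefixes.
    """
    case_prefixes = tuple(case_prefixes)
    control_prefixes = tuple(control_prefixes or case_prefixes)

    case_days = defaultdict(set)  # patient -> distinct days with a case code
    control_hit = set()           # patients with any code in the control range
    patients = set()

    for pid, code, day in records:
        patients.add(pid)
        if code.startswith(case_prefixes):
            case_days[pid].add(day)
        if code.startswith(control_prefixes):
            control_hit.add(pid)

    labels = {}
    for pid in patients:
        if len(case_days[pid]) >= min_days:
            labels[pid] = "case"       # codes on two or more distinct days
        elif pid not in control_hit:
            labels[pid] = "control"    # no code in the control range at all
        else:
            labels[pid] = "excluded"   # some evidence, but below threshold
    return labels


records = [
    ("A", "011.9", date(2009, 1, 5)),
    ("A", "137.0", date(2009, 3, 2)),
    ("B", "011.9", date(2009, 1, 5)),
    ("B", "011.9", date(2009, 1, 5)),   # same day, so only one distinct day
    ("C", "250.00", date(2009, 2, 1)),
]
labels = classify(records, TB_CASE_PREFIXES)
```

In this toy input, patient A has case codes on two distinct days, patient B has two codes on a single day, and patient C has no tuberculosis-related code, so they fall into the case, excluded and control groups respectively.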
Genome-Wide Association Study of European Americans
Corresponding Organization : Vanderbilt University
Other organizations : Pennsylvania State University, Kaiser Permanente Washington Health Research Institute, Marshfield Clinic, Northwestern University, University of Washington, Mayo Clinic, The University of Texas Health Science Center at Houston, National Human Genome Research Institute, Center for Human Genetics, Essentia Health, University of Washington Medical Center
Variable analysis
- Genetic variants (SNPs)
- Clinical phenotypes (1,358 possible phenotypes, each observed in more than 25 patients)
- Gender
- Site (e.g., Vanderbilt, Marshfield Clinic)
- First three principal components calculated by EIGENSTRAT using ancestry informative markers
- Patients with at least two ICD9 codes in the ranges 010–018 (tuberculosis infections of different sites), 137 (late effects of tuberculosis) or V12.01 (personal history of tuberculosis) are considered 'cases' for tuberculosis
- Patients without any ICD9 codes in the corresponding control range are considered as 'controls'
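
The variables listed above enter the logistic regression as one row of predictors per patient. The sketch below shows what such a row might look like under an additive genetic model, in which the SNP is coded as the number of copies (0, 1 or 2) of one allele. In the study this coding and the model fit were handled by PLINK. The function names, the two-letter genotype format, the choice of reference site and the inclusion of age (taken from the protocol text) are assumptions made for illustration.

```python
def additive_dosage(genotype, effect_allele):
    """Count copies of effect_allele in a two-letter genotype such as 'AG'.

    Returns None for a missing call (anything other than two letters).
    """
    if len(genotype) != 2 or not genotype.isalpha():
        return None
    return genotype.count(effect_allele)


def design_row(dosage, age, is_male, site, sites, pcs):
    """Build one predictor row: intercept, dosage, age, gender, site, PCs."""
    # The first entry of `sites` is treated as the reference level and
    # therefore gets no indicator column (an assumption of this sketch).
    site_dummies = [1.0 if site == s else 0.0 for s in sites[1:]]
    return [1.0, float(dosage), float(age),
            1.0 if is_male else 0.0, *site_dummies, *pcs[:3]]


sites = ["Vanderbilt", "Marshfield Clinic"]
row = design_row(additive_dosage("AG", "G"), 54, True,
                 "Marshfield Clinic", sites, [0.01, -0.02, 0.003])
```

Under this coding, the fitted coefficient on the dosage column is the change in log-odds of being a case per additional copy of the counted allele, which is what an additive model assumes.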