In this study, we revised and expanded our earlier PheWAS phenotype categorization to a total of 1,645 phenotypes identified from International Classification of Diseases, Ninth Revision, Clinical Modification (ICD9) codes. (Our initial PheWAS phenotype categorization included 744 phenotypes (ref. 9).) The ICD9 coding system is divided into four components: diseases, signs and symptoms (“three digit” codes, 001–999); external causes of injury (“E” codes); procedures (“two digit” codes, 00.0–99.9); and supplemental classifications (“V” codes). The prior PheWAS code groupings included only the three-digit ICD9 codes for diseases, signs and symptoms (ref. 9). We revised and expanded the PheWAS phenotypes by (i) adding V codes (commonly used to record personal histories of given diseases) and E codes to the PheWAS code mapping, (ii) redesigning the code system to be hierarchical, such that one phenotype could be the parent of a subphenotype (e.g., cardiac arrhythmias is the parent of atrial fibrillation, atrial flutter and other arrhythmias), and (iii) adding more granular phenotypes to the coding system (e.g., “type 1 diabetes with ketoacidosis”). Building the hierarchy also required creating phenotypes that are not present in the ICD9 billing hierarchy, such as “inflammatory bowel disease” as the parent phenotype for “Crohn’s disease” and “ulcerative colitis.” In this process, we were guided by the hierarchical organization of the Clinical Classifications Software (CCS) produced by the Agency for Healthcare Research and Quality (ref. 42); the 2011 version of the CCS contains 727 phenotypes. The resulting PheWAS code group currently contains 1,645 phenotypes, 1,358 of which had at least 25 cases (a prevalence of 0.18% in our data set) in the eMERGE cohort, our threshold for these analyses.
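The hierarchical design and the 25-case threshold described above can be illustrated with a minimal Python sketch. It is not the PheWAS implementation: the phenotype labels are taken from the examples in the text, but the ICD9-to-phenotype pairs, the function names and the rule that a single matching code makes a patient a case are all assumptions made for illustration, and control-group definitions are not modeled.

```python
from collections import defaultdict

# Hypothetical fragment of a phenotype hierarchy: child -> parent
# (None marks a top-level phenotype). Labels follow the examples in the text.
PARENT = {
    "cardiac arrhythmias": None,
    "atrial fibrillation": "cardiac arrhythmias",
    "atrial flutter": "cardiac arrhythmias",
    "inflammatory bowel disease": None,
    "Crohn's disease": "inflammatory bowel disease",
    "ulcerative colitis": "inflammatory bowel disease",
}

# Illustrative ICD9 -> phenotype pairs; assumed for this sketch,
# not copied from the published PheWAS code map.
ICD9_TO_PHENOTYPE = {
    "427.31": "atrial fibrillation",
    "427.32": "atrial flutter",
    "555.9": "Crohn's disease",
    "556.9": "ulcerative colitis",
}


def with_ancestors(phenotype):
    """Return the phenotype followed by each of its ancestors, nearest first."""
    chain = []
    while phenotype is not None:
        chain.append(phenotype)
        phenotype = PARENT[phenotype]
    return chain


def count_cases(patient_codes):
    """Count distinct patients per phenotype, rolling children up into parents.

    Assumption: one matching ICD9 code is enough to call a patient a case.
    """
    cases = defaultdict(set)
    for patient_id, codes in patient_codes.items():
        for code in codes:
            leaf = ICD9_TO_PHENOTYPE.get(code)
            if leaf is None:
                continue  # code is not covered by this toy mapping
            for phenotype in with_ancestors(leaf):
                cases[phenotype].add(patient_id)
    return {phenotype: len(ids) for phenotype, ids in cases.items()}


def analyzable(case_counts, min_cases=25):
    """Return, in sorted order, the phenotypes that meet the case threshold."""
    return sorted(p for p, n in case_counts.items() if n >= min_cases)
```

Because each patient is counted once per phenotype, a parent such as cardiac arrhythmias can reach the threshold even when none of its subphenotypes does on its own.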
The current version of the PheWAS codes, with ICD9 mappings and control groups, is available from http://knowledgemap.mc.vanderbilt.edu/research/content/phewas.