Following the proof-of-principle study, case children with one of nine other birth defects and their parents were selected for ES at NISC using the optimized methods (i.e., dry-brush-derived gDNA and a low-input library preparation protocol). The additional case groups were anterior segment dysgenesis eye defects, primary congenital glaucoma, transverse limb reduction defects, split hand/foot malformation, cloacal exstrophy, bladder exstrophy, anophthalmos or microphthalmos, sacral agenesis, and biliary atresia. UW-CMG sequenced exomes of selected CHD trios (tricuspid atresia, Ebstein anomaly, hypoplastic left heart syndrome, and heterotaxy with and without CHDs) using dry-brush-derived gDNA and a low-input library preparation protocol optimized in their laboratory (ThruPLEX DNA-seq Kit, Rubicon Genomics, Ann Arbor, MI).
As in the proof-of-principle study, BAM files from each case group sequenced at NISC were transferred to UW-CMG so that each case group was processed separately using the same pipeline (details in the Appendix). UW-CMG used peddy (Pedersen & Quinlan, 2017) to check sex, ancestry (using principal components analysis), and pedigrees/relationships, and annotated the variant call format files with the ENSEMBL Variant Effect Predictor (v89; McLaren et al., 2016). A summary variant report was prepared for each case group. This report listed genes identified by variant filtration in GEMINI (Paila et al., 2013) under each mode of inheritance except autosomal dominant (homozygous recessive, compound heterozygous, de novo, X-linked recessive, and X-linked de novo), both across multiple families and for each family. In addition, UW-CMG provided a report for each case group describing copy number variants (CNVs) identified using CoNIFER (Krumm et al., 2012). Upon project completion, these data will be shared broadly in public repositories; for example, aggregate variant and broad phenotype data will be shared through dbGaP and Geno2MP (Chong et al., 2015); likely pathogenic and pathogenic variants through ClinVar (Landrum et al., 2014); and candidate genes with the MatchMaker Exchange (Philippakis et al., 2015) via MyGene2 (Chong et al., 2016) and the CMG website.
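To make the inheritance-mode filtering concrete, the following is a minimal sketch of how trio genotypes could be sorted into de novo, homozygous recessive, and compound heterozygous candidates. It is not GEMINI's implementation, which also applies quality and frequency filters; the function names, the genotype coding (0, 1, or 2 alternate alleles), and the gene labels are all hypothetical, and X-linked modes are omitted.

```python
from typing import Iterable, List, Optional, Tuple

# Assumed genotype coding: number of alternate alleles
# 0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate.


def classify_trio(child: int, mother: int, father: int) -> Optional[str]:
    """Label one autosomal variant in one trio, or return None if no mode fits."""
    if child == 1 and mother == 0 and father == 0:
        # Child carries an allele that neither parent carries.
        return "de_novo"
    if child == 2 and mother == 1 and father == 1:
        # Child inherited one copy of the same allele from each carrier parent.
        return "homozygous_recessive"
    return None


def compound_het_genes(
    variants: Iterable[Tuple[str, int, int, int]]
) -> List[str]:
    """Flag genes with a compound heterozygous pattern in the child.

    variants: (gene, child, mother, father) tuples for one trio.
    A gene is flagged when the child is heterozygous for at least one
    variant seen only in the mother and at least one seen only in the father.
    """
    from_mother, from_father = set(), set()
    for gene, child, mother, father in variants:
        if child != 1:
            continue
        if mother == 1 and father == 0:
            from_mother.add(gene)
        elif mother == 0 and father == 1:
            from_father.add(gene)
    return sorted(from_mother & from_father)
```

A per-family gene list could then be built by applying these checks to every variant that passes filtering, and genes recurring across families would be those appearing in more than one family's list.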
When enough specimens are available per defect, rare variant association testing, such as burden and kernel-based testing, will be conducted within ancestry groups. Rare variants will be validated by Sanger sequencing and potentially included in functional studies. Additionally, the rich environmental exposure data collected from NBDPS participants can be mined to assess exposures that might modify genetic effects, although small numbers limit the robustness of such an assessment for some defects. We plan to publish all results, including negative findings: for some of these phenotypes, these might be the only exome-sequenced cohorts currently large enough to support such analyses, which makes the results important to include in the peer-reviewed literature.
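As an illustration of the burden-testing idea, the sketch below collapses rare variants in one gene into a per-sample carrier indicator and compares carrier counts between cases and a comparison group with a one-sided Fisher's exact test. The text does not specify the comparison group or the exact test, so both are assumptions here, as are all function names; kernel-based tests are a separate method and are not shown.

```python
from math import comb
from typing import List, Sequence


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact p-value for enrichment in cases.

    2x2 table: a = case carriers, b = case non-carriers,
               c = comparison carriers, d = comparison non-carriers.
    Returns P(X >= a) under the hypergeometric null.
    """
    n_cases = a + b
    n_carriers = a + c
    n_total = a + b + c + d
    denom = comb(n_total, n_carriers)
    upper = min(n_cases, n_carriers)
    tail = sum(
        comb(n_cases, k) * comb(n_total - n_cases, n_carriers - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


def carrier_count(genotypes: Sequence[Sequence[int]]) -> int:
    """Count samples carrying at least one rare alternate allele in the gene.

    genotypes: one inner sequence per sample, holding alternate-allele
    counts (0, 1, or 2) for each rare variant in the gene.
    """
    return sum(1 for sample in genotypes if any(g > 0 for g in sample))


def burden_test(
    case_genotypes: List[Sequence[int]],
    comparison_genotypes: List[Sequence[int]],
) -> float:
    """Collapse rare variants per sample, then test carrier enrichment in cases."""
    a = carrier_count(case_genotypes)
    c = carrier_count(comparison_genotypes)
    b = len(case_genotypes) - a
    d = len(comparison_genotypes) - c
    return fisher_one_sided(a, b, c, d)
```

Running such a test separately within each ancestry group, as the plan describes, would mean calling `burden_test` once per group on that group's samples only.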