Automated Variant Filtering Pipeline for Developmental Disorders
An automated variant filtering pipeline was used to narrow down the number of putative diagnostic variants (figure 2). First, common (>1% minor allele frequency) and non-functional (not protein-altering) variants were filtered out. Second, potentially pathogenic variants in known disease genes were selected in by comparison against an in-house database of genes consistently implicated in specific developmental disorders, the Developmental Disorders Genotype-to-Phenotype database (DDG2P). This database includes more than 1000 genes that have been consistently implicated in specific developmental disorders and is updated regularly with newly implicated genes (table 1; appendix 1; appendix 2). Each gene in DDG2P is associated with a specific developmental phenotype or syndrome via a particular genetic mechanism (autosomal dominant, autosomal recessive, or X-linked) and mutation consequence on the gene product (loss of function, activating mutation, increased gene dosage, etc). The use of DDG2P enabled any rare variant in a known DD gene with a predictable effect on the gene product to be flagged on the basis of inheritance, genotype, and likely mutational consequence. Large, rare CNVs overlapping non-DDG2P genes were also flagged based on a series of size thresholds (>100 kb for losses and >250 kb for gains where the inheritance was either de novo or segregated with disease, and >500 kb for any genic CNV for which the inheritance was unclear).
Variant filtering logic for clinical reporting within the study
Genomic variants were filtered on the basis of six factors, of which the first five were automated and the final one was done manually: (1) frequency, prevalence of the variant in the general population (MAF ≤1%); (2) function, most severe predicted functional consequence, such as LOF, defined by specific sequence ontology terms (transcript ablation, splice donor variant, splice acceptor variant, stop-gained, frameshift variant, stop-lost, initiator codon variant, in-frame insertion, in-frame deletion, missense variant, transcript amplification, and coding sequence variant); (3) location, genomic location compared with DDG2P of published genes; (4) variant type, genotype (eg, heterozygous or homozygous) and loss or gain for small CNVs (which were only considered when they contained entire genes in which LOF or dominant negative mutations had been previously reported, and gains were only considered when they overlapped genes in which increased gene dosage mutations had been previously reported); (5) inheritance, aspects of the pipeline that are dependent on inheritance information derived from parental data are shaded; and (6) phenotype, patient phenotype was manually compared against published phenotypes for a particular gene. MAF=minor allele frequency. CNV=copy number variant. LOF=loss of function. DDG2P=Developmental Disorders Genotype-to-Phenotype database.
Changes to DDG2P over time
July, 2012
November, 2012
July, 2013
November, 2013
Total reportable genes*
819
875
1075
1128
Genes added
..
60
201
60
Genes removed†
..
4
1
7
In addition to genes being added or removed, annotations for existing genes can also change (eg, to include multiple modes or mechanisms). The November, 2013 version was used for the analysis presented here and includes 1128 reportable genes. DDG2P=Developmental Disorders Genotype-to-Phenotype database.
DDG2P also contains non-reportable categories when there is insufficient evidence associating a gene and developmental disorder (appendix 1).
The selection of variants for reporting is based on the strongest available evidence of gene function and no variants yet reported have been retracted because of changes in the DDG2P list.
Wright C.F., Fitzgerald T.W., Jones W.D., Clayton S., McRae J.F., van Kogelenberg M., King D.A., Ambridge K., Barrett D.M., Bayzetinova T., Bevan A.P., Bragin E., Chatzimichali E.A., Gribble S., Jones P., Krishnappa N., Mason L.E., Miller R., Morley K.I., Parthiban V., Prigmore E., Rajan D., Sifrim A., Swaminathan G.J., Tivey A.R., Middleton A., Parker M., Carter N.P., Barrett J.C., Hurles M.E., FitzPatrick D.R, & Firth H.V. (2015). Genetic diagnosis of developmental disorders in the DDD study: a scalable analysis of genome-wide research data. Lancet (London, England), 385(9975), 1305-1314.
Other organizations :
Wellcome Sanger Institute, University of Melbourne, King's College London, University of Oxford, University of Edinburgh, Cambridge University Hospitals NHS Foundation Trust, Addenbrooke's Hospital
Function (most severe predicted functional consequence, such as LOF, defined by specific sequence ontology terms)
Location (genomic location compared with DDG2P of published genes)
Variant Type (genotype, loss or gain for small CNVs)
Inheritance (aspects of the pipeline that are dependent on inheritance information derived from parental data)
dependent variables
Not explicitly mentioned
control variables
Positive control: The use of DDG2P, which includes more than 1000 genes that have been consistently implicated in specific developmental disorders and is updated regularly with newly implicated genes.
Negative control: The filtering out of common (>1% minor allele frequency) and non-functional (not protein-altering) variants.
Annotations
Based on most similar protocols
Etiam vel ipsum. Morbi facilisis vestibulum nisl. Praesent cursus laoreet felis. Integer adipiscing pretium orci. Nulla facilisi. Quisque posuere bibendum purus. Nulla quam mauris, cursus eget, convallis ac, molestie non, enim. Aliquam congue. Quisque sagittis nonummy sapien. Proin molestie sem vitae urna. Maecenas lorem.
As authors may omit details in methods from publication, our AI will look for missing critical information across the 5 most similar protocols.
About PubCompare
Our mission is to provide scientists with the largest repository of trustworthy protocols and intelligent analytical tools, thereby offering them extensive information to design robust protocols aimed at minimizing the risk of failures.
We believe that the most crucial aspect is to grant scientists access to a wide range of reliable sources and new useful tools that surpass human capabilities.
However, we trust in allowing scientists to determine how to construct their own protocols based on this information, as they are the experts in their field.
Ready to
get started?
Sign up for free.
Registration takes 20 seconds.
Available from any computer
No download required