Rare CNV Burden Analysis

CNV burden was compared between cases and controls for rare CNVs (frequency <1%), using CNV length excluding gaps and regions annotated as segmental duplications (hg18). The distribution of these CNVs is shown in Supplementary Figure 6. Burden was defined using only the largest CNV, to account for the large number of bases encompassed in small CNVs and the significant difference in array resolution between cases and controls. To assess overall burden, statistical comparisons used the Peto & Peto modification of the Gehan-Wilcoxon test, chosen because the hazards were non-proportional. For significance at specific thresholds we used Fisher's exact test.

CNV enrichment was assessed for all RefSeq genes (NCBI36). All isoforms of each gene were combined into a single entry representing all possible coding bases. Rare CNVs from cases and all CNVs from controls were then counted only where the CNV intersected an exon. The resulting counts were compared using the one-tailed Fisher's exact test. Likelihood ratios were calculated as per standard formulae, and confidence bounds were estimated using the binomial confidence interval for case and control counts, calculated by the Clopper–Pearson exact tail area method as described in Rosenfeld et al.⁵⁹

Additionally, we calculated an empirical p-value for genes affected by rare CNVs. To do so, we first excluded CNVs residing in regions with elevated mutation rates or unreliable CNV detection. The excluded CNVs were subtelomeric CNVs initiating in the first 1.5 Mbp of each chromosome, CNVs with over 75% of their bases intersecting hotspots (145.1 Mbp across 58 sites) and segmental duplications (130.4 Mbp across 7,264 sites), and CNVs initiating or terminating in a centromere gap region.
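
As a rough illustration of the exclusion step, the following Python sketch flags a CNV on a single chromosome under one reading of the criteria above. All function and variable names are hypothetical, intervals are assumed to be half-open base-pair coordinates, and treating hotspots and segmental duplications as one merged mask for the 75% criterion is an assumption; the text does not state whether the two were evaluated jointly or separately.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in base pairs (assumption)

SUBTELOMERE_BP = 1_500_000   # first 1.5 Mbp of each chromosome (from the text)
MAX_MASKED_FRACTION = 0.75   # "over 75% of bases" (from the text)


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so each base is counted once."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlap_bp(start: int, end: int, merged: List[Interval]) -> int:
    """Bases of [start, end) covered by an already-merged interval list."""
    return sum(max(0, min(end, e) - max(start, s)) for s, e in merged)


def is_excluded(
    start: int,
    end: int,
    masked: List[Interval],           # hotspots + segmental duplications (assumed pooled)
    centromere_gaps: List[Interval],  # centromere gap regions on this chromosome
) -> bool:
    """Return True if the CNV meets any of the three exclusion criteria."""
    if end <= start:
        raise ValueError("CNV must have positive length")
    # 1. Subtelomeric: CNV initiates in the first 1.5 Mbp.
    if start < SUBTELOMERE_BP:
        return True
    # 2. CNV initiates or terminates inside a centromere gap.
    if any(gs <= pos < ge for gs, ge in centromere_gaps for pos in (start, end - 1)):
        return True
    # 3. More than 75% of the CNV's bases fall in masked regions.
    covered = overlap_bp(start, end, merge_intervals(masked))
    return covered / (end - start) > MAX_MASKED_FRACTION
```

Merging the mask first avoids double-counting bases where a hotspot and a segmental duplication overlap.
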
All CNVs under 10 Mbp were then randomly shuffled under these constraints for cases and controls, with chromosome selection weighted by the number of unfiltered bases, and Fisher's exact tests were calculated for deletions and duplications of each gene. This procedure was repeated 20,000 times. The empirical p-value was defined as the number of simulations more significant than the observed result plus one, divided by the number of simulations plus one.

CNV burden for regions was also assessed using a windowed analysis of rare case CNVs over 250 kbp. Window starts and ends were defined from all unique breakpoints on the Signature array. Breakpoint pairs less than 50 kbp apart were then filtered, as these represent the uncertainty in the edges of Signature calls. Counts for p-values are based on 40% coverage of each window by CNVs from cases (over 250 kbp) or controls (all CNVs). Significance was calculated using the one-tailed Fisher's exact test, and Supplementary Figure 2 shows the negative logarithm of the p-value.

In many cases the critical region may represent multiple subregions that individually reach significance. Here, we report the larger region, where smaller subregions are indicated by a number of additional CNVs over the background, preventing refinement to a single candidate gene. Because of the high prior probability of pathogenicity for large CNVs, the lack of independence between genes disrupted by CNVs, and the high odds ratio for most pathogenic loci, we have chosen to report nominal significance in all cases in addition to the Benjamini-Hochberg q-value, which represents an overestimate of the false discovery rate in our analyses⁶⁰. Please see the Supplementary Note for details on our interpretation of q-values in this study.
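
The three statistics named above (the one-tailed Fisher's exact test, the empirical p-value, and the Benjamini-Hochberg q-value) can be sketched in plain Python as follows. This is a minimal illustration using only the standard library, not the authors' code: the function names are hypothetical, the shuffling of CNVs itself is not shown, and reading "more significant than observed" as a strictly smaller simulated p-value is an assumption.

```python
from math import comb
from typing import List


def fisher_one_tailed(case_hits: int, case_total: int,
                      control_hits: int, control_total: int) -> float:
    """One-tailed Fisher's exact test for enrichment in cases.

    Returns P(X >= case_hits), where X is hypergeometric: the number of
    affected individuals that fall among cases when the row and column
    totals of the 2x2 table are held fixed.
    """
    hits = case_hits + control_hits
    total = case_total + control_total
    upper = min(hits, case_total)
    tail = sum(
        comb(hits, k) * comb(total - hits, case_total - k)
        for k in range(case_hits, upper + 1)
    )
    return tail / comb(total, case_total)


def empirical_p(observed_p: float, simulated_ps: List[float]) -> float:
    """(simulations more significant than observed + 1) / (simulations + 1)."""
    # Assumption: "more significant" means a strictly smaller p-value.
    more_significant = sum(1 for p in simulated_ps if p < observed_p)
    return (more_significant + 1) / (len(simulated_ps) + 1)


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg q-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical counts: 3 of 3 cases and 0 of 3 controls carry a CNV in a gene.
p_obs = fisher_one_tailed(3, 3, 0, 3)
# Hypothetical p-values from three shuffled data sets.
p_emp = empirical_p(p_obs, [0.5, 0.01, 1.0])
```

With 20,000 simulations, the smallest empirical p-value this definition can return is 1/20,001, which occurs when no simulation is more significant than the observed result.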

Coe B.P., Witherspoon K., Rosenfeld J.A., van Bon B.W., Vulto-van Silfhout A.T., Bosco P., Friend K.L., Baker C., Buono S., Vissers L.E., Schuurs-Hoeijmakers J.H., Hoischen A., Pfundt R., Krumm N., Carvill G.L., Li D., Amaral D., Brown N., Lockhart P.J., Scheffer I.E., Alberti A., Shaw M., Pettinato R., Tervo R., de Leeuw N., Reijnders M.R., Torchia B.S., Peeters H., O'Roak B.J., Fichera M., Hehir-Kwa J.Y., Shendure J., Mefford H.C., Haan E., Gécz J., de Vries B.B., Romano C., & Eichler E.E. (2014). Refining analyses of copy number variation identifies specific genes associated with developmental delay. Nature Genetics, 46(10), 1063-1071.

Publication 2014

Other organizations : University of Washington, PerkinElmer (United States), South Australia Pathology, Radboud University Nijmegen, Radboud University Medical Center, Oasi Maria SS, Istituti di Ricovero e Cura a Carattere Scientifico, Autism Research Institute, University of California, Davis, Barwon Health, Royal Children's Hospital, Murdoch Children's Research Institute, Florey Institute of Neuroscience and Mental Health, Austin Health, Mayo Clinic, KU Leuven, University of Adelaide, Howard Hughes Medical Institute
