A hypergeometric test was performed to assess whether genes relevant to nsCPO pathobiology (n = 25) were more likely than the remaining protein-coding genes to reach a nominal p-value < 0.05 in GCA. The list of genes relevant to nsCPO pathobiology was obtained by intersecting multiple lists derived from the Online Mendelian Inheritance in Man (OMIM), Human Phenotype Ontology, Gene Ontology, GeneCards, and MalaCards databases. OMIM and GeneCards were searched with the Boolean string “cleft palate” OR “bifid uvula” OR “cleft uvula”; specific reference IDs were used for the other databases: Human Phenotype Ontology HP:0000175, Gene Ontology GO:0060021 and GO:1905748, and MalaCards CLF027.
The list of genes nominally enriched in ultra-rare variants in the nsCPO cohort was obtained by GCA, which evaluates the per-gene rare-variant burden in cases versus controls. In GCA, per-gene variant counts were computed across the whole cohort using an in-house Perl script. For each gene, every subject was assigned a binary variable indicating the absence or presence of at least one variant, regardless of how many variants the subject carried. The numbers of cases and controls with at least one variant were then compared using Fisher’s exact test to assess enrichment for ultra-rare variants in either group. Genes with a p-value below the nominal significance level of 0.05 were considered nominally enriched in ultra-rare variants in cases compared to controls.
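The two tests described above can be illustrated with a minimal sketch. It is written in standard-library Python for self-containment and is not the authors' Perl or R code. The function names, the toy counts, and the choice of a two-sided Fisher test are assumptions made for illustration only; the text does not state the sidedness used. For reference, the upper-tail hypergeometric probability computed here corresponds to R's `phyper(k - 1, n, M - n, N, lower.tail = FALSE)`.

```python
from math import comb


def carrier_table(case_counts, control_counts):
    """Collapse per-subject variant counts for one gene into a 2x2 table.

    Each subject becomes a binary carrier flag (count > 0), so a subject
    with three variants contributes the same as a subject with one.
    Returns (case carriers, case non-carriers,
             control carriers, control non-carriers).
    """
    case_carriers = sum(1 for n in case_counts if n > 0)
    control_carriers = sum(1 for n in control_counts if n > 0)
    return (case_carriers, len(case_counts) - case_carriers,
            control_carriers, len(control_counts) - control_carriers)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test on the table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one (two-sidedness is an assumption).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so ties with the observed table are included.
    p = sum(q for q in map(prob, range(lo, hi + 1)) if q <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def hypergeom_upper_tail(k, M, n, N):
    """P(X >= k) for X ~ Hypergeometric.

    M: all tested protein-coding genes, n: genes in the curated list,
    N: genes with nominal p < 0.05, k: curated genes among those N.
    """
    upper = min(n, N)
    return sum(comb(n, i) * comb(M - n, N - i)
               for i in range(k, upper + 1)) / comb(M, N)


# Hypothetical toy data for a single gene (not from the study).
toy_cases = [1, 0, 2, 1, 0, 0]
toy_controls = [0, 0, 1, 0, 0, 0, 0, 0]
toy_gene_p = fisher_exact_two_sided(*carrier_table(toy_cases, toy_controls))

# Hypothetical toy enrichment: 20 genes tested, 5 curated, 4 nominally
# significant, 2 of which are curated.
toy_enrichment_p = hypergeom_upper_tail(k=2, M=20, n=5, N=4)
```

In a real analysis, `fisher_exact_two_sided` would be applied once per gene, and the resulting set of genes with p < 0.05 would supply `N` and `k` for the hypergeometric test against the 25 curated genes.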
Statistical tests were performed in R v3.5.1. The clinical significance of ultra-rare variants was assessed by manual curation according to the American College of Medical Genetics and Genomics (ACMG) standards and guidelines [23], restricted to genes that had orofacial clefts in their OMIM clinical synopsis. Moreover, because consanguinity was reported in families of Iranian nsCPO cases, we investigated whether the number of ultra-rare homozygous variants differed significantly between Iranian and Italian cases.
Segregation analysis in informative families was performed, when possible, by variant-site targeted PCR and Sanger sequencing, only for variants identified in genes with nominally significant p-values from GCA.