We used ANNOVAR [29] to annotate the VCF files from the 200,643 WES samples. The Genome Aggregation Database (gnomAD) [30] was used to retrieve variant frequencies in the general population. We focused on rare pathogenic variants (PV) for hereditary CRC (Lynch syndrome, polyposis) and adopted the variant filtering approach used in a recent study that aimed to select rare PV [31]. The following inclusion criteria were applied: (1) only variants in protein-coding regions of APC, MUTYH, MLH1, MSH2, MSH6, and PMS2 were included, since PV in other genes associated with hereditary CRC are too rare or even absent in the study population; (2) allele frequency (AF) < 0.005 in at least one ethnic subpopulation of gnomAD; (3) not annotated as “synonymous,” “non-frameshift deletion,” or “non-frameshift insertion”; (4) annotated as “pathogenic” or “likely pathogenic” in ClinVar [32]. We did not include MUTYH in the pooled analysis, since no biallelic (i.e., high-penetrance) case was identified in the cohort; however, we included the heterozygous (monoallelic) carriers in the single-gene analysis to compare the effect size with that of the other genes.
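The four inclusion criteria can be sketched as a simple filter over annotated variant records. This is a minimal illustration, not the pipeline used in the study: the record layout (the keys `gene`, `coding`, `effect`, `subpop_af`, and `clinvar`) and the function name `passes_filters` are hypothetical and do not correspond to actual ANNOVAR output columns. The sketch also assumes that a variant with no gnomAD frequency data fails criterion (2), a case the text does not address.

```python
# Hypothetical sketch of inclusion criteria (1)-(4); field names are assumptions.
GENES = {"APC", "MUTYH", "MLH1", "MSH2", "MSH6", "PMS2"}
# MUTYH is kept for the single-gene analysis but left out of the pooled analysis.
POOLED_GENES = GENES - {"MUTYH"}
EXCLUDED_EFFECTS = {
    "synonymous",
    "non-frameshift deletion",
    "non-frameshift insertion",
}
PATHOGENIC_LABELS = {"pathogenic", "likely pathogenic"}
AF_THRESHOLD = 0.005


def passes_filters(variant):
    """Return True if a variant record meets all four inclusion criteria."""
    # (1) One of the six genes, in a protein-coding region.
    if variant["gene"] not in GENES or not variant["coding"]:
        return False
    # (2) AF < 0.005 in at least one gnomAD subpopulation.
    # Assumption: variants without any frequency data are excluded here.
    frequencies = [af for af in variant["subpop_af"].values() if af is not None]
    if not any(af < AF_THRESHOLD for af in frequencies):
        return False
    # (3) Not synonymous and not a non-frameshift insertion or deletion.
    if variant["effect"].lower() in EXCLUDED_EFFECTS:
        return False
    # (4) ClinVar label is pathogenic or likely pathogenic.
    label = variant["clinvar"].lower().replace("_", " ")
    return label in PATHOGENIC_LABELS


# Usage with made-up records (not real data).
example = [
    {"gene": "MLH1", "coding": True, "effect": "stopgain",
     "subpop_af": {"afr": 0.0001, "nfe": 0.01}, "clinvar": "Pathogenic"},
    {"gene": "MUTYH", "coding": True, "effect": "synonymous",
     "subpop_af": {"afr": 0.0002}, "clinvar": "Likely_pathogenic"},
]
kept = [v for v in example if passes_filters(v)]
pooled = [v for v in kept if v["gene"] in POOLED_GENES]
```

Keeping the gene sets and thresholds as module-level constants makes it easy to rerun the filter with a different frequency cutoff or gene panel.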