Raw variants called by Strelka2 in the WGS data also served as the basis for the mutational signature analysis. Since matched normal samples were not available, we applied a series of filters to approximate a somatic call set. We filtered out variants with a population allele frequency (AF_popmax) higher than 1%, variants called in more than one cell line, variants with a variant allele frequency (VAF) lower than 0.1, and variants in highly variable genes (MUC3A, MUC5AC, OR52E5, OR52L1, SMPD1, PRAMEF and LILR). We also filtered out variants in dbSNP, except for those present in COSMIC and ICGC. We used this call set, enriched in somatic variants, with the mutSignatures R package (ref. 68) to estimate the contribution of each of the thirty COSMIC mutational signatures to the mutational profile of each cell line.
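The filtering cascade described above can be illustrated with a minimal Python sketch. It is not the code used in the study. The `Variant` record, its field names, and the function `approximate_somatic` are hypothetical. The sketch also makes several assumptions that the text does not state: PRAMEF and LILR are treated as gene-family prefixes, a missing AF_popmax counts as passing the population filter, recurrence across cell lines is counted on the raw calls before any other filter, and a dbSNP variant is rescued if it is present in either COSMIC or ICGC.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional

# Genes named in the text; PRAMEF and LILR are assumed to be family prefixes.
HYPERVARIABLE_GENES = {"MUC3A", "MUC5AC", "OR52E5", "OR52L1", "SMPD1"}
HYPERVARIABLE_PREFIXES = ("PRAMEF", "LILR")

MAX_POP_AF = 0.01  # AF_popmax above 1% is treated as germline
MIN_VAF = 0.1      # calls below this VAF are discarded


@dataclass(frozen=True)
class Variant:
    """Hypothetical annotated variant record (field names are assumptions)."""
    cell_line: str
    chrom: str
    pos: int
    ref: str
    alt: str
    gene: str
    vaf: float
    af_popmax: Optional[float] = None  # None: absent from population database
    in_dbsnp: bool = False
    in_cosmic: bool = False
    in_icgc: bool = False

    @property
    def key(self):
        # Identity of the variant, independent of the cell line it was called in.
        return (self.chrom, self.pos, self.ref, self.alt)


def approximate_somatic(variants: List[Variant]) -> List[Variant]:
    """Apply the filters from the text to approximate a somatic call set."""
    # Count distinct cell lines per variant on the raw calls (assumption).
    lines_per_variant = defaultdict(set)
    for v in variants:
        lines_per_variant[v.key].add(v.cell_line)

    kept = []
    for v in variants:
        if v.af_popmax is not None and v.af_popmax > MAX_POP_AF:
            continue  # common in at least one population
        if len(lines_per_variant[v.key]) > 1:
            continue  # recurrent across cell lines
        if v.vaf < MIN_VAF:
            continue  # low allele fraction
        if v.gene in HYPERVARIABLE_GENES or v.gene.startswith(HYPERVARIABLE_PREFIXES):
            continue  # highly variable gene
        if v.in_dbsnp and not (v.in_cosmic or v.in_icgc):
            continue  # known polymorphism without cancer-database support
        kept.append(v)
    return kept
```

Each filter is a separate early exit, so the thresholds and gene list can be adjusted independently. The resulting call set would then be passed to a signature-fitting tool such as mutSignatures, which is outside the scope of this sketch.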