Fitness of an allele (wi) was calculated from the enrichment of the synonyms of the wild-type gene ( ), the enrichment of allele i ( ) and the fold increase in the number of cells during the growth competition experiment (r) as described by Equation
We calculated the variance in the fitness as
where the frequency of allele i (f_i = c_i/c_T) is calculated from the counts of that allele (c_i) and the total sequencing counts (c_T).
From the variance in fitness, we calculated a 99% confidence interval. Additionally, we calculated a P-value using a two-tailed test. Details of the Z-score and P-value equations are available in Mehlhoff et al. (2020).
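As an illustration only, the step from a fitness estimate and its variance to a confidence interval and a two-tailed P-value might be sketched as follows. The sketch assumes a normal approximation and a null hypothesis of wild-type fitness (w = 1); both are assumptions made here, not details taken from this section, and the function name and arguments are hypothetical. The exact Z-score and P-value equations are those given in Mehlhoff et al. (2020).

```python
import math
from statistics import NormalDist


def fitness_statistics(w, var_w, null_fitness=1.0, confidence=0.99):
    """Return (ci_low, ci_high, z, p_two_tailed) for one fitness estimate.

    Assumptions of this sketch (not taken from the source text):
    the estimate is approximately normal with variance var_w, and the
    null hypothesis is that fitness equals null_fitness (wild type).
    """
    if var_w <= 0:
        raise ValueError("variance must be positive")
    sd = math.sqrt(var_w)

    # Critical value for a symmetric interval, e.g. about 2.576 for 99%.
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    ci_low = w - z_crit * sd
    ci_high = w + z_crit * sd

    # Z-score relative to the null, and the two-tailed normal P-value.
    # erfc keeps precision for large |z| where 1 - cdf would round to 0.
    z = (w - null_fitness) / sd
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return ci_low, ci_high, z, p_two_tailed
```

Under these assumptions, a mutation is called significant at a given threshold when the returned P-value falls below it; the 99% interval excludes the null fitness exactly when P < 0.01.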
To correct for multiple testing (Storey and Tibshirani 2003), we estimated the number of false positives expected at P < 0.01 and P < 0.001 significance in our DMS datasets as described previously (Mehlhoff et al. 2020). For TEM-1, we estimated that a single replica would contain approximately 55.0 false positives on average at P < 0.01 and 5.6 false positives on average at P < 0.001 (Mehlhoff et al. 2020). The corresponding values are 44.1 and 4.3 (CAT-I), 52.8 and 5.3 (NDM-1), and 33.8 and 3.4 (aadB) at P < 0.01 and P < 0.001 significance, respectively. To limit the occurrence of false positives, we chose to report the frequency of mutations whose fitness effects met the P-value criteria in both replica experiments.
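The two-replica criterion can be illustrated with a minimal sketch. It is not the estimation procedure of Mehlhoff et al. (2020) or of Storey and Tibshirani (2003). It assumes that P-values of mutations with no true effect are uniformly distributed and that replicas are independent, and all function names and mutation labels are hypothetical.

```python
def significant_in_both(p_rep1, p_rep2, alpha):
    """Return mutations with P < alpha in both replicas, sorted by name.

    p_rep1 and p_rep2 map a mutation label to its P-value in each replica.
    Mutations measured in only one replica are ignored.
    """
    shared = p_rep1.keys() & p_rep2.keys()
    return sorted(m for m in shared if p_rep1[m] < alpha and p_rep2[m] < alpha)


def expected_false_positives(n_null_tests, alpha, n_replicas=1):
    """Expected count of no-effect mutations passing P < alpha in every replica.

    Assumes uniform null P-values and independent replicas, so each
    no-effect mutation passes all replicas with probability alpha ** n_replicas.
    """
    return n_null_tests * alpha ** n_replicas


# Hypothetical P-values for three mutations in two replicas.
replica_1 = {"A10V": 0.0005, "G20S": 0.02, "L30P": 0.0001}
replica_2 = {"A10V": 0.004, "G20S": 0.0001, "L30P": 0.00002}
hits_p01 = significant_in_both(replica_1, replica_2, 0.01)
hits_p001 = significant_in_both(replica_1, replica_2, 0.001)
```

Under these assumptions, requiring significance in both replicas multiplies the per-replica expectation by another factor of the threshold. For a hypothetical 5,000 no-effect mutations at P < 0.01, about 50 false positives are expected in one replica but only about 0.5 in both.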