Sequencing reads were mapped with BWA, allowing a maximum of six mismatches and no gaps [46]. Amplicons carrying the same tag were grouped into a read cluster. Because each read cluster originated from the same template, a mutation was called as true only if it occurred in at least 90% of the reads within that cluster. We acknowledge that this error-correction approach corrects only errors that occurred during deep sequencing, not those introduced during reverse transcription. Read clusters containing fewer than three reads were filtered out. Each remaining read cluster was then conflated into a single “error-free” read. Average coverage in terms of “error-free” reads was 177028 per nucleotide in the plasmid mutant library, 112355 per nucleotide in replicate 1 of the passaged viral mutant library, and 161773 per nucleotide in replicate 2 of the passaged viral mutant library (Fig. S1A).

Relative fitness index (RF index) for individual point mutations was computed by:

For all downstream analyses, only point mutations covered by ≥30 tag-conflated (“error-free”) reads in the plasmid library were included. This arbitrary cutoff filtered out mutants with low statistical confidence, which accounted for ~16% of all possible point mutations (Fig. S1B). In addition, all C → A and G → T mutations were excluded from the reported dataset because of oxidative DNA damage observed during library preparation [47]. The RF index presented in Table S1 was calculated by averaging all RF indices available for a given amino acid substitution.
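The tag-based error correction and the inclusion filters described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the representation of a read as a (tag, set of mutations) pair, and the (position, reference base, mutant base) tuples are assumptions made for illustration. Only the thresholds (clusters of at least three reads, mutations present in at least 90% of a cluster, ≥30 conflated reads in the plasmid library, exclusion of C → A and G → T) come from the text.

```python
from collections import Counter, defaultdict

MIN_CLUSTER_SIZE = 3      # clusters with fewer than three reads are discarded
MIN_PLASMID_DEPTH = 30    # coverage cutoff in the plasmid library
EXCLUDED = {("C", "A"), ("G", "T")}  # substitutions dropped (oxidative damage)


def conflate_clusters(reads):
    """Group reads by tag and call one consensus mutation set per cluster.

    `reads` is an iterable of (tag, mutations) pairs; `mutations` is a set of
    hypothetical (position, ref_base, alt_base) tuples observed in that read.
    Returns {tag: set of mutations present in >= 90% of the cluster's reads}.
    """
    clusters = defaultdict(list)
    for tag, mutations in reads:
        clusters[tag].append(frozenset(mutations))

    consensus = {}
    for tag, members in clusters.items():
        if len(members) < MIN_CLUSTER_SIZE:
            continue  # too few reads to correct sequencing errors
        counts = Counter(m for muts in members for m in muts)
        # Integer form of "count / cluster size >= 0.9" to avoid float rounding.
        consensus[tag] = {
            m for m, n in counts.items() if 10 * n >= 9 * len(members)
        }
    return consensus


def passes_filters(ref_base, alt_base, plasmid_depth):
    """Apply the coverage cutoff and the C->A / G->T exclusion."""
    return (plasmid_depth >= MIN_PLASMID_DEPTH
            and (ref_base, alt_base) not in EXCLUDED)
```

In this sketch, a mutation seen in only one read of a three-read cluster is treated as a sequencing error and dropped, while a mutation shared by all three reads is kept in the conflated read.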