Primers and sequence adapters were removed with the Illumina MiSeq Reporter (version 2.5). The sequences were further processed using scripts implemented in the R statistical computing environment with the DADA2 (version 1.10.1) package (27) (scripts are available on GitHub at https://github.com/lakarstens/ControllingContaminants16S). Briefly, sequences were quality filtered and trimmed (forward reads to 230 nucleotides [nt] and reverse reads to 210 nt) prior to inferring amplicon sequence variants (ASVs) with the DADA2 algorithm. ASVs, which group similar sequences together according to a model that considers sequence abundance and sequencing error, were chosen over traditional operational taxonomic units (OTUs) because they offer finer resolution (28–30). Chimeric sequences were removed with the approach implemented in the DADA2 package. Taxonomy was assigned to each ASV to the genus level using the RDP Naive Bayesian Classifier (31) implemented in DADA2 with the SILVA database (version 132). The R package phyloseq (version 1.26.1) (32) was used to store the ASV table, taxonomy, and associated sample data and to calculate alpha-diversity measures. Expected values for alpha-diversity measures were calculated on the subset of the mock microbial community dilution samples that contained only expected sequences.
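As an illustration of the final step, the sketch below computes three common alpha-diversity measures (observed richness, Shannon index, and inverse Simpson index) for samples that contain only expected sequences. It is not the authors' code, which is written in R with phyloseq (phyloseq exposes such measures through its `estimate_richness` function). The text does not say which measures were used, so the choice of these three is an assumption. The sample names, ASV labels, counts, and the helper `only_expected` are likewise hypothetical, as is the reading that "expected values" come from samples in which every detected ASV belongs to the mock community.

```python
import math


def observed_richness(counts):
    """Number of ASVs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)), using the natural log."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p_i^2); 0.0 for an empty sample."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def only_expected(sample_counts, expected):
    """True if every ASV with reads in the sample is in the expected set."""
    return all(asv in expected for asv, n in sample_counts.items() if n > 0)


# Hypothetical mock-community data: ASV labels and read counts are made up.
expected_asvs = {"ASV1", "ASV2", "ASV3", "ASV4"}
samples = {
    "undiluted": {"ASV1": 25, "ASV2": 25, "ASV3": 25, "ASV4": 25},
    "diluted": {"ASV1": 40, "ASV2": 30, "ASV3": 10, "ASV4": 10, "ASV9": 10},
}

# Keep only samples whose detected ASVs are all expected (assumed reading).
clean = {name: c for name, c in samples.items() if only_expected(c, expected_asvs)}

for name, counts in clean.items():
    values = list(counts.values())
    print(
        name,
        observed_richness(values),
        round(shannon(values), 3),
        round(inverse_simpson(values), 3),
    )
```

In this toy data the "diluted" sample is excluded because it contains an ASV outside the expected set, so only the "undiluted" sample contributes to the expected values.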