To make the sequence data maximally comparable to the reference data, we assigned taxonomy to the OTUs and removed non-plant OTUs from each table. For efficiency, all OTUs were processed together: centroids from all 20 tables (6 SWARM, 5 VSEARCH, 3 CROP, 1 DADA2, 5 DADA2+VSEARCH) were pooled and dereplicated. The best GenBank matches for each OTU were retrieved with BLASTn (ref. 36) using the settings -qcov_hsp_perc 90 -perc_identity 80, keeping up to 20 matches per OTU. For each OTU, all hits from the best match down to matches half a percent (0.5%) lower than the best were retained, the most commonly assigned taxonomic ID among them was identified, and its taxonomic path (kingdom, phylum, class, order, family, genus, species) was obtained from the NCBI taxonomy. Ingroup OTUs were defined as those belonging to Streptophyta, excluding Chlorophyta, Sphagnopsida, Jungermanniopsida, Bryopsida, and Polytrichopsida. The 20 OTU tables and centroid files were then filtered to contain only ingroup OTUs.
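The hit-selection and ingroup rules described above can be illustrated with a minimal Python sketch. It is not the authors' script. The function names, the input layout (a list of taxonomic ID and percent-identity pairs per OTU, and a list of lineage names), and the tie-breaking behaviour are assumptions made for illustration. The sketch also assumes that "0.5% lower" means 0.5 percentage points of identity below the best hit.

```python
from collections import Counter

# Class- and phylum-level names excluded from the ingroup, as listed in the text.
EXCLUDED_TAXA = {
    "Chlorophyta", "Sphagnopsida", "Jungermanniopsida",
    "Bryopsida", "Polytrichopsida",
}


def consensus_taxid(hits, margin=0.5):
    """Return the most common taxid among near-best hits for one OTU.

    hits: list of (taxid, percent_identity) tuples (hypothetical layout).
    margin: identity window below the best hit, in percentage points
            (assumed interpretation of "0.5% lower than the best").
    """
    if not hits:
        return None
    best = max(pident for _, pident in hits)
    # Keep every hit from the best match down to (best - margin).
    retained = [taxid for taxid, pident in hits if pident >= best - margin]
    # Ties are resolved by first occurrence; the source does not say how
    # ties were handled, so this is an assumption.
    return Counter(retained).most_common(1)[0][0]


def is_ingroup(lineage):
    """True if a lineage (list of taxon names) is in Streptophyta and
    contains none of the excluded taxa."""
    names = set(lineage)
    return "Streptophyta" in names and not (names & EXCLUDED_TAXA)
```

For example, an OTU with hits at 99.0%, 98.7%, and 98.5% identity keeps all three, while a hit at 98.4% falls outside the window and does not contribute to the vote, however often its taxonomic ID occurs among the lower-scoring matches.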