Validating PICRUSt's Metagenome Prediction Accuracy

Several microbiome studies that included both 16S sequencing and WGS metagenome sequencing for the same samples were used to test the accuracy of PICRUSt. These included 530 paired human microbiome samples^{22}, 39 paired mammal gut samples^{24}, 14 paired soil samples^{34}, 10 paired hypersaline microbial mats^{23, 24} and two even/staggered synthetic mock communities from the HMP^{33}. We additionally used PICRUSt to make predictions on three 16S-only microbiome studies: 6,431 HMP samples (http://hmpdacc.org/HMQCP), 993 vaginal time course samples^{43} and 335 coral mucus samples (http://www.microbio.me/qiime/; Study ID 1854).
For 16S data, PICRUSt-compatible OTU tables were constructed using the closed-reference OTU picking protocol in QIIME 1.5.0-dev (pick_reference_otus_through_otu_table.py) against Greengenes+IMG using ‘uclust’^{48}. For paired metagenomes, WGS reads were annotated to KOs using v0.98 of HUMAnN^{30}. Expected KO counts for the HMP mock communities were obtained by multiplying the mixing proportions of community members by the annotated KO counts of their respective reference genomes in IMG. PICRUSt was used to predict the metagenomes from the 16S-based OTU tables, and predictions were compared to the annotated WGS metagenomes across all KOs using Spearman rank correlation. In addition, for the HMP and vaginal datasets, KOs were mapped to KEGG Module abundances following the conjunctive normal form as implemented in the HUMAnN script “pathab.py”, so that modules and pathways could be compared. Bray-Curtis distances (for beta-diversity comparison between OTU or PICRUSt KO abundances across samples) were calculated as implemented in the QIIME “beta_diversity.py” script. The PCA plot was created, and KEGG modules with significant mean proportion differences were identified, for both the HMP and vaginal datasets using STAMP v2.0^{36}.
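The KO-level comparison described above can be illustrated with a small, self-contained sketch. It is not PICRUSt or HUMAnN code: the function names, the toy KO counts, and the choice to treat a KO missing from one profile as a count of zero are all assumptions made for illustration. Spearman correlation is computed here as the Pearson correlation of tie-averaged ranks.

```python
import math
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over every value tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation; raises if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def spearman_across_kos(predicted: Dict[str, float], observed: Dict[str, float]) -> float:
    """Spearman correlation over the union of KOs in both profiles.

    Assumption: a KO absent from one profile is treated as a count of zero.
    """
    kos = sorted(set(predicted) | set(observed))
    pred = [predicted.get(ko, 0.0) for ko in kos]
    obs = [observed.get(ko, 0.0) for ko in kos]
    return pearson(average_ranks(pred), average_ranks(obs))


# Toy profiles (hypothetical values, not data from the study).
predicted_kos = {"K00001": 10, "K00002": 5, "K00003": 0}
observed_kos = {"K00001": 100, "K00002": 20, "K00003": 1}
rho = spearman_across_kos(predicted_kos, observed_kos)
```

Because the statistic depends only on ranks, the predicted and observed profiles do not need to be on the same scale, which suits a comparison between predicted gene counts and annotated read counts.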
The Nearest Sequenced Taxon Index (NSTI) was developed as an evaluation measure describing the novelty of organisms within an OTU table with respect to previously sequenced genomes. For every OTU in a sample, the sum of branch lengths between that OTU in the Greengenes tree and the nearest tip in the tree with a sequenced genome is weighted by the relative abundance of that OTU. These weighted OTU scores are then summed to give a single NSTI value per microbial community sample. PICRUSt calculates NSTI values for every sample in the given OTU table, and we compared NSTI scores and PICRUSt accuracies for all of the metagenome validation datasets.
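A minimal sketch of the NSTI calculation for one sample follows. It assumes the branch-length distance from each OTU to its nearest sequenced-genome tip has already been extracted from the tree; the function name, the input dictionaries, and the toy values are hypothetical and are not PICRUSt's own API.

```python
from typing import Dict


def nsti(otu_counts: Dict[str, float], nearest_genome_distance: Dict[str, float]) -> float:
    """Abundance-weighted mean distance to the nearest sequenced genome.

    otu_counts: OTU id -> count in one sample.
    nearest_genome_distance: OTU id -> summed branch length to the nearest
        tree tip that has a sequenced genome (assumed precomputed).
    """
    total = sum(otu_counts.values())
    if total <= 0:
        raise ValueError("sample has no counts")
    return sum(
        (count / total) * nearest_genome_distance[otu]
        for otu, count in otu_counts.items()
    )


# Toy sample (hypothetical OTU ids and distances).
sample_counts = {"otu_1": 75, "otu_2": 25}
distances = {"otu_1": 0.02, "otu_2": 0.10}
sample_nsti = nsti(sample_counts, distances)  # 0.75 * 0.02 + 0.25 * 0.10
```

A sample dominated by OTUs close to sequenced genomes gets a low score, while abundant OTUs far from any sequenced genome raise it.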
In the metagenome rarefaction analysis (Fig. 4), a given number of counts was randomly selected from either the collection of microbial OTUs for each sample (i.e., the 16S rRNA OTU table) or the collection of sequenced genes in that sample using the multiple_rarefactions.py script in QIIME 1.5.0-dev^{29}. To estimate the number of raw reads at which PICRUSt outperforms metagenomic sequencing, the annotated shotgun reads were transformed to total sequenced reads by dividing by the mean annotation rate from the original manuscript (17.3%), while 16S rRNA reads were transformed using the success rate for closed-reference OTU picking at a 97% 16S rRNA identity threshold (68.9%). Both the subsampled metagenome and the PICRUSt predictions from the subsampled OTU table were compared for accuracy using Spearman rank correlation versus the non-subsampled metagenome.
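The two operations in this analysis, subsampling a count table to a fixed depth and converting retained counts back to an estimated number of raw reads, can be sketched as follows. This is an illustration only, not the QIIME script: the function names are hypothetical, and subsampling without replacement is an assumption. The two rates are the ones quoted in the text.

```python
import random
from collections import Counter
from typing import Dict, Optional

ANNOTATION_RATE = 0.173   # mean fraction of shotgun reads that were annotated
OTU_PICKING_RATE = 0.689  # fraction of 16S reads retained by closed-reference picking


def rarefy(counts: Dict[str, int], depth: int, seed: Optional[int] = None) -> Dict[str, int]:
    """Draw `depth` counts at random, without replacement, from one sample."""
    # Expand the table into one entry per observed count, then sample from it.
    pool = [feature for feature, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the total number of counts")
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))


def raw_reads_required(retained_counts: float, success_rate: float) -> float:
    """Estimate raw reads needed to obtain `retained_counts` usable reads."""
    return retained_counts / success_rate


# Toy example (hypothetical KO counts for one sample).
ko_counts = {"K00001": 50, "K00002": 30, "K00003": 20}
subsampled = rarefy(ko_counts, depth=40, seed=1)
shotgun_raw = raw_reads_required(173, ANNOTATION_RATE)
amplicon_raw = raw_reads_required(689, OTU_PICKING_RATE)
```

Putting both data types on the common axis of raw reads is what allows the two accuracy curves to be compared at equal sequencing effort.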

Langille M.G., Zaneveld J., Caporaso J.G., McDonald D., Knights D., Reyes J.A., Clemente J.C., Burkepile D.E., Vega Thurber R.L., Knight R., Beiko R.G., & Huttenhower C. (2013). Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nature Biotechnology, 31(9), 814-821.

Publication 2013

16s rrna Coral Genes Genome Human microbiome Mammal Metagenome Microbial community Microbiome Mucus Tree Vaginal

Protocol cited in 1,640 other protocols

Variable analysis

independent variables

Microbiome studies with paired 16S sequencing and WGS metagenome sequencing data

dependent variables

Accuracy of PICRUSt predictions
Spearman rank correlation between PICRUSt predictions and annotated metagenomes
Bray-Curtis distances between OTU or PICRUSt KO abundances across samples
Differences in KEGG module abundances between PICRUSt predictions and annotated metagenomes
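
As an illustration of the Bray-Curtis measure listed above, the following stand-alone sketch computes the dissimilarity between two abundance profiles. It is not the QIIME "beta_diversity.py" implementation; the function name and toy profiles are hypothetical, and features missing from one sample are assumed to have abundance zero.

```python
from typing import Dict


def bray_curtis(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    features = set(a) | set(b)
    numerator = sum(abs(a.get(f, 0.0) - b.get(f, 0.0)) for f in features)
    denominator = sum(a.get(f, 0.0) + b.get(f, 0.0) for f in features)
    if denominator == 0:
        raise ValueError("both profiles are empty")
    return numerator / denominator


# Toy profiles (hypothetical abundances for two samples).
sample_a = {"K00001": 6, "K00002": 4}
sample_b = {"K00001": 4, "K00002": 6}
distance = bray_curtis(sample_a, sample_b)  # (2 + 2) / 20
```

The value ranges from 0 for identical profiles to 1 for profiles that share no features, and the same function applies to OTU tables and to predicted KO tables.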

control variables

Closed-reference OTU picking protocol in QIIME 1.5.0-dev against Greengenes+IMG using 'uclust'
HUMAnN v0.98 for annotating WGS reads to KOs
Mapping KOs to KEGG Module abundances using the conjunctive normal form as implemented in the HUMAnN script 'pathab.py'
STAMP v2.0 for PCA and identification of significant KEGG module proportion differences
Nearest Sequenced Taxon Index (NSTI) for evaluating novelty of organisms in OTU tables

Annotations

Based on most similar protocols

Closed-reference OTU picking against the Greengenes database is used to construct PICRUSt-compatible OTU tables from 16S data (Protocol 1, Protocol 2).

QIIME 1.5.0-dev is used for 16S data processing, including OTU picking and beta-diversity calculations (Protocol 1, Protocol 5).

HUMAnN v0.98 is used to annotate WGS reads to KOs for the paired metagenome samples (Protocol 1).

Expected KO counts for the HMP mock communities are obtained by multiplying the mixing proportions of community members by the annotated KO counts of their respective reference genomes in IMG (Protocol 1).
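
The mock-community calculation amounts to a proportion-weighted sum of per-genome KO counts. A minimal sketch follows, assuming the mixing proportions and genome annotations are available as dictionaries; the function name, organism labels, and counts are hypothetical placeholders, not values from the HMP mock communities.

```python
from collections import defaultdict
from typing import Dict


def expected_ko_counts(
    mixing_proportions: Dict[str, float],
    genome_ko_counts: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Weight each genome's KO counts by its mixing proportion and sum."""
    expected: Dict[str, float] = defaultdict(float)
    for organism, proportion in mixing_proportions.items():
        for ko, copies in genome_ko_counts[organism].items():
            expected[ko] += proportion * copies
    return dict(expected)


# Toy even community of two organisms (hypothetical labels and counts).
proportions = {"organism_a": 0.5, "organism_b": 0.5}
genomes = {
    "organism_a": {"K00001": 2, "K00002": 1},
    "organism_b": {"K00001": 4},
}
expected = expected_ko_counts(proportions, genomes)
```

For a staggered community, only the proportions change; the genome annotations stay the same.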

PICRUSt is used to predict metagenomes from the 16S-based OTU tables, and the predictions are compared to the annotated WGS metagenomes using Spearman rank correlation (Protocol 1).
