Notice on the previous plot that Lactobacillales appears to be a taxonomic Order with a bimodal abundance profile in the data. We can check for a taxonomic explanation of this pattern by plotting just that taxonomic subset of the data. For this, we subset with the subset_taxa() function, and then pass a more precise taxonomic rank to the Facet argument of the plot_abundance() function that we defined above.
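To make the subsetting step concrete, here is a minimal base-R sketch on a toy taxonomy table. It does not use phyloseq or the mouse data: the objects toy_tax and toy_abund and all values in them are invented for illustration. It only mimics the idea of keeping the taxa of one Order and then splitting them at the finer Genus rank.

```r
# Toy taxonomy table: one row per taxon, one column per rank.
# All names and values here are invented for illustration.
toy_tax <- data.frame(
  Order = c("Lactobacillales", "Lactobacillales", "Clostridiales", "Lactobacillales"),
  Genus = c("Lactobacillus", "Streptococcus", "Roseburia", "Lactobacillus"),
  row.names = c("taxon1", "taxon2", "taxon3", "taxon4"),
  stringsAsFactors = FALSE
)

# Toy relative abundances: taxa in rows, samples in columns.
toy_abund <- matrix(
  c(0.50, 0.10,
    0.05, 0.60,
    0.40, 0.25,
    0.05, 0.05),
  nrow = 4, byrow = TRUE,
  dimnames = list(rownames(toy_tax), c("sampleA", "sampleB"))
)

# Keep only the taxa in the Order of interest
# (the role subset_taxa() plays on a phyloseq object).
keep <- toy_tax$Order == "Lactobacillales"
sub_tax <- toy_tax[keep, , drop = FALSE]
sub_abund <- toy_abund[keep, , drop = FALSE]

# Split the subset by Genus: the tabular analogue of faceting by Genus.
genus_totals <- rowsum(sub_abund, group = sub_tax$Genus)
print(genus_totals)
```

If the genera within an Order show clearly different abundance levels, a mixture of those genera is one candidate explanation for a bimodal profile at the Order level.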

    psOrd = subset_taxa(ps3ra, Order == "Lactobacillales")
    plot_abundance(psOrd, Facet = "Genus", Color = NULL)

At this stage in the workflow, after converting raw reads to interpretable species abundances, and after filtering and transforming these abundances to focus attention on scientifically meaningful quantities, we are in a position to consider more careful statistical analysis. R is an ideal environment for performing these analyses, as it has an active community of package developers building simple interfaces to sophisticated techniques. As a variety of methods are available, there is no need to commit to any rigid analysis strategy a priori. Further, the ability to easily call packages without reimplementing methods frees researchers to iterate rapidly through alternative analysis ideas. The advantage of performing this full workflow in R is that this transition from bioinformatics to statistics is effortless.
We back these claims by illustrating several analyses on the mouse data prepared above. We experiment with several flavors of exploratory ordination before shifting to more formal testing and modeling, explaining the settings in which the different points of view are most appropriate. Finally, we provide example analyses of multitable data, using a study in which both metabolomic and microbial abundance measurements were collected on the same samples, to demonstrate that the general workflow presented here can be adapted to the multitable setting.

    .cran_packages <- c("knitr", "phyloseqGraphTest", "phyloseq", "shiny",
                        "miniUI", "caret", "pls", "e1071", "ggplot2", "randomForest",
                        "vegan", "plyr", "dplyr", "ggrepel", "nlme", "reshape2",
                        "devtools", "PMA", "structSSI", "ade4",
                        "igraph", "ggnetwork", "intergraph", "scales")
    .github_packages <- c("jfukuyama/phyloseqGraphTest")
    .bioc_packages <- c("phyloseq", "genefilter", "impute")
    # Install CRAN packages (if not already installed)
    .inst <- .cran_packages %in% installed.packages()
    if (any(!.inst)) {
      install.packages(.cran_packages[!.inst], repos = "http://cran.rstudio.com/")
    }
    .inst <- .github_packages %in% installed.packages()
    if (any(!.inst)) {
      devtools::install_github(.github_packages[!.inst])
    }
    .inst <- .bioc_packages %in% installed.packages()
    if (any(!.inst)) {
      source("http://bioconductor.org/biocLite.R")
      biocLite(.bioc_packages[!.inst])
    }

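The installation script above relies on biocLite, which current Bioconductor releases have replaced with the BiocManager package. The following is a hedged sketch of an equivalent setup, not the authors' script. The helper names missing_pkgs and install_missing are hypothetical, and the package lists are shortened for illustration. It also compares GitHub repositories by package name rather than by the full "owner/repo" string, which would otherwise never match an installed package.

```r
# Package lists are shortened here for illustration; substitute the full lists.
cran_pkgs    <- c("devtools", "ggplot2", "vegan")
bioc_pkgs    <- c("phyloseq", "genefilter", "impute")
github_repos <- c("jfukuyama/phyloseqGraphTest")

# Hypothetical helper: return the entries of `pkgs` that are not installed.
# `installed` defaults to the names of all installed packages.
missing_pkgs <- function(pkgs, installed = rownames(installed.packages())) {
  pkgs[!(pkgs %in% installed)]
}

# Hypothetical helper: install whatever is missing.
# Nothing is installed until this function is called.
install_missing <- function() {
  to_get <- missing_pkgs(cran_pkgs)
  if (length(to_get) > 0) install.packages(to_get)

  # BiocManager is the current Bioconductor installer.
  to_get <- missing_pkgs(bioc_pkgs)
  if (length(to_get) > 0) {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(to_get)
  }

  # Compare on the package name (the part after "owner/"), not the repo path.
  need <- !(basename(github_repos) %in% rownames(installed.packages()))
  if (any(need)) devtools::install_github(github_repos[need])
}
```

Calling install_missing() once at the start of a session would then install only the packages that are absent.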