Detailed Metagenomics Data Processing Protocol

The three datasets were processed with the read_trim_filter step in MOCAT, with the length cutoff set to 30 and the quality cutoff set to 20, using solexaqa for the mock community and the simulated metagenome, and fastx for the 124 gut metagenomes.
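The effect of the two cutoffs can be illustrated with a minimal Python sketch. It is not MOCAT's code: the function name `trim_and_filter` is hypothetical, the trimming rule (keep the longest contiguous run of bases at or above the quality cutoff) is a simplified assumption, and treating both cutoffs as inclusive is also an assumption.

```python
def trim_and_filter(quals, min_qual=20, min_len=30):
    """Return (start, end) of the longest run of bases with quality >= min_qual.

    `quals` is a list of per-base Phred scores for one read. Returns None when
    the longest run is shorter than min_len, i.e. the read is discarded.
    Illustrative only; the inclusive cutoffs are an assumption.
    """
    best_start, best_end = 0, 0
    run_start = None
    for i, q in enumerate(quals):
        if q >= min_qual:
            if run_start is None:
                run_start = i  # a new high-quality run begins here
            if i + 1 - run_start > best_end - best_start:
                best_start, best_end = run_start, i + 1
        else:
            run_start = None  # a low-quality base ends the current run
    if best_end - best_start < min_len:
        return None
    return best_start, best_end
```

For a read with five low-quality bases, forty high-quality bases and three low-quality bases, the sketch keeps positions 5 to 45 (end exclusive); a read whose best run is 29 bases long is dropped.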
Estimated taxonomic compositions for the simulated metagenome and the mock community were calculated in three steps. First, quality-trimmed and filtered reads from the mock community were screened against a FASTA file of Illumina adapter sequences (Table S5), using the screen_fastafile option with the e-value set to 0.01. Second, the screened reads from the mock community and the quality-trimmed and filtered reads from the simulated metagenome were mapped to, and filtered against, custom-made reference databases containing chromosome and plasmid sequences from the 22 mock genomes (Table S4) and the 100 genomes of the simulated metagenome (Table S2 in [13] and Table S3), respectively. This was done by executing the screen and filter commands with the length cutoff set to 30, the percentage identity set to 90, and paired_end_filtering set to yes for the simulated metagenome and to no for the mock community. Finally, the taxonomic composition was estimated using the calculate_coverage command.
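The filtering and coverage steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the behaviour of MOCAT's filter or calculate_coverage commands: the function name `estimate_composition`, the input layout, and the definition of coverage (aligned bases divided by reference length, then normalised to sum to 1) are all assumptions, and paired-end filtering is not modelled.

```python
from collections import defaultdict


def estimate_composition(alignments, ref_lengths, min_len=30, min_identity=90.0):
    """Estimate relative abundances from read alignments.

    alignments:  iterable of (reference_id, aligned_length, percent_identity)
    ref_lengths: dict mapping reference_id -> reference length in bp
    Alignments shorter than min_len or below min_identity are discarded.
    The coverage definition used here is an assumption for illustration.
    """
    covered = defaultdict(int)
    for ref, aln_len, identity in alignments:
        if aln_len >= min_len and identity >= min_identity:
            covered[ref] += aln_len  # count aligned bases per reference
    # Mean depth per reference: aligned bases / reference length
    coverage = {ref: bases / ref_lengths[ref] for ref, bases in covered.items()}
    total = sum(coverage.values())
    if not total:
        return {}
    # Normalise so the abundances sum to 1
    return {ref: cov / total for ref, cov in coverage.items()}
```

With two references of equal length, one receiving 200 aligned bases and the other 100 after filtering, the sketch reports abundances of roughly two thirds and one third.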
Assembly and gene prediction on the simulated metagenome and the mock community were performed using the assembly (SOAPdenovo version 1.06) and gene_prediction (MetaGeneMark) options. Quality-trimmed and filtered reads from the simulated metagenome, and adapter-screened reads from the mock community, were assembled into scaftigs of 60 bp or longer. Predicted complete genes were aligned to their respective metagenomes using blastall v2.2.26 [26] (program blastn, 95% sequence identity, alignment length ≥ 90%, and e-value 0.1), and only the best hit was retained.
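The best-hit selection described above can be sketched in Python, assuming the blastn results are available in the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, e-value, bit score). The function name `best_hits` is hypothetical. Interpreting "alignment length ≥ 90%" as a fraction of the predicted gene's length, and ranking hits by bit score, are both assumptions, since the source specifies neither.

```python
def best_hits(blast_lines, gene_lengths, min_identity=95.0,
              min_frac=0.90, max_evalue=0.1):
    """Return {gene_id: best_subject_id} for hits passing all thresholds.

    blast_lines:  iterable of tab-separated BLAST tabular lines
    gene_lengths: dict mapping gene_id -> gene length in bp
    The length criterion (fraction of gene length) is an assumption.
    """
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.rstrip("\n").split("\t")
        query, subject = f[0], f[1]
        identity, aln_len = float(f[2]), int(f[3])
        evalue, bitscore = float(f[10]), float(f[11])
        if identity < min_identity or evalue > max_evalue:
            continue
        if aln_len < min_frac * gene_lengths[query]:
            continue
        # Keep the highest-scoring hit seen so far for this gene
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {q: s for q, (s, _) in best.items()}
```

Genes with no hit passing the thresholds are simply absent from the returned dictionary, which makes it easy to count how many predicted genes were recovered.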
The 124 human gut microbiomes were processed with and without 5′ trimming. The 5′-trimmed reads were assembled with SOAPdenovo 1.05, using both the k-mer size determined by MOCAT and a fixed k-mer size of 23. These assemblies were revised with SOAPdenovo 1.06 using the assembly_revision option, and genes were predicted, with MetaGeneMark as the selected software, on scaftigs from both the assemblies and the revised assemblies. The non-5′-trimmed and 5′-trimmed reads were mapped to the assembled scaftigs with the screen option, using a length cutoff of 30 and a quality cutoff of 15.
Complete commands for processing the simulated metagenome and mock community in MOCAT are bundled with the installation of the pipeline.

Kultima J.R., Sunagawa S., Li J., Chen W., Chen H., Mende D.R., Arumugam M., Pan Q., Liu B., Qin J., Wang J., & Bork P. (2012). MOCAT: A Metagenomics Assembly and Gene Prediction Toolkit. PLoS ONE, 7(10), e47656.

Publication 2012

Chromosome Genes Genomes Gut microbiomes Human Human microbiomes Metagenome Plasmid

Corresponding Organization : Max Delbrück Center

Other organizations : BGI Group (China), South China University of Technology, Bioscience (China)

Variable analysis

independent variables

Length cutoff set to 30
Quality cutoff set to 20
E-value set to 0.01
Percentage identity set to 90
Paired_end_filtering set to yes for the simulated metagenome and no for the mock community
Scaftigs 60 bp or longer
95% sequence identity
Alignment length >= 90%
E-value 0.1
K-mer size determined by MOCAT and a fixed k-mer size of 23
Length cutoff 30
Quality cutoff 15

dependent variables

Taxonomic compositions for the simulated metagenome and the mock community
Assembly and gene prediction on the simulated metagenome and mock community
Assembly and gene prediction on the 124 human gut microbiomes

control variables

Solexaqa for the mock community and the simulated metagenome
Fastx for the 124 gut metagenomes
Custom-made reference databases with chromosome and plasmid sequences from the 22 mock genomes and 100 genomes from the simulated metagenome

positive controls

Mock community

negative controls

Not explicitly mentioned
