Four assembly methods were compared for their efficiency in detecting viral reads in the three metagenome datasets: iterative exhaustive assembly (Schurch et al., 2014), the CLC Genomics Workbench 6.0.4 assembler (CLC bio, Aarhus, Denmark), Genovo version 0.4 (Laserson et al., 2011), and Newbler 2.5 (Roche). Iterative exhaustive assembly was the method originally used. It is part of a virus discovery pipeline written in Python 2.7 that includes read trimming and an initial assembly with Newbler (454GS Assembler version 2.7, Roche) using standard parameters. The trimmed reads and initial contigs were then assembled with CAP3 (VersionDate: 12/21/07) (Huang and Madan, 1999) using standard parameters. The resulting singletons and contigs were reassembled iteratively with CAP3 until no new contigs were formed.
Subsequently, the trimmed reads were mapped back to the identified taxonomic units with Newbler (454 GSMapper version 2.7, Roche) using standard parameters (Schurch et al., 2014). The CLC Genomics Workbench 6.0.4 assembler (CLC bio, Aarhus, Denmark) was run on the previously trimmed reads with automatic bubble and word size. Genovo version 0.4 was run with 40 iterations and otherwise default values (Laserson et al., 2011). Newbler 2.5 (Roche) was run with default values.
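The stopping rule of the iterative exhaustive assembly (reassemble contigs and singletons until a round forms no new contigs) can be illustrated with a minimal Python sketch. This is not the authors' pipeline code. The function names, the `max_rounds` safety limit, and the toy exact-overlap merger standing in for a single CAP3 run are all assumptions made for illustration; in practice `assemble_once` would presumably wrap an external CAP3 call.

```python
def overlap(a, b, min_overlap):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_overlap)."""
    for k in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def toy_assemble(seqs, min_overlap=3):
    """Hypothetical stand-in for one assembler run (NOT CAP3).

    Merges disjoint pairs of sequences that share an exact suffix/prefix
    overlap and returns (contigs, singlets).
    """
    used = set()
    contigs = []
    for i, a in enumerate(seqs):
        if i in used:
            continue
        for j, b in enumerate(seqs):
            if j == i or j in used:
                continue
            k = overlap(a, b, min_overlap)
            if k:
                contigs.append(a + b[k:])  # join the pair, keeping the overlap once
                used.update((i, j))
                break
    singlets = [s for i, s in enumerate(seqs) if i not in used]
    return contigs, singlets


def iterate_until_stable(sequences, assemble_once, max_rounds=100):
    """Re-run assemble_once on contigs + singlets until a round yields no new contigs.

    max_rounds is an assumed safety limit, not a parameter from the paper.
    Returns the final sequence set and the number of rounds run.
    """
    current = list(sequences)
    for rounds in range(1, max_rounds + 1):
        contigs, singlets = assemble_once(current)
        if not contigs:  # nothing merged in this round: assembly is exhausted
            return current, rounds
        current = contigs + singlets
    raise RuntimeError("assembly did not converge within max_rounds")


# Toy input: three overlapping reads and one unrelated read.
reads = ["ATGCGT", "CGTTAC", "TACGGA", "GGGGGG"]
final, rounds = iterate_until_stable(reads, toy_assemble)
```

With this toy input the three overlapping reads collapse into a single contig over two merging rounds, and a third round that forms no contig ends the loop. Passing the assembler in as a function keeps the convergence logic separate from the tool that performs each round.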