Sequences from virus-size samples (0.1 μm to 50 kDa) derived from 454 metagenomic sequence libraries were co-assembled using the Newbler assembler (Roche). Prior to assembly, low-complexity sequences were identified and removed using DUST (82). The sequences were first assembled using Newbler with default parameters, including the minimum identity (-mi) set at 86 and the -rip option, which places each read in only one contig. Following the initial assembly, downsampling (bioinformatic normalization) was performed as previously described by Allen et al. (83). Briefly, reads from regions of the genome with high coverage were randomly removed until coverage fell within 2 standard deviations (SDs) of the average contig coverage. These methods were implemented to improve assembly metrics, i.e., fewer contigs, greater contig length, and higher N50 values.
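The downsampling step and the N50 metric mentioned above can be illustrated with a minimal Python sketch. It is not the implementation of Allen et al. (83): the function names (`coverage_cap`, `downsample_contig`, `n50`) are hypothetical, the coverage ceiling is assumed to be the mean plus 2 population SDs, and reads are assumed to be thinned per contig rather than per region, none of which is specified in the text.

```python
import random
import statistics


def coverage_cap(contig_coverages, n_sd=2.0):
    """Assumed ceiling: mean contig coverage plus n_sd population SDs."""
    mean = statistics.mean(contig_coverages)
    sd = statistics.pstdev(contig_coverages)
    return mean + n_sd * sd


def downsample_contig(read_ids, coverage, cap, rng):
    """Randomly keep a subset of reads so expected coverage is about `cap`.

    Contigs already at or below the cap are returned unchanged.
    """
    reads = list(read_ids)
    if coverage <= cap:
        return reads
    # Keep a fraction cap/coverage of the reads, but at least one read.
    keep = max(1, round(len(reads) * cap / coverage))
    return rng.sample(reads, keep)


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths given")
```

In such a workflow, the retained reads would be passed back to the assembler for a second assembly, and `n50` would be compared before and after downsampling. A fixed seed, for example `random.Random(0)`, makes the random selection repeatable.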