Four publicly available metagenomes were selected for evaluation: a metagenome sampled from acid mine drainage (Tyson et al., 2004; accession numbers AADL01000110.1-AADL01001068.1, CH003545.1-CH004435.1, DS995259.1-DS995275.1), one obtained from enhanced biological phosphate removing (EBPR) sludge (Martin et al., 2006; accession numbers AATN01000001.1-AATN01011188.1), one from an
Olavius algarvensis microbial symbiont community (Woyke et al., 2006; accession numbers AASZ00000000.1, DS021108.1-DS022223.1), and one from an Antarctic whale fall bone (Tringe et al., 2005; accession numbers AAGA01000001.1-AAGA01026232.1). In addition, an artificial metagenome was used (SimBG; Saeed et al., 2011). All five metagenomes were also used for evaluation by Saeed et al. (2011). Additional information is provided in Table 1. For evaluation of the real metagenomes, we assumed that the annotations provided by the authors of the original studies were correct. In the EBPR case no annotations were provided, so we used the published genome of
Candidatus “Accumulibacter phosphatis” as the reference. After binning, a bin was assigned to each population and the accuracy was calculated as the number of correctly binned nucleotides divided by the total number of nucleotides in the bin (×100%). Recall was calculated as the number of nucleotides of the source organism assigned to the bin divided by the total number of nucleotides of the source organism present in the metagenome (×100%). For the Whale Fall metagenome, accuracy and recall could not be evaluated because the original authors reported that binning was unsuccessful. The results (accuracy, recall, and computation time) were compared with those of two comparable, previously published, state-of-the-art
de novo compositional binners (Kelley and Salzberg, 2010; Saeed et al., 2011). For SCIMM (Kelley and Salzberg, 2010), bins were seeded with a single trial of LikelyBin and the algorithm was run multiple times with different estimates for the number of populations. Table 1 shows only the results for the optimal choice. 2T binner was run with the default options.
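The accuracy and recall definitions given above can be illustrated with a minimal Python sketch. It is not the evaluation code used in the study: the function name `bin_accuracy_and_recall`, the tuple layout of the contig records, and the toy contig lengths are assumptions made purely for illustration, and the handling of unbinned contigs (counted in the recall denominator only) is one plausible reading of the definitions.

```python
from collections import defaultdict


def bin_accuracy_and_recall(contigs, bin_to_population):
    """Per-bin accuracy and recall, both in percent of nucleotides.

    contigs: iterable of (length_nt, source_population, assigned_bin);
             assigned_bin is None for contigs that were left unbinned.
    bin_to_population: dict mapping each bin to the population it was assigned to.
    Returns {bin: (accuracy_percent, recall_percent)}.
    """
    bin_total = defaultdict(int)    # nucleotides placed in each bin
    bin_correct = defaultdict(int)  # nucleotides in a bin that belong to its population
    pop_total = defaultdict(int)    # nucleotides of each population in the metagenome

    for length, source, assigned_bin in contigs:
        pop_total[source] += length
        if assigned_bin is None:
            continue  # assumption: unbinned contigs only enlarge the recall denominator
        bin_total[assigned_bin] += length
        if bin_to_population.get(assigned_bin) == source:
            bin_correct[assigned_bin] += length

    results = {}
    for b, population in bin_to_population.items():
        accuracy = 100.0 * bin_correct[b] / bin_total[b] if bin_total[b] else 0.0
        recall = (100.0 * bin_correct[b] / pop_total[population]
                  if pop_total[population] else 0.0)
        results[b] = (accuracy, recall)
    return results


# Hypothetical toy data: two populations (A, B) and two bins.
toy_contigs = [
    (600, "A", "bin1"),
    (200, "B", "bin1"),
    (400, "A", "bin2"),
    (1200, "B", "bin2"),
    (200, "B", None),  # unbinned contig
]
toy_assignment = {"bin1": "A", "bin2": "B"}
toy_results = bin_accuracy_and_recall(toy_contigs, toy_assignment)
```

In this toy case bin1 holds 800 nucleotides, 600 of them from population A, which contributes 1000 nucleotides to the metagenome, so its accuracy is 75% and its recall is 60%.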
Strous M., Kraft B., Bisdorf R., & Tegetmeyer H.E. (2012). The Binning of Metagenomic Contigs for Microbial Physiology of Mixed Cultures. Frontiers in Microbiology, 3, 410.