In total, six bacterial datasets were used to test the performance of the software. They comprise Illumina MiSeq, Roche-454 and PacBio RS reads from Escherichia coli (K12 MG1655), Escherichia coli (O157:H7 F8092B), Bibersteinia trehalosi (USDA-ARS-USMARC-192), Mannheimia haemolytica (USDA-ARS-USMARC-2286), Francisella tularensis (99A-2628) and Salmonella enterica (Newport SN31241). The datasets were downloaded from http://www.cbcb.umd.edu/software/PBcR/closure/index.html and are further described in Koren et al. (2013). Dataset statistics are displayed in Table 1. To assess assembly correctness, we used closely related reference genomes deposited in the NCBI database (E. coli K12 MG1655 = NC_000913; E. coli O157:H7 = NC_002127, NC_002128, NC_002695; F. tularensis = NC_008369; S. enterica = NC_011079, NC_011080, NC_009140). For B. trehalosi and M. haemolytica, no reference genome is currently available.
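For readers who want to retrieve the same reference sequences, the following is a minimal sketch, not the procedure used in this study. It assumes the NCBI E-utilities `efetch` endpoint and simply builds one FASTA download URL per accession listed above; the names `REFERENCES`, `efetch_url` and `reference_urls` are hypothetical and introduced only for illustration.

```python
from urllib.parse import urlencode

# Reference accessions as listed in the text. B. trehalosi and
# M. haemolytica are omitted because no reference genome is given.
REFERENCES = {
    "E. coli K12 MG1655": ["NC_000913"],
    "E. coli O157:H7": ["NC_002127", "NC_002128", "NC_002695"],
    "F. tularensis": ["NC_008369"],
    "S. enterica": ["NC_011079", "NC_011080", "NC_009140"],
}

# Assumption: sequences are fetched through NCBI E-utilities (efetch).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession):
    """Build the efetch URL that returns one accession as FASTA text."""
    query = urlencode(
        {"db": "nuccore", "id": accession, "rettype": "fasta", "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


def reference_urls(references=REFERENCES):
    """Map each organism to the download URLs of its reference sequences."""
    return {
        organism: [efetch_url(acc) for acc in accessions]
        for organism, accessions in references.items()
    }
```

Each URL can then be passed to any HTTP client (for example `urllib.request.urlretrieve`) to save the sequence locally; this sketch only constructs the URLs and performs no network access itself.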