Evaluating FragGeneScan Performance on Microbial Genomes
Nine complete genomes spanning a range of GC contents, together with their annotations, were downloaded from the NCBI website (http://www.ncbi.nlm.nih.gov/) (Table 1); this set does not overlap with the genomes used for training. To systematically test FragGeneScan, reads of various lengths (100, 200, 400 and 700 bp) and sequencing error rates (0–3%) were simulated from these genomes using MetaSim (10). For each genome, up to 1-fold coverage of reads was sampled for each combination of read length and sequencing error rate. Based on current estimates of sequencing error rates (10), Sanger reads of 700 bp were simulated with error rates from 0% to 1%, and 454 reads were simulated with error rates from 0% to 3%.
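The sampling design above can be sketched as simple arithmetic: at 1-fold coverage, the number of reads is roughly the genome size divided by the read length. The sketch below is illustrative only and does not invoke MetaSim. The function names, the 0.5% step between error rates, and the assignment of the 100, 200 and 400 bp reads to 454 are assumptions; the text states only the read lengths and the 0–1% (Sanger, 700 bp) and 0–3% (454) ranges.

```python
READ_LENGTHS_BP = (100, 200, 400, 700)  # read lengths given in the text


def reads_for_coverage(genome_size_bp, read_length_bp, coverage=1.0):
    """Reads needed so that total sequenced bases = coverage x genome size."""
    # coverage = n_reads * read_length / genome_size
    return round(coverage * genome_size_bp / read_length_bp)


def error_rates(max_rate_pct, step_pct=0.5):
    """Error rates from 0 to max_rate_pct; the 0.5% step is an assumption."""
    n_steps = round(max_rate_pct / step_pct)
    return [i * step_pct for i in range(n_steps + 1)]


def simulation_grid(max_454_pct=3.0, max_sanger_pct=1.0):
    """All (read length, error rate %) combinations to simulate."""
    grid = []
    for length in READ_LENGTHS_BP:
        # Assumption: 700 bp reads are Sanger, shorter reads are 454.
        max_rate = max_sanger_pct if length == 700 else max_454_pct
        grid.extend((length, rate) for rate in error_rates(max_rate))
    return grid


if __name__ == "__main__":
    genome_size_bp = 4_600_000  # roughly the size of E. coli K-12 MG1655
    for length, rate in simulation_grid():
        n_reads = reads_for_coverage(genome_size_bp, length)
        print(f"{length} bp, {rate:.1f}% error: up to {n_reads} reads")
```

Because the text says "up to" 1-fold coverage, the value returned by `reads_for_coverage` is best read as an upper bound on the number of reads sampled per condition.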
Table 1. Genomes of microbial species used to evaluate the performance of FragGeneScan
Species | GenBank Acc. | GC (%) | Genome size (Mb) | No. of genes
Buchnera aphidicola str. APS | NC_002528 | 26 | 0.6 | 564
Burkholderia pseudomallei K96243 chr1 | NC_006350 | 67 | 4.1 | 3399
Bacillus subtilis subsp. subtilis str. 168 | NC_000964 | 43 | 4.2 | 4105
Corynebacterium jeikeium K411 | NC_007164 | 61 | 2.5 | 2104
Chlorobium tepidum TLS | NC_002932 | 56 | 2.2 | 2252
Escherichia coli str. K-12 substr. MG1655 | NC_000913 | 50 | 4.6 | 4132
Helicobacter pylori J99 | NC_000921 | 39 | 1.6 | 1489
Prochlorococcus marinus str. MIT 9312 | NC_007577 | 31 | 1.7 | 1810
Wolbachia endosymbiont str. TRS | NC_006833 | 34 | 1.1 | 805
Three real metagenomes were used to evaluate gene prediction in metagenomic sequences (Supplementary Table S3). Two of them (TS28 and TS50), from the twin obese and lean study (14), were downloaded from the MG-RAST website (http://metagenomics.nmpdr.org). The third (SRX007415), from the rumen microbiota response study, was downloaded from the NCBI website (http://www.ncbi.nlm.nih.gov). All three metagenomes were searched with BLASTX against a 98% non-redundant set of protein sequences from prokaryotic genomes, plasmids and phages collected from IMG 3.0 (http://img.jgi.doe.gov), using an E-value cutoff of 1.0e-3 for TS28 and TS50 and 1.0e-1 for SRX007415 (which has shorter reads). FragGeneScan gene predictions in these metagenomes were compared with the similarity search results.
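One way such a comparison could be set up is sketched below: BLASTX hits are filtered with the per-dataset E-value cutoff, and the reads with a significant hit are compared with the reads on which a gene was predicted. This is a minimal sketch under stated assumptions, not the authors' procedure. It assumes BLAST tabular output (12 tab-separated columns, query ID first, E-value in the 11th), and the read-level overlap counts and function names are hypothetical; the text does not say how the comparison was scored.

```python
# E-value cutoffs per metagenome, as given in the text.
EVALUE_CUTOFFS = {"TS28": 1e-3, "TS50": 1e-3, "SRX007415": 1e-1}


def reads_with_hits(blast_lines, cutoff):
    """Return IDs of reads having at least one hit with E-value <= cutoff."""
    hits = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if float(fields[10]) <= cutoff:  # column 11 holds the E-value
            hits.add(fields[0])          # column 1 holds the query (read) ID
    return hits


def compare(predicted_reads, hit_reads):
    """Count reads by agreement between gene prediction and similarity search."""
    predicted = set(predicted_reads)
    return {
        "hits_with_prediction": len(predicted & hit_reads),
        "hits_without_prediction": len(hit_reads - predicted),
        "predictions_without_hit": len(predicted - hit_reads),
    }


if __name__ == "__main__":
    # Toy records for illustration only; these are not data from the study.
    toy_blast = [
        "read1\tprotA\t90.0\t30\t3\t0\t1\t90\t1\t30\t1e-5\t50.0",
        "read2\tprotB\t60.0\t25\t10\t0\t1\t75\t1\t25\t0.05\t30.0",
    ]
    hit_reads = reads_with_hits(toy_blast, EVALUE_CUTOFFS["TS28"])
    print(compare({"read1", "read3"}, hit_reads))
```

A looser cutoff admits more hits: with the toy records above, `read2` is excluded at 1.0e-3 but retained at 1.0e-1, which is the cutoff the text applies to the shorter reads of SRX007415.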