Viral Genome Host Separation Approach

The 25,000 viral protein families (VPFs) used to identify UViGs were queried against the ViralZone database (12), where viral hosts were predicted at different taxonomic levels. Of these, 11,400 VPFs had at least one hit to the virus genomes, with an average of 6.8 hits per model. For each VPF, a score between 0 and 1 was obtained by dividing the number of hits with a uniform distribution (present in only a single host domain) by the total number of VPF hits [i.e. score = (#uniform hits/#total hits)]. When the total number of hits was below the average number of hits, we corrected the score as follows: [(#uniform hits/#total hits) × (#total hits/average #hits)].
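
The scoring rule above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `vpf_score` and its parameter names are hypothetical, and the hit counts are assumed to have been tallied already. The default average of 6.8 hits per model is taken from the text.

```python
def vpf_score(uniform_hits: int, total_hits: int, average_hits: float = 6.8) -> float:
    """Score a VPF between 0 and 1 from its hit counts (illustrative sketch).

    uniform_hits: hits with a uniform distribution (single host domain).
    total_hits:   all hits of the VPF to the reference virus genomes.
    average_hits: mean number of hits per model (6.8 in the text).
    """
    if total_hits <= 0:
        raise ValueError("total_hits must be positive")
    if not 0 <= uniform_hits <= total_hits:
        raise ValueError("uniform_hits must lie between 0 and total_hits")

    # Base score: fraction of hits that fall in a single host domain.
    score = uniform_hits / total_hits

    # Correction for poorly sampled VPFs: scale down when the VPF has
    # fewer hits than the average model.
    if total_hits < average_hits:
        score *= total_hits / average_hits

    return score
```

Under this sketch, a VPF with seven hits that all fall in one host domain scores 1.0, whereas a VPF with three hits in one host domain is scaled down to 3/6.8, which matches the statement that maximum-score models were found in at least seven known viral genomes.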
A total of 3,788 VPFs were assigned the maximum score of 1.0; these are models found in at least seven known viral genomes and with a uniform domain distribution. The presence of these VPFs across the UViGs allowed us to separate 65% of the viral genomes into prokaryotic viruses (bacteriophages and archaeal viruses) or eukaryotic viruses.
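
As a rough illustration of how the 1.0-score VPFs could separate genomes, the sketch below assigns a genome to a host group from the marker VPFs it carries. The text does not say how genomes carrying conflicting markers were handled, so leaving them unassigned is an assumption made here. The names `classify_genome`, `host_group`, and `marker_domains`, as well as the domain labels, are hypothetical.

```python
from typing import Dict, Iterable, Optional

# Assumed labels for the host domain of each 1.0-score VPF.
PROKARYOTIC_DOMAINS = {"bacteria", "archaea"}
EUKARYOTIC_DOMAINS = {"eukaryota"}


def host_group(domain: str) -> str:
    """Map a host domain label to 'prokaryotic' or 'eukaryotic'."""
    if domain in PROKARYOTIC_DOMAINS:
        return "prokaryotic"
    if domain in EUKARYOTIC_DOMAINS:
        return "eukaryotic"
    raise ValueError(f"unknown host domain: {domain}")


def classify_genome(
    genome_vpfs: Iterable[str], marker_domains: Dict[str, str]
) -> Optional[str]:
    """Return the host group of a genome, or None if it cannot be separated.

    genome_vpfs:    identifiers of the VPFs detected in the genome.
    marker_domains: host domain of each 1.0-score VPF, keyed by identifier.
    """
    groups = {
        host_group(marker_domains[vpf])
        for vpf in genome_vpfs
        if vpf in marker_domains
    }
    # Assumption: assign only when all markers agree; genomes with no
    # marker VPFs or with conflicting markers stay unassigned.
    return next(iter(groups)) if len(groups) == 1 else None
```

A genome carrying both a bacterial and an archaeal marker is still returned as prokaryotic under this sketch, because the separation is made only at the prokaryotic versus eukaryotic level.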
This approach was benchmarked using the host assignment of the viral genomes containing pVOGs (13) with homology to our 1.0-score VPFs (2,037 pVOGs with ≥95% homology based on hhsearch (14)). Our classification was consistent with the classification in the pVOG database in 98.6% of the cases. The remaining 1.4% were viruses annotated as ‘archaea-bacteria’ viruses in the pVOG database that were identified as either bacterial or archaeal viruses using our approach. Thus, we estimate that this method was 100% consistent in separating prokaryotic from eukaryotic viruses.

Paez-Espino D., Roux S., Chen I.M., Palaniappan K., Ratner A., Chu K., Huntemann M., Reddy T.B., Pons J.C., Llabrés M., Eloe-Fadrosh E.A., Ivanova N.N., & Kyrpides N.C. (2018). IMG/VR v.2.0: an integrated data management and analysis system for cultivated and environmental viral genomes. Nucleic Acids Research, 47(Database issue), D678-D686.

Publication 2018

Archaea Archaeal viruses Bacteria Bacteria viruses Eukaryotic Prokaryotic Viral genomes Viral protein Viruses

Corresponding Organization : Joint Genome Institute

Other organizations : Lawrence Berkeley National Laboratory, Universitat de les Illes Balears

Protocol cited in 32 other protocols

Variable analysis

independent variables

Querying the 25,000 viral protein families (VPFs) against the ViralZone database

dependent variables

Obtaining a score value (between 0 and 1) for each VPF, representing the ratio of 'uniform hits' (hits to viral genomes with a uniform host domain distribution) to 'total hits'
Classifying viral genomes into prokaryotic (bacteriophages and archaeal viruses) or eukaryotic viruses based on the presence of VPFs with a maximum 1.0 score

control variables

Not explicitly mentioned

controls

Positive control: The host assignment of the viral genomes containing pVOGs (2,037 pVOGs) with ≥95% homology to the 1.0-score VPFs, which were consistent with the classification in the pVOG database in 98.6% of the cases.
Negative control: Not applicable based on the information provided.
