Curation of Bacterial Virulence Factors

Bacterial virulent protein sequences were retrieved from SWISS-PROT [20] and VFDB, an integrated and comprehensive database of virulence factors of bacterial pathogens [21]. SWISS-PROT sequences were retrieved using keywords such as virulence, adhesin, adhesion, adherence, toxin, invasion, capsule and other terms related to virulence factors. The VFDB and SWISS-PROT sequences were then screened strictly to obtain a high-quality dataset: entries annotated as "Probable", "Putative", "By similarity", "Fragments", "Hypothetical", "Unknown" or "Possible" were removed. The filtering yielded 1756 annotated virulent protein sequences (henceforth referred to as the positive dataset).
For training with non-virulent protein sequences, we selected 3000 annotated protein sequences of bacterial enzymes and other non-virulent proteins from the SWISS-PROT database (henceforth referred to as the negative dataset). The negative dataset was drawn mainly from the proteomes of the bacteria whose virulent protein sequences are included in the positive dataset.
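The screening described above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' code: the `Record` type, its field names, and the function names are hypothetical, and matching the excluded terms case-insensitively against a free-text description is an assumption about how the annotations are stored. The negative-set helper only restricts candidates to organisms represented in the positive set; how the final 3000 sequences were picked is not stated in the text and is not modelled here.

```python
import re
from dataclasses import dataclass

# Annotation qualifiers listed in the text as grounds for exclusion.
# "fragment" is used as a stem so that both "Fragment" and "Fragments" match.
EXCLUDED_TERMS = (
    "probable", "putative", "by similarity", "fragment",
    "hypothetical", "unknown", "possible",
)
_EXCLUDED_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in EXCLUDED_TERMS) + r")",
    re.IGNORECASE,
)


@dataclass(frozen=True)
class Record:
    """Hypothetical minimal view of a database entry."""
    accession: str
    description: str
    organism: str
    sequence: str


def is_confidently_annotated(record: Record) -> bool:
    """True if the description carries none of the excluded qualifiers."""
    return _EXCLUDED_RE.search(record.description) is None


def build_positive_set(records):
    """Keep only virulence-factor entries that pass the annotation screen."""
    return [r for r in records if is_confidently_annotated(r)]


def select_negative_candidates(records, positive):
    """Keep non-virulent entries from organisms present in the positive set.

    Entries whose accession already appears in the positive set are skipped.
    """
    organisms = {r.organism for r in positive}
    positive_ids = {r.accession for r in positive}
    return [
        r for r in records
        if r.organism in organisms and r.accession not in positive_ids
    ]
```

In this sketch an entry described as "Putative adhesin" or "Invasin (Fragment)" would be dropped, while one described as "Cholera enterotoxin subunit A" would be kept.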

Garg A, & Gupta D. (2008). VirulentPred: a SVM based prediction method for virulent proteins in bacterial pathogens. BMC Bioinformatics, 9, 62.

Publication 2008

Adhesin Bacterial Bacterial protein Capsule Enzymes Protein sequences Proteomes Toxin Virulence Virulence factors

Other organizations: International Centre for Genetic Engineering and Biotechnology

Protocol cited in 54 other protocols

Variable analysis

independent variables

Bacterial virulent protein sequences retrieved from SWISS-PROT and VFDB databases
Non-virulent protein sequences of bacterial enzymes and other non-virulent proteins selected from SWISS-PROT database

dependent variables

Not explicitly mentioned

control variables

Bacterial virulent protein sequences were strictly screened to obtain a high-quality dataset by removing entries annotated as 'Probable', 'Putative', 'By similarity', 'Fragments', 'Hypothetical', 'Unknown' and 'Possible'
Negative dataset sequences were mainly chosen from the bacterial proteomes, the virulent protein sequences of which are included in the positive dataset

positive controls

Not explicitly mentioned

negative controls

Not explicitly mentioned
