Multispecies Ortholog Identification Algorithms

The MultiParanoid algorithm [10] is an extension of the
graph-based InParanoid clustering algorithm [11], [42] for
identifying orthologs and inparalogs across multiple species.
InParanoid uses bi-directional best BLAST hits [9], [43] to
identify putative orthologs and a clustering algorithm to identify their
inparalogs. To do so, InParanoid assumes that any sequences
from the same species that are more similar to the predicted ortholog than to
any sequence from other species are inparalogs [11], [42].
MultiParanoid generates multi-species orthogroups by
merging all pairwise InParanoid predictions while minimizing
the number of internal conflicts. Furthermore, the algorithm uses a
‘cut-off’ parameter, based on the distance of candidate inparalogs to
the predicted target ortholog, to filter out weakly supported candidates.
MultiParanoid was obtained from http://multiparanoid.sbc.su.se and InParanoid (version 3beta) was obtained upon request from inparanoid@sbc.su.se.
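
The inparalog rule described above can be illustrated with a minimal sketch. It is not InParanoid's code: the function names, the toy score table, and the use of a single pairwise similarity score are all assumptions made for illustration. A sequence is kept as an inparalog when it scores higher against the seed ortholog from its own species than against any sequence from the other species.

```python
# Hypothetical toy similarity scores (for example, BLAST bit scores).
# Sequences a1..a3 belong to species a; b1 and b2 belong to species b.
TOY_SCORES = {
    ("a1", "a2"): 90.0,
    ("a1", "a3"): 40.0,
    ("a1", "b1"): 80.0,
    ("a2", "b1"): 70.0,
    ("a3", "b1"): 30.0,
    ("a3", "b2"): 60.0,
}


def toy_score(p, q):
    """Symmetric lookup in the toy table; unlisted pairs score 0."""
    return TOY_SCORES.get((p, q), TOY_SCORES.get((q, p), 0.0))


def find_inparalogs(seed, same_species, other_species, score):
    """Return sequences from the seed's species that are more similar to
    the seed ortholog than to any sequence from the other species."""
    inparalogs = []
    for candidate in same_species:
        if candidate == seed:
            continue
        to_seed = score(candidate, seed)
        best_other = max((score(candidate, o) for o in other_species), default=0.0)
        if to_seed > best_other:
            inparalogs.append(candidate)
    return inparalogs


print(find_inparalogs("a1", ["a1", "a2", "a3"], ["b1", "b2"], toy_score))
# prints ['a2']
```

Here a2 is retained because it is closer to the seed a1 (90) than to any species b sequence (70), whereas a3 is closer to b2 (60) than to a1 (40) and is rejected.
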
The OrthoMCL algorithm also builds upon the InParanoid algorithm [11],
[42] by
using the Markov Cluster (MCL) algorithm to predict orthogroups across
multiple species based on their sequence similarity information [3]. The algorithm
uses an ‘inflation rate’ parameter to regulate the
‘tightness’ of the predicted orthogroups. OrthoMCL (version
1.4) was obtained from http://orthomcl.org/common/downloads/software/v1.4/.
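
To show how an inflation parameter can control cluster tightness, the following is a toy, pure-Python sketch of Markov clustering on a small similarity matrix. It is an illustration only and not the MCL program that OrthoMCL calls: the function names, the added self-loops, the fixed iteration count, and the threshold used to read off clusters are assumptions. Each round alternates expansion (matrix squaring, which spreads flow) with inflation (element-wise powers followed by column normalization, which strengthens strong flows and weakens weak ones).

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        col_sum = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / col_sum if col_sum else 0.0
    return out


def expand(m):
    """Expansion step: square the matrix."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, inflation):
    """Inflation step: raise entries to a power, then renormalize columns."""
    return normalize_columns([[v ** inflation for v in row] for row in m])


def toy_mcl(similarity, inflation=2.0, iterations=50):
    """Cluster node indices from a symmetric similarity matrix."""
    n = len(similarity)
    # Add self-loops, a common convention in Markov clustering.
    m = [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    # Each row that still carries flow lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Two groups of sequences joined by one weak similarity (nodes 2 and 3).
toy_similarity = [
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
print(toy_mcl(toy_similarity, inflation=2.0))
```

In this kind of scheme, larger inflation values tend to produce smaller, tighter clusters, and values closer to 1 tend to produce larger, looser ones.
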
The Reciprocal Best Hit (RBH) algorithm [4], [6], [12], [13] relies on BLAST [9], [43] to
identify pairwise orthologs between two species. According to the RBH algorithm,
two proteins X and Y from species
x and y, respectively, are considered
orthologs if protein X is the best BLAST hit for protein
Y and protein Y is the best BLAST hit for
protein X. We integrated a ‘filtering’ parameter
r that enabled us to avoid constructing orthogroups that
contained distant homologs by considering the degree by which the two proteins
differed in sequence length or BLAST alignment [44], [45]. Thus, putative
orthogroups are retained if:
From the above equation, it follows that r values close to 1 are
likely to filter out a larger number of putative orthologs, whereas
r values close to 0 are likely to include all putative
orthologs. The default mode of the algorithm does not use the filtering
parameter r.
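
The reciprocity test at the core of RBH, in its default mode without the filtering parameter r, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names and toy hit lists are hypothetical, and the hits are assumed to have been parsed already into (query, subject, score) tuples from BLAST output.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits is an iterable of (query, subject, score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(x_vs_y, y_vs_x):
    """Return (X, Y) pairs in which each protein is the other's best hit."""
    forward = best_hits(x_vs_y)
    reverse = best_hits(y_vs_x)
    return sorted((x, y) for x, y in forward.items() if reverse.get(y) == x)


# Hypothetical hits: species x proteins searched against species y, and back.
x_vs_y = [("X1", "Y1", 200), ("X1", "Y2", 150), ("X2", "Y2", 180), ("X3", "Y1", 90)]
y_vs_x = [("Y1", "X1", 195), ("Y1", "X3", 85), ("Y2", "X2", 175), ("Y2", "X1", 140)]

print(reciprocal_best_hits(x_vs_y, y_vs_x))
# prints [('X1', 'Y1'), ('X2', 'Y2')]
```

X3 is not paired with Y1 because, although Y1 is the best hit for X3, the best hit for Y1 is X1.
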
The Reciprocal Smallest Distance (RSD) algorithm [14] generates global sequence
alignments for a small number of top BLAST hits against a query gene
X from species x. RSD then calculates the
maximum likelihood evolutionary distance between X and its top
BLAST hits, identifying the gene with the smallest evolutionary distance from
X (e.g., gene Y from species
y). If the RSD search using gene Y from
species y as the query also identifies gene X from species x as its closest relative, then proteins
X and Y are considered orthologs [14], [15]. In RSD,
the user can modify the shape parameter a of the gamma
distribution, a key determinant of the estimated evolutionary distance between
genes. The RSD algorithm was obtained from http://roundup.hms.harvard.edu/site/.
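
The reciprocal step of RSD can be sketched in the same spirit. In this illustration the global alignment and maximum likelihood distance estimation are replaced by a lookup in a hypothetical distance table, so the function names, the table, and its values are assumptions and not part of the RSD software; only the selection logic follows the description above.

```python
# Hypothetical precomputed evolutionary distances between gene pairs.
# In RSD these would come from global alignments and maximum likelihood
# estimation; here they are made-up numbers for illustration.
TOY_DISTANCES = {
    ("X1", "Y1"): 0.12,
    ("X1", "Y2"): 0.35,
    ("X2", "Y1"): 0.30,
    ("X2", "Y2"): 0.50,
}


def toy_distance(p, q):
    """Symmetric lookup in the toy distance table."""
    if (p, q) in TOY_DISTANCES:
        return TOY_DISTANCES[(p, q)]
    return TOY_DISTANCES[(q, p)]


def closest_by_distance(query, candidates, distance):
    """Return the candidate with the smallest distance to the query."""
    if not candidates:
        return None
    return min(candidates, key=lambda c: distance(query, c))


def reciprocal_smallest_distance(gene_x, top_hits_in_y, top_hits_in_x, distance):
    """Return the ortholog of gene_x in species y, or None.

    top_hits_in_y lists the top BLAST hits of gene_x in species y.
    top_hits_in_x maps each species y gene to its top BLAST hits in species x.
    """
    gene_y = closest_by_distance(gene_x, top_hits_in_y, distance)
    if gene_y is None:
        return None
    back = closest_by_distance(gene_y, top_hits_in_x.get(gene_y, []), distance)
    return gene_y if back == gene_x else None


hits_back = {"Y1": ["X1", "X2"], "Y2": ["X1", "X2"]}
print(reciprocal_smallest_distance("X1", ["Y1", "Y2"], hits_back, toy_distance))
# prints Y1
print(reciprocal_smallest_distance("X2", ["Y1", "Y2"], hits_back, toy_distance))
# prints None
```

X2 has no ortholog in this toy example because its closest gene, Y1, is itself closer to X1.
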

Salichos L, & Rokas A. (2011). Evaluating Ortholog Prediction Algorithms in a Yeast Model Clade. PLoS ONE, 6(4), e18755.

Publication 2011

Other organizations : Vanderbilt University

Protocol cited in 7 other protocols

Variable analysis

independent variables

The 'cut-off' parameter based on the distance of candidate inparalogs to the predicted target ortholog to filter out weakly supported candidates in the MultiParanoid algorithm
The 'inflation rate' parameter to regulate the 'tightness' of the predicted orthogroups in the OrthoMCL algorithm
The 'filtering' parameter 'r' to avoid constructing orthogroups that contained distant homologs in the Reciprocal Best Hit (RBH) algorithm
The shape parameter 'a' of the gamma distribution, a key determinant of the estimated evolutionary distance between genes in the Reciprocal Smallest Distance (RSD) algorithm

dependent variables

Identification of orthologs and inparalogs across multiple species using the MultiParanoid algorithm
Prediction of orthogroups across multiple species based on their sequence similarity information using the OrthoMCL algorithm
Identification of pairwise orthologs between two species using the Reciprocal Best Hit (RBH) algorithm
Identification of orthologs based on the maximum likelihood evolutionary distance between genes using the Reciprocal Smallest Distance (RSD) algorithm

control variables

Not explicitly mentioned

controls

Not specified
