Each node of each tree was classified by comparison with the known evolutionary relationships of the animals. For example, if a gene cluster tree contains exactly four members, one from each animal, then the parsimonious inference is that no gene duplication occurred. A similar cluster in which one member is missing is scored as a gene loss in a single lineage. Duplications specific to an individual lineage appear as two genes from the same animal clustered together. A gene duplication that occurred before the split of fish from tetrapods appears as a duplicated copy of the species tree for the lineages that diverged after the split from Ciona; similarly, a gene duplication that occurred in tetrapods but before the split of the mouse and human lineages appears as a duplicated mouse–human group. Combinations of these events, such as a duplication in tetrapods followed by a loss in one of the tetrapod lineages, are also seen and scored (see Figure 3).
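The node classification described above can be illustrated with a small sketch. It uses the species-overlap heuristic (a node is called a duplication when the same animal appears on both sides of it), which is a simplification chosen for illustration and not necessarily the procedure used in this study. The tree encoding, the function names, the species label "fish", and the gene identifiers are all hypothetical.

```python
# Illustrative sketch only: species-overlap classification of gene tree nodes.
# A leaf is a (species, gene_id) tuple; an internal node is a pair of subtrees.

# Assumed species labels for the four animals compared (labels are hypothetical).
EXPECTED_SPECIES = frozenset({"ciona", "fish", "mouse", "human"})


def is_leaf(node):
    """A leaf holds a species name (a string) in its first position."""
    return isinstance(node[0], str)


def species_under(node):
    """Return the set of species represented below a node."""
    if is_leaf(node):
        return frozenset([node[0]])
    left, right = node
    return species_under(left) | species_under(right)


def classify_nodes(node, events=None):
    """Label every internal node as 'duplication' or 'speciation'.

    Each event is (kind, sorted species below the node). For a duplication,
    the species list names the group in whose ancestor the event is placed.
    """
    if events is None:
        events = []
    if is_leaf(node):
        return events
    left, right = node
    left_species, right_species = species_under(left), species_under(right)
    kind = "duplication" if left_species & right_species else "speciation"
    events.append((kind, sorted(left_species | right_species)))
    classify_nodes(left, events)
    classify_nodes(right, events)
    return events


def missing_species(tree):
    """Species absent from the cluster, i.e. candidate gene losses."""
    return EXPECTED_SPECIES - species_under(tree)


# Toy cluster: a duplication in tetrapods before the mouse/human split.
tetrapod_dup = (
    ("ciona", "c1"),
    (
        ("fish", "f1"),
        (
            (("mouse", "m1"), ("human", "h1")),
            (("mouse", "m2"), ("human", "h2")),
        ),
    ),
)

duplications = [e for e in classify_nodes(tetrapod_dup) if e[0] == "duplication"]
```

For this toy tree the only duplication reported spans human and mouse, matching the "duplicated mouse–human group" case. A full analysis would reconcile each gene tree against the species tree, so that losses following a duplication are also placed on a specific lineage; this sketch only flags species that are absent from the whole cluster.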
The sorting of orthologous and paralogous relationships for each gene cluster provides an effective tool for improving inferences of gene function, because it allows annotations from well-studied genomes to be transferred to the orthologous genes of other species. Inferring function from orthology is expected to be more accurate than using sequence similarity alone, since the latter tends to incorrectly associate slowly evolving paralogs. We provide a web-based resource for this sorting at http://phigs.org/.
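As a minimal sketch of how such sorting could support annotation transfer, the example below treats two genes as orthologs when the node that joins them is a speciation (no animal is shared between its two sides) and as paralogs otherwise, then copies annotations only between orthologs. The tree encoding, function names, gene identifiers, and annotation strings are hypothetical and are not taken from the resource described here.

```python
# Illustrative sketch only: ortholog-restricted annotation transfer.
# A leaf is a (species, gene_id) tuple; an internal node is a pair of subtrees.
from itertools import product


def leaves(node):
    """Return all (species, gene_id) leaves below a node."""
    if isinstance(node[0], str):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def ortholog_pairs(node):
    """Gene pairs whose most recent common node is a speciation."""
    if isinstance(node[0], str):
        return []
    left, right = node
    left_leaves, right_leaves = leaves(left), leaves(right)
    pairs = ortholog_pairs(left) + ortholog_pairs(right)
    left_species = {species for species, _ in left_leaves}
    right_species = {species for species, _ in right_leaves}
    if not (left_species & right_species):  # no shared animal: speciation node
        pairs += [(a[1], b[1]) for a, b in product(left_leaves, right_leaves)]
    return pairs


def transfer_annotations(tree, annotations):
    """Suggest annotations for unannotated genes from their orthologs only."""
    inferred = {}
    for gene_a, gene_b in ortholog_pairs(tree):
        for source, target in ((gene_a, gene_b), (gene_b, gene_a)):
            if source in annotations and target not in annotations:
                inferred.setdefault(target, set()).add(annotations[source])
    return inferred


# Toy cluster: a duplication predating the mouse/human split.
cluster = (
    (("mouse", "m1"), ("human", "h1")),
    (("mouse", "m2"), ("human", "h2")),
)
# Hypothetical annotations known only for the human genes.
known = {"h1": "function A", "h2": "function B"}
suggested = transfer_annotations(cluster, known)
```

Here m1 receives only the annotation of h1, and m2 only that of h2. A similarity search alone could pair m1 with h2 if that paralog happened to evolve more slowly, which is the kind of misassociation that orthology-based inference is meant to avoid.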