Identifying Core Genome Orthologs in E. coli/Shigella

A preliminary set of orthologs was defined by identifying unique pairwise reciprocal best hits with at least 80% similarity (∼85% identity) in amino acid sequence and less than 20% difference in protein length. Orthology was assessed for every pair of E. coli/Shigella genomes. The core genome, consisting of the genes found in all strains of the species, was defined as the intersection of the pairwise lists.
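The reciprocal-best-hit step and the core-genome intersection can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Hit` record, the function names, the normalisation of length difference by the longer protein, the treatment of "unique" as "no tie for the best score", and the use of a single reference genome for the intersection are all assumptions made for the example.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One alignment hit (hypothetical record layout)."""
    query: str
    subject: str
    similarity: float  # percent amino acid similarity
    query_len: int
    subject_len: int
    score: float


def passes_filters(hit, min_similarity=80.0, max_len_diff=0.20):
    # Assumption: length difference is taken relative to the longer protein.
    longer = max(hit.query_len, hit.subject_len)
    len_diff = abs(hit.query_len - hit.subject_len) / longer
    return hit.similarity >= min_similarity and len_diff <= max_len_diff


def best_hits(hits):
    """Map each query to its best-scoring subject; drop queries whose best score is tied."""
    best = {}
    tied = set()
    for h in hits:
        if not passes_filters(h):
            continue
        cur = best.get(h.query)
        if cur is None or h.score > cur.score:
            best[h.query] = h
            tied.discard(h.query)
        elif h.score == cur.score and h.subject != cur.subject:
            tied.add(h.query)
    return {q: h.subject for q, h in best.items() if q not in tied}


def reciprocal_best_hits(hits_ab, hits_ba):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    ab = best_hits(hits_ab)
    ba = best_hits(hits_ba)
    return {(a, b) for a, b in ab.items() if ba.get(b) == a}


def core_genes(pairwise_lists):
    """Reference genes that have an ortholog in every other genome.

    pairwise_lists holds one set of (reference_gene, other_gene) pairs
    per non-reference genome (a simplification of the all-pairs design).
    """
    gene_sets = [{ref for ref, _ in pairs} for pairs in pairwise_lists]
    return set.intersection(*gene_sets) if gene_sets else set()
```

Here the thresholds are applied before the best hit is chosen, so a gene whose only strong match fails the length filter ends up with no ortholog rather than a weaker one being promoted silently.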
For every pair of genomes, this list of persistent orthologs was then supplemented by taking conservation of gene order into account. Because (i) few rearrangements are observed at these short evolutionary distances and (ii) horizontal gene transfer is frequent, genes outside conserved blocks of synteny are likely to be xenologs or paralogs. Hence, to determine positional orthology, we combined the homology analysis (protein sequence similarity ≥80%, ≤20% difference in protein length) with the classification of these genes as either syntenic or nonsyntenic. The analysis was made for every pair of E. coli/Shigella genomes. The definitive list of orthologs of the pan-genome was then defined as the union of the pairwise lists.
A syntenic block was defined as a set of consecutive pairs of genes in the core genome. Blocks of conserved gene order were obtained by comparing the locations of bidirectional best hit pairs in the core genome, using a window size of one gap.
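One possible reading of the block definition is sketched below. It assumes each orthologous pair is given as integer gene ranks `(pos_a, pos_b)` along the two genomes, and it interprets "a window size of one gap" as allowing at most one intervening gene between consecutive pairs in either genome. The function name and this interpretation are assumptions; the sketch also ignores chromosome circularity and does not require a consistent orientation within a block.

```python
def syntenic_blocks(pairs, max_gap=1):
    """Group orthologous pairs into blocks of conserved gene order.

    pairs: iterable of (pos_a, pos_b) gene ranks in genome A and genome B.
    Two consecutive pairs (ordered along genome A) join the same block when
    they are separated by at most max_gap genes in both genomes.
    """
    blocks = []
    for pair in sorted(pairs):
        if blocks:
            prev = blocks[-1][-1]
            close_a = pair[0] - prev[0] <= max_gap + 1
            # abs() lets inverted segments still count as neighbours in genome B
            close_b = 0 < abs(pair[1] - prev[1]) <= max_gap + 1
            if close_a and close_b:
                blocks[-1].append(pair)
                continue
        blocks.append([pair])
    return blocks


def classify(pairs, max_gap=1):
    """Label each pair as syntenic (block of two or more pairs) or nonsyntenic."""
    labels = {}
    for block in syntenic_blocks(pairs, max_gap):
        label = "syntenic" if len(block) >= 2 else "nonsyntenic"
        for pair in block:
            labels[pair] = label
    return labels
```

Treating singleton blocks as nonsyntenic is a choice made for this example; the source does not state a minimum block size.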
These lists were also used to compute gene accumulation curves in R, which describe the number of new genes and of genes in common as genomes are added to the comparison (Figure 1). The procedure was repeated 1000 times, randomly permuting the order in which genomes were added, to obtain medians and quartiles.
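The permutation procedure can be illustrated with a short sketch. The original analysis was done in R; Python is used here only for illustration. The sketch assumes each genome is represented as a set of ortholog family identifiers, and the function names and input layout are hypothetical.

```python
import random
import statistics


def summarise(runs):
    """Per-step (first quartile, median, third quartile) across all permutations."""
    return [tuple(statistics.quantiles(step, n=4)) for step in zip(*runs)]


def accumulation_curves(genomes, n_permutations=1000, seed=0):
    """Return quartile summaries of genes in common and of new genes per added genome.

    genomes: dict mapping genome name -> set of gene family identifiers.
    n_permutations must be at least 2 so that quartiles can be computed.
    """
    rng = random.Random(seed)
    names = list(genomes)
    common_runs, new_runs = [], []
    for _ in range(n_permutations):
        order = names[:]
        rng.shuffle(order)  # random genome insertion order
        common = set(genomes[order[0]])
        seen = set(genomes[order[0]])
        common_sizes, new_sizes = [len(common)], [len(seen)]
        for name in order[1:]:
            new_sizes.append(len(genomes[name] - seen))  # genes not seen so far
            common &= genomes[name]                      # genes shared by all so far
            seen |= genomes[name]
            common_sizes.append(len(common))
        common_runs.append(common_sizes)
        new_runs.append(new_sizes)
    return summarise(common_runs), summarise(new_runs)
```

The count of genes in common after the last genome does not depend on insertion order, so its quartiles collapse to a single value; the spread at earlier steps is what the permutations are meant to capture.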

Touchon M., Hoede C., Tenaillon O., Barbe V., Baeriswyl S., Bidet P., Bingen E., Bonacorsi S., Bouchier C., Bouvet O., Calteau A., Chiapello H., Clermont O., Cruveiller S., Danchin A., Diard M., Dossat C., Karoui M.E., Frapy E., Garry L., Ghigo J.M., Gilles A.M., Johnson J., Le Bouguénec C., Lescat M., Mangenot S., Martinez-Jéhanne V., Matic I., Nassif X., Oztas S., Petit M.A., Pichon C., Rouy Z., Ruf C.S., Schneider D., Tourret J., Vacherie B., Vallenet D., Médigue C., Rocha E.P, & Denamur E. (2009). Organised Genome Dynamics in the Escherichia coli Species Results in Highly Diverse Adaptive Paths. PLoS Genetics, 5(1), e1000344.

Publication 2009

Attention Conserved synteny Evolutionary Gene order Genes Genome Horizontal gene transfer Protein Protein sequence Rearrangements Shigella Strains Syntenic

Other organizations : Sorbonne Université, Institut Pasteur, Centre National de la Recherche Scientifique, Délégation Paris 7, Inserm, Université Paris Cité, Genoscope, Commissariat à l'Énergie Atomique et aux Énergies Alternatives, Délégation Paris 5, Hôpital Robert-Debré, Assistance Publique – Hôpitaux de Paris, Mathématiques et Informatique Appliquées du Génome à l'Environnement, University of Minnesota, Veterans Health Administration, Université Joseph Fourier, Université Grenoble Alpes

Protocol cited in 14 other protocols

Variable analysis

independent variables

Percentage of similarity in amino acid sequence (≥80%)
Difference in protein length (≤20%)
Syntenic or non-syntenic classification of genes

dependent variables

Identification of unique pairwise reciprocal best hits (orthologs)
Composition of the core genome (genes ubiquitously found among all strains)
Composition of the pan-genome (definitive list of orthologs)

control variables

Evolutionary distance between E. coli and Shigella genomes (short)
Frequency of horizontal gene transfer

controls

Positive control: Pairwise reciprocal best hits with ≥80% similarity and ≤20% difference in protein length
Negative control: Genes outside conserved blocks of synteny (likely xenologs or paralogs)
