In gene order comparisons, it is necessary to work with blocks of genes conserved in two or more genomes. Working with one gene at a time is not a robust procedure, especially with flowering plants, because most of these genomes have a whole genome duplication (WGD) in their history. The fractionation process that follows a WGD deletes duplicate genes in a partially random pattern from one or the other duplicate (homeologous) chromosome, and does so independently in each of the descendants of a duplicated genome [4]. This pattern, together with the possibility that some genes transpose to different positions in the genome, makes it hard to identify unambiguously orthologous genes that are in the same gene order in two genomes. In contrast, a set of five or ten genes in the same order in two genomes, with few intervening genes, can be confidently identified as a conserved syntenic block [5, 6].
However, the notion of block adjacency encounters a number of operational problems. The genes in a syntenic block in one genome may differ somewhat from those in the same block in the other genome. The minimum number of genes needed to establish a block is a parameter that must be determined empirically, as is the number of genes allowed to intervene between two successive pairs of orthologs within a block in the two genomes. We avoid these practical problems in our simulations by excluding fractionation and other gene loss, duplication, and small transpositions from our model.
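To make the two block-calling parameters concrete, the following is a minimal Python sketch of one simple greedy way to call collinear blocks between two gene orders. It is an illustration only, not the procedure used in this work or in any particular synteny tool. The function name `syntenic_blocks`, the defaults `min_genes=5` and `max_gap=2`, the single-chromosome input, and the convention that genes sharing an identifier are orthologs are all assumptions made for the example.

```python
def syntenic_blocks(genome_a, genome_b, min_genes=5, max_gap=2):
    """Greedy scan for collinear runs of shared genes (illustrative only).

    genome_a, genome_b: lists of gene identifiers in chromosomal order.
    Assumption: a gene present in both lists is an ortholog pair, and
    each identifier occurs at most once per genome.
    Returns a list of blocks; each block is a list of (pos_a, pos_b).
    """
    pos_b = {gene: j for j, gene in enumerate(genome_b)}
    # Anchors: positions of every shared gene, ordered along genome A.
    anchors = [(i, pos_b[g]) for i, g in enumerate(genome_a) if g in pos_b]

    blocks, current, direction = [], [], 0
    for anchor in anchors:
        if current:
            da = anchor[0] - current[-1][0]   # step along genome A
            db = anchor[1] - current[-1][1]   # step along genome B
            step = 1 if db > 0 else -1        # +1 forward, -1 inverted
            # At most max_gap intervening genes in each genome.
            close = da - 1 <= max_gap and abs(db) - 1 <= max_gap
            if close and db != 0 and direction in (0, step):
                current.append(anchor)
                direction = step
                continue
            # The run is broken: keep it only if it is long enough.
            if len(current) >= min_genes:
                blocks.append(current)
        current, direction = [anchor], 0
    if len(current) >= min_genes:
        blocks.append(current)
    return blocks


# Toy data: five genes collinear with one inserted gene in B,
# followed by five genes in inverted order.
genome_a = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
genome_b = ["a", "b", "x", "c", "d", "e", "y", "z", "w", "v",
            "j", "i", "h", "g", "f"]

for block in syntenic_blocks(genome_a, genome_b):
    print(block)
```

On this toy input the default settings recover two blocks, one in direct and one in inverted orientation. Setting `max_gap=0` loses the first block because of the single intervening gene, and `min_genes=6` loses both, which illustrates why these thresholds have to be tuned empirically on real data.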