The nucleotide distance between two quasispecies [2], X and Y, may be estimated by:

$$D_{XY} = \sum_{i \in X} \sum_{j \in Y} p_i \, d_{ij} \, q_j$$
where $p_i$ and $q_j$ are the proportions of the i-th haplotype in quasispecies X and of the j-th haplotype in quasispecies Y, respectively, and $d_{ij}$ is the genetic distance between the two haplotypes. The sum extends over all haplotypes in both quasispecies. This distance is interpreted as the average number of nucleotide substitutions between a read from quasispecies X and a read from quasispecies Y.
The nucleotide diversity of each quasispecies [2], $D_X$ and $D_Y$, is the average number of nucleotide substitutions for a random pair of reads within that quasispecies, and may be estimated by:

$$D_X = \frac{N_X}{N_X - 1} \sum_{i \in X} \sum_{j \in X} p_i \, d_{ij} \, p_j$$

$$D_Y = \frac{N_Y}{N_Y - 1} \sum_{i \in Y} \sum_{j \in Y} q_i \, d_{ij} \, q_j$$

where $N_X$ and $N_Y$ are the number of reads in each quasispecies. Taking these diversities into account, the net number of nucleotide substitutions between the two quasispecies [2] is estimated by:

$$D_A = D_{XY} - \frac{D_X + D_Y}{2}$$
$D_A$ will be taken as the genetic distance between two quasispecies.
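The estimators above can be illustrated with a minimal Python sketch. It assumes, for illustration only, that both quasispecies are described by read counts over a common, shared list of haplotypes (a count of zero where a haplotype is absent) and that a single symmetric distance matrix `d` covers that list. The function names `weighted_distance` and `net_distance` are hypothetical and do not come from the original work.

```python
def weighted_distance(p, q, d):
    """Return sum_i sum_j p[i] * d[i][j] * q[j] for frequency vectors p and q."""
    return sum(p[i] * d[i][j] * q[j]
               for i in range(len(p))
               for j in range(len(q)))


def net_distance(counts_x, counts_y, d):
    """Estimate D_A = D_XY - (D_X + D_Y) / 2 from haplotype read counts.

    counts_x, counts_y: read counts per haplotype, indexed over the same
    (assumed) shared haplotype list; d: matrix of pairwise distances.
    """
    n_x, n_y = sum(counts_x), sum(counts_y)
    p = [c / n_x for c in counts_x]   # haplotype proportions in X
    q = [c / n_y for c in counts_y]   # haplotype proportions in Y

    d_xy = weighted_distance(p, q, d)                    # between-quasispecies distance
    d_x = n_x / (n_x - 1) * weighted_distance(p, p, d)   # diversity within X
    d_y = n_y / (n_y - 1) * weighted_distance(q, q, d)   # diversity within Y
    return d_xy - (d_x + d_y) / 2


# Two quasispecies, each made of a single distinct haplotype one substitution apart:
# D_XY = 1 and D_X = D_Y = 0, so D_A = 1.
print(net_distance([10, 0], [0, 10], [[0, 1], [1, 0]]))  # 1.0
```

Note that with this estimator two samples drawn from the same haplotype distribution can give a slightly negative $D_A$, because the $N/(N-1)$ correction applies to the within-quasispecies terms only.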
The quasispecies pairs are simulated such that every haplotype is considered to differ by a single substitution from the master haplotype of the first quasispecies. With the master haplotype indexed as 1, the matrix of distances between all pairs of haplotypes in both quasispecies has the form:

$$D: \quad d_{ij} = \begin{cases} 0, & i = j \\ 1, & i = 1 \text{ and } j > 1 \\ 1, & j = 1 \text{ and } i > 1 \\ 2, & \text{otherwise} \end{cases}$$
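As a rough illustration of this distance structure, the following Python sketch builds such a matrix for `n` haplotypes. The function name `star_distance_matrix` is hypothetical, and the code uses 0-based indexing, so index 0 plays the role of the master haplotype (haplotype 1 in the text).

```python
def star_distance_matrix(n):
    """Build an n x n matrix: 0 on the diagonal, 1 between the master
    (index 0) and any other haplotype, 2 between two non-master haplotypes."""
    d = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                d[i][j] = 0
            elif i == 0 or j == 0:
                d[i][j] = 1   # one substitution away from the master
            else:
                d[i][j] = 2   # two distinct single-substitution variants
    return d


for row in star_distance_matrix(4):
    print(row)
# [0, 1, 1, 1]
# [1, 0, 2, 2]
# [1, 2, 0, 2]
# [1, 2, 2, 0]
```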