Phylogenetic trees were generated from ribosomal protein gene sequences exported from the strain database as an XMFA file containing each locus as an aligned block. ClonalFrame analysis was performed for single-genus datasets using ClonalFrame version 1.2 (Didelot & Falush, 2007) with default parameters. For larger datasets, up to the entire bacterial domain, the XMFA file was converted to an aligned concatenated sequence for neighbor-joining tree analysis using MEGA version 5 (Kumar et al., 2008), with ambiguous positions removed for each sequence pair. Split decomposition analysis was performed using SplitsTree version 4 (Huson & Bryant, 2006) for species-level datasets. Dendroscope (Huson et al., 2007) was used to visualize large trees.
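The conversion from per-locus XMFA blocks to a single concatenated alignment can be illustrated with a minimal Python sketch. It is not the script used in this study. It assumes that each block ends with a line beginning with "=", that each header begins with ">", and that the isolate identifier is the text between ">" and the first ":"; the actual header layout of the database export may differ. The function names and the toy sequences are hypothetical. The p-distance function only illustrates the idea of removing ambiguous positions for each sequence pair (pairwise deletion); the distance model used in MEGA is not stated here.

```python
def parse_xmfa(text):
    """Split XMFA text into blocks; each block maps isolate id -> aligned sequence."""
    blocks, current, name = [], {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.startswith("="):
            # "=" closes the current locus block
            if current:
                blocks.append(current)
            current, name = {}, None
        elif line.startswith(">"):
            # Assumed header layout: ">id:start-end + locus"
            name = line[1:].split(":")[0].strip()
            current[name] = ""
        elif name is not None:
            current[name] += line  # sequence may span several lines
    if current:
        blocks.append(current)
    return blocks


def concatenate(blocks, gap="-"):
    """Join the blocks in order; isolates missing a locus are padded with gaps."""
    names = []
    for block in blocks:
        for n in block:
            if n not in names:
                names.append(n)
    concat = {n: "" for n in names}
    for block in blocks:
        length = len(next(iter(block.values())))
        for n in names:
            concat[n] += block.get(n, gap * length)
    return concat


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, using only sites unambiguous in both sequences."""
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":  # pairwise deletion of gaps/ambiguities
            compared += 1
            differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy input with two loci and two isolates (made-up sequences).
EXAMPLE_XMFA = """>1:1-6 + rpsB
ATGAAA
>2:1-6 + rpsB
ATGAAG
=
>1:1-4 + rpsC
GGCC
>2:1-4 + rpsC
GG-C
=
"""

if __name__ == "__main__":
    alignment = concatenate(parse_xmfa(EXAMPLE_XMFA))
    for isolate, seq in alignment.items():
        print(f">{isolate}\n{seq}")
    print(p_distance(alignment["1"], alignment["2"]))
```

In this toy case the gapped site in the second locus is excluded only from the comparison of the pair that contains it, so nine of the ten concatenated sites contribute to the distance.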
To assess congruence, maximum-likelihood (ML) phylogenetic trees were constructed using PAUP* version 4 beta 10 (Swofford, 1998) on finished genomes from the entire Bacilli class (n=144). ML trees for ten ribosomal protein genes (rpsB, rpsC, rpsD, rpsE, rpsG, rpsI, rpsK, rpsL, rpsP and rpsT), between 400 and 1100 bp in length, were computed and compared using the Shimodaira-Hasegawa test, which determines whether the tree topologies differ significantly (differences in log likelihood, Δ-ln L). Randomisation tests were then performed (Holmes et al., 1999), in which the Δ-ln L values for each of the genes were compared to the equivalent values computed for 200 random trees created from each gene.
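The comparison step of the randomisation test can be sketched as follows. This is an illustrative outline only: the log likelihoods themselves would come from the ML software and are not computed here, all numerical values are made-up toy figures, the function names are hypothetical, and the 0.01 cut-off is an assumption rather than a threshold reported in this study.

```python
def delta_lnl(lnl_ml, lnl_alt):
    """Delta -ln L: log likelihood lost when a gene is fitted to an alternative topology."""
    return lnl_ml - lnl_alt


def fraction_random_at_or_below(observed_delta, random_deltas):
    """Share of random trees that fit the gene at least as well as the competing tree."""
    if not random_deltas:
        raise ValueError("need at least one random tree")
    hits = sum(1 for d in random_deltas if d <= observed_delta)
    return hits / len(random_deltas)


def more_congruent_than_random(observed_delta, random_deltas, alpha=0.01):
    """True if fewer than a fraction alpha of random trees match the competing tree's fit.

    alpha is an assumed cut-off used for illustration only.
    """
    return fraction_random_at_or_below(observed_delta, random_deltas) < alpha


# Toy values: one gene's ML tree, one competing gene tree, and 200 random trees.
TOY_LNL_ML = -5000.0
TOY_LNL_COMPETING = -5120.5
TOY_RANDOM_DELTAS = [900.0 + 2.5 * i for i in range(200)]

if __name__ == "__main__":
    observed = delta_lnl(TOY_LNL_ML, TOY_LNL_COMPETING)
    print(observed)
    print(fraction_random_at_or_below(observed, TOY_RANDOM_DELTAS))
    print(more_congruent_than_random(observed, TOY_RANDOM_DELTAS))
```

A competing topology whose Δ-ln L falls below the values obtained for essentially all random trees is more similar to the gene's own ML tree than expected by chance, even if the Shimodaira-Hasegawa test reports a significant difference between the two.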