For the phylogenetic reconstructions of Wolbachia and Cardinium, we downloaded representative genomes belonging to different groups from NCBI (Table S8). For Wolbachia, Anaplasma phagocytophilum strain HZ and Ehrlichia canis strain Jake were used as outgroups; for Cardinium, Amoebophilus asiaticus strain 5a2 was used. Ribosomal proteins were extracted, and the universally conserved ones were selected. The protein sequences were aligned individually using MAFFT (v7.453; ‘--localpair’ ‘--maxiterate 1000’) [50]. Divergent and ambiguously aligned blocks were removed using Gblocks (v0.91b) [51]. The alignments were then concatenated using custom Perl scripts, and Bayesian inference was performed with MrBayes (v3.2.7) [52] under the JTT+I+G4 substitution model. We ran two independent analyses with four chains each for 300 000 generations and checked for convergence (convergence diagnostic ≤ 0.01).
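The concatenation step can be illustrated with a minimal Python sketch. This is not the custom script used in the study, which is not shown here; the function names `parse_fasta` and `concatenate_alignments`, the taxon labels, and the choice to pad missing taxa with gaps are all assumptions made for illustration.

```python
def parse_fasta(text):
    """Parse FASTA text into a {taxon: sequence} dict (insertion order kept)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the taxon label.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def concatenate_alignments(alignments):
    """Join per-protein alignments into one supermatrix.

    alignments: list of {taxon: aligned sequence} dicts, one per protein.
    Taxa absent from an alignment are padded with gaps (an assumption of
    this sketch). Returns (supermatrix, partitions), where partitions holds
    1-based inclusive (start, end) column ranges for each protein.
    """
    taxa = []
    for aln in alignments:
        for taxon in aln:
            if taxon not in taxa:
                taxa.append(taxon)

    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    offset = 0
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in an alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            pieces[taxon].append(aln.get(taxon, "-" * width))
        partitions.append((offset + 1, offset + width))
        offset += width

    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


# Hypothetical toy input: two short protein alignments for two taxa.
protein_1 = parse_fasta(">endosymbiont\nMK-LV\n>outgroup\nMKALV\n")
protein_2 = parse_fasta(">endosymbiont\nGT-\n>outgroup\nGTS\n")
matrix, parts = concatenate_alignments([protein_1, protein_2])
```

Keeping the column ranges alongside the supermatrix allows a partitioned analysis later, although the analysis described above applied a single substitution model to the concatenated alignment.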
For Rickettsiaceae, we also downloaded representative genomes from NCBI and used Megaira sp. strain MegNEIS296, Megaira sp. strain MegCarteria, Occidentia massiliensis strain Os18, Orientia tsutsugamushi strain Boryong and Orientia chuto strain Dubai as outgroups (Table S8). We extracted ribosomal proteins and selected the universally conserved ones. The protein sequences were aligned individually using MUSCLE (v3.8.31) [53], and the alignments were concatenated using AMAS [54]. The phylogeny was calculated with IQ-TREE 2 (v2.1.2; ‘-bnni’ ‘-alrt 1000’ ‘-m TESTNEW’ ‘--madd LG4X’ ‘-bb 1000’) using the JTTDCMut+F+R3 substitution model [55]. To determine whether the endosymbionts belong to an existing clade or represent a new one, we calculated the average amino acid identity (AAI) and average nucleotide identity (ANI) between the endosymbiont and selected representative genomes using the method and thresholds described previously [56]. We did not exclude transposase genes from the ANI and AAI calculations. While a large number of transposase genes might influence the ANI value, the AAI is regarded as unbiased because only homologous genes are used for the calculation.
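The reasoning about AAI can be made concrete with a small Python sketch: identity is averaged only over pairs of homologous genes, so genes without a homologue in the other genome do not enter the value. This is a simplified illustration, not the method or thresholds of [56]; the function names and the assumption that homologous pairs arrive already aligned are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gapped columns are skipped (an assumption of this sketch)
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def average_identity(homologous_pairs):
    """Mean percent identity across aligned homologous gene pairs.

    homologous_pairs: iterable of (aligned_seq_a, aligned_seq_b) tuples.
    Genes with no homologue are simply not in this list, which is why
    they cannot shift the average.
    """
    values = [percent_identity(a, b) for a, b in homologous_pairs]
    if not values:
        raise ValueError("at least one homologous pair is required")
    return sum(values) / len(values)


# Hypothetical toy input: two aligned homologous protein pairs.
pairs = [("MKAL", "MKAV"), ("GT-S", "GTAS")]
aai = average_identity(pairs)  # 87.5 for this toy input
```

The same averaging applies to nucleotide sequences for an ANI-style value, although real ANI tools typically work on genome fragments rather than on predefined gene pairs.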