We reconstructed the complete collection of phylogenetic trees, known as the phylome, for all A. pisum protein-coding genes with homologs in other sequenced insect genomes. For this we used an automated pipeline similar to that described earlier for the human genome [43]. A database was created containing the pea aphid proteome and those of 16 other species. These include 12 other insects (Tribolium castaneum, Nasonia vitripennis, Apis mellifera [from the NCBI database], Drosophila pseudoobscura, Drosophila melanogaster, Drosophila mojavensis, Drosophila yakuba [from FlyBase], Pediculus humanus, Culex pipiens [from VectorBase], Anopheles gambiae, Aedes aegypti [from Ensembl], and Bombyx mori [from SILKDB]) and four outgroups (the crustacean Daphnia pulex [the GNOMON predicted set provided by the JGI], the nematode Caenorhabditis elegans, and two chordates, Ciona intestinalis and Homo sapiens [from Ensembl]). For each protein encoded in the pea aphid genome, a Smith-Waterman [106] search (E-value cutoff 10⁻³) was performed against the above-mentioned proteomes. Sequences that aligned over a continuous region longer than 50% of the query sequence were selected and aligned using MUSCLE 3.6 [107] with default parameters. Gappy positions were removed using trimAl v1.0 (http://trimal.cgenomics.org), using a gap threshold of 25% and a conservation threshold of 50%. Phylogenetic trees were estimated by Neighbor Joining (NJ), using scoredist distances as implemented in BioNJ [108], and by maximum likelihood (ML), as implemented in PhyML v2.4.4 [105], using JTT as the evolutionary model and assuming a discrete gamma-distribution model with four rate categories and invariant sites; the gamma shape parameter and the fraction of invariant sites were estimated from the data. Support for the different partitions was computed with the approximate likelihood ratio test (aLRT) as implemented in PhyML [109].
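The homolog-selection and gap-trimming criteria described above can be illustrated with a minimal Python sketch. It is not the pipeline's code and does not call the Smith-Waterman search or trimAl; the names `Hit`, `select_homologs`, and `trim_columns` are hypothetical. It also assumes one particular reading of the thresholds: a hit passes if its E-value is at most 10⁻³ and its aligned region covers more than 50% of the query; a column is kept if at least 25% of sequences have a residue there; and at least 50% of the original columns are retained. trimAl's own handling of these thresholds may differ in detail.

```python
import math
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    """One pairwise search hit (hypothetical record, not a real tool's format)."""
    subject_id: str
    evalue: float
    query_start: int  # 1-based, inclusive
    query_end: int    # 1-based, inclusive


def select_homologs(hits: List[Hit], query_length: int,
                    max_evalue: float = 1e-3,
                    min_coverage: float = 0.5) -> List[str]:
    """Keep hits that pass the E-value cutoff and whose continuous aligned
    region spans more than min_coverage of the query sequence."""
    selected = []
    for hit in hits:
        aligned = hit.query_end - hit.query_start + 1
        if hit.evalue <= max_evalue and aligned / query_length > min_coverage:
            selected.append(hit.subject_id)
    return selected


def trim_columns(alignment: Dict[str, str],
                 min_nongap_fraction: float = 0.25,
                 min_columns_kept: float = 0.5) -> Dict[str, str]:
    """Drop gappy columns from an alignment given as {name: aligned sequence}.

    Assumed interpretation: a column survives if at least min_nongap_fraction
    of the sequences have a residue in it; if fewer than min_columns_kept of
    the original columns survive, the least gappy removed columns are restored.
    """
    names = list(alignment)
    n_cols = len(alignment[names[0]])
    # Fraction of sequences carrying a residue (not '-') in each column.
    nongap = [sum(alignment[n][c] != "-" for n in names) / len(names)
              for c in range(n_cols)]
    keep = {c for c in range(n_cols) if nongap[c] >= min_nongap_fraction}
    needed = math.ceil(min_columns_kept * n_cols)
    if len(keep) < needed:
        removed = sorted((c for c in range(n_cols) if c not in keep),
                         key=lambda c: (-nongap[c], c))
        keep.update(removed[:needed - len(keep)])
    cols = sorted(keep)
    return {n: "".join(alignment[n][c] for c in cols) for n in names}
```

For example, with a query of length 100, a hit aligned over positions 1 to 80 with an E-value of 1e-10 would be selected, whereas a hit covering only positions 1 to 40, or one with an E-value of 0.5, would be discarded.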
All trees and alignments have been deposited in PhylomeDB [110] (http://phylomedb.org). Additional details for this analysis can be found in [110].