The 61 genes included in the analyses of Goremykin et al. [2,3] and Leebens-Mack et al. [5] were extracted from our new chloroplast genome sequences of Vitis using the organellar genome annotation program DOGMA [76]. The same set of 61 genes was extracted from the chloroplast genomes of six other recently sequenced angiosperms: tomato, potato, soybean, cotton, cucumber, and Eucalyptus (see Table 3 for the complete list of genomes examined). In general, alignment of the DNA sequences was straightforward and simply involved adding the 61 genes for the new angiosperms to the aligned data matrix from Leebens-Mack et al. [5]. In some cases, small in-frame insertions or deletions were required for correct alignment. Two genes, ccsA and matK, were more divergent and were aligned using ClustalX [78], followed by manual adjustments. The complete nucleotide alignment is available online at [79].
Phylogenetic analyses using maximum parsimony (MP) and maximum likelihood (ML) were performed using PAUP* version 4.10 [80] on two data sets, one with 28 taxa and a second with 29 taxa, formed by adding Gossypium. Gap regions were excluded from all phylogenetic analyses. All MP searches included 100 random addition replicates and TBR branch swapping with the Multrees option. Modeltest 3.7 [81] was used to determine the most appropriate model of DNA sequence evolution for the combined 61-gene dataset. Hierarchical likelihood ratio tests and the Akaike information criterion were used to assess which of the 56 models best fit the data; both criteria selected GTR + I + Γ.
For ML analyses we performed an initial parsimony search with 100 random addition sequence replicates and TBR branch swapping, which resulted in a single tree. Model parameters were optimized onto the parsimony tree. We fixed these parameters and performed an ML analysis with three random addition sequence replicates and TBR branch swapping. The resulting ML tree was used to re-optimize model parameters, which were then fixed for another ML search with three random addition sequence replicates and TBR branch swapping. This successive approximation procedure was repeated until the same tree topology and model parameters were recovered in multiple consecutive iterations. This tree was accepted as the final ML tree (Figs. 3B, 4B). Successive approximation has been shown to perform as well as full-optimization analyses for a number of empirical and simulated datasets [82].
Non-parametric bootstrap analyses [83] were performed for MP with 1000 replicates, TBR branch swapping, 1 random addition replicate, and the Multrees option, and for ML with 100 replicates, NNI branch swapping, 1 random addition replicate, and the Multrees option.
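
The successive approximation procedure for the ML analyses can be illustrated with a minimal Python sketch. It is not the PAUP* workflow itself: the function names, the `stable_rounds` and `tol` settings, and the toy stand-ins for parameter optimization and tree search are all hypothetical, chosen only to show the alternation between fixing model parameters and searching for a tree until both stop changing.

```python
from typing import Callable, Tuple

Params = Tuple[float, ...]


def params_close(a: Params, b: Params, tol: float = 1e-6) -> bool:
    """Return True when two parameter vectors agree within a tolerance."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


def successive_approximation(
    start_tree: str,
    optimize_params: Callable[[str], Params],
    search_tree: Callable[[Params], str],
    stable_rounds: int = 2,   # assumed stand-in for "multiple, consecutive iterations"
    max_iter: int = 50,
) -> Tuple[str, Params, int]:
    """Alternate tree search (parameters fixed) and parameter optimization
    (tree fixed) until the same tree and parameters recur stable_rounds times."""
    tree = start_tree
    params = optimize_params(tree)           # optimize on the starting (parsimony) tree
    stable = 0
    for iteration in range(1, max_iter + 1):
        new_tree = search_tree(params)       # ML search with parameters held fixed
        new_params = optimize_params(new_tree)  # re-optimize on the resulting tree
        if new_tree == tree and params_close(new_params, params):
            stable += 1
        else:
            stable = 0
        tree, params = new_tree, new_params
        if stable >= stable_rounds:
            return tree, params, iteration
    raise RuntimeError("no stable tree/parameter pair within max_iter iterations")


# Toy stand-ins (hypothetical): trees are labels, parameters a single number.
TOY_PARAMS = {"T0": (0.30,), "T1": (0.42,), "T2": (0.45,)}


def toy_optimize_params(tree: str) -> Params:
    return TOY_PARAMS[tree]


def toy_search_tree(params: Params) -> str:
    return "T1" if params[0] < 0.35 else "T2"


if __name__ == "__main__":
    tree, params, n_iter = successive_approximation(
        "T0", toy_optimize_params, toy_search_tree
    )
    print(tree, params, n_iter)
```

In a real analysis, the two callables would wrap calls to the phylogenetics program, and tree equality would be a comparison of topologies rather than labels.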