In addition to pecan and Chinese hickory, V. vinifera and 10 other genome-sequenced representatives of the Rosids (J. regia, G. max, Medicago truncatula, P. persica, Morus notabilis, Carica papaya, Gossypium hirsutum, Theobroma cacao, Betula pendula, and Populus trichocarpa), along with A. thaliana, were selected for constructing the phylogenetic tree (J. regia genome data were downloaded from [99]; the others were downloaded from Phytozome v12 [100]). The protein set of each species was obtained and filtered as follows: (i) if a gene encoded several isoforms, only the longest isoform was retained for further analysis; (ii) proteins of <30 amino acids were removed. Similarity between homologous proteins across all species was assessed by BLASTp with an E-value cutoff of 1e-5. The protein datasets of the 14 species were then clustered into paralogous and orthologous groups using OrthoMCL [101] with an inflation parameter of 1.5. In total, 170 single-copy-gene–encoded proteins were used for the phylogenetic analysis. The protein sequences from all species were aligned with MUSCLE (MUSCLE, RRID:SCR_011812) [102], and a super alignment matrix was generated by combining all the alignment results. A phylogenetic tree of the 14 species was constructed using RAxML (RAxML, RRID:SCR_006086) [103] with the maximum likelihood (ML) method and 1,000 bootstrap replicates. Finally, the MCMCtree program implemented in Phylogenetic Analysis by Maximum Likelihood (PAML) (PAML, RRID:SCR_014932) [104] was applied to infer divergence times on the basis of the phylogenetic tree. The MCMCtree running parameters were as follows: burn-in, 5,000,000; sample-number, 1,000,000; sample-frequency, 50. The calibration times for the divergence between A. thaliana and C. papaya (54–90 MYA), G. hirsutum and T. cacao (32–99 MYA), A. thaliana and P. trichocarpa (107–109 MYA), G. max and M. truncatula (46–60 MYA), M. notabilis and P. persica (73–90 MYA), and A. thaliana and G. max (107–111 MYA) were obtained from the TimeTree database [105].
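
The protein filtering step described above (retain the longest isoform per gene, then discard proteins of <30 amino acids) can be illustrated with a minimal Python sketch. This is not the authors' script: the function name `filter_proteins`, the `(gene_id, isoform_id, sequence)` input layout, the stripping of a trailing `*`, and the tie-breaking rule (the first isoform seen wins when lengths are equal) are all assumptions made for illustration.

```python
MIN_LENGTH = 30  # from the text: proteins of <30 amino acids are removed


def filter_proteins(records, min_length=MIN_LENGTH):
    """Keep the longest isoform per gene, then drop short proteins.

    `records` is an iterable of (gene_id, isoform_id, sequence) tuples
    (a hypothetical layout; real annotations need their own parsing).
    Returns a dict mapping gene_id -> (isoform_id, sequence).
    """
    longest = {}
    for gene_id, isoform_id, seq in records:
        # Assumption: a trailing "*" marks the stop codon and is not counted.
        seq = seq.rstrip("*")
        best = longest.get(gene_id)
        # Strict ">" means the first isoform seen is kept on a length tie.
        if best is None or len(seq) > len(best[1]):
            longest[gene_id] = (isoform_id, seq)
    # Apply the minimum-length filter to the retained isoforms.
    return {
        gene_id: (isoform_id, seq)
        for gene_id, (isoform_id, seq) in longest.items()
        if len(seq) >= min_length
    }


# Toy input: two isoforms of g1, a too-short g2, and a borderline g3.
example = [
    ("g1", "g1.1", "M" * 40),
    ("g1", "g1.2", "M" * 55),
    ("g2", "g2.1", "M" * 29),
    ("g3", "g3.1", "M" * 30 + "*"),
]
kept = filter_proteins(example)
print(sorted(kept))  # ['g1', 'g3']
```

Applying the length filter after isoform selection means a gene is dropped only when even its longest isoform falls below the threshold, which matches the order in which the two criteria are listed in the text.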