LCN genes were identified from OrthoFinder26 results. Orthologues were obtained from six monocots (Spirodela polyrhiza, Zostera marina, Musa acuminata, Ananas comosus, Sorghum bicolor and Oryza sativa) and six eudicots (Nelumbo nucifera, Vitis vinifera, Populus trichocarpa, A. thaliana, Solanum lycopersicum and Beta vulgaris), together with N. colorata, Amborella and the gymnosperms G. biloba, P. abies and P. taeda. LCN genes had to meet the following requirements: strictly single-copy in N. colorata, Amborella, G. biloba, P. abies or P. taeda, and single-copy in at least five of the 12 eudicots or monocots. With G. biloba, P. abies or P. taeda as the outgroup, we identified 2,169, 1,535 and 1,515 orthologous LCN genes, respectively. We then trimmed sites with less than 90% coverage. LCN gene trees were estimated from the remaining sites with RAxML v.7.7.8, using the GTR+G+I model for nucleotide sequences (Fig. 1c) and the JTT+G+I model for amino acid sequences (Supplementary Note 4.1). To account for incomplete lineage sorting and different substitution rates, we applied the multispecies coalescent model and a supermatrix method, respectively, to the LCN genes and found further support for the sister relationship between Amborella and all other extant flowering plants (Supplementary Note 4.2).
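The copy-number requirements above can be sketched as a simple filter over per-species gene counts in an orthogroup. This is a minimal illustration, not the pipeline used in the study: the function name `is_lcn`, the dictionary layout and the species labels are hypothetical, and the sketch assumes one reading of the criteria, namely that the gene must be single-copy in N. colorata, Amborella and whichever gymnosperm serves as the outgroup, and that "at least five of the 12" is counted across monocots and eudicots combined.

```python
from typing import Dict

# The 12 monocot and eudicot species named in the text (labels are illustrative).
MONOCOTS_AND_EUDICOTS = [
    "Spirodela polyrhiza", "Zostera marina", "Musa acuminata",
    "Ananas comosus", "Sorghum bicolor", "Oryza sativa",
    "Nelumbo nucifera", "Vitis vinifera", "Populus trichocarpa",
    "A. thaliana", "Solanum lycopersicum", "Beta vulgaris",
]

# Assumption: these two species must always be strictly single-copy.
CORE_SPECIES = ["N. colorata", "Amborella"]


def is_lcn(counts: Dict[str, int], outgroup: str, min_single: int = 5) -> bool:
    """Return True if an orthogroup passes the assumed LCN criteria.

    counts   maps species label -> number of gene copies in the orthogroup
    outgroup is the gymnosperm used as outgroup (e.g. "G. biloba")
    """
    # Strictly single-copy in the core species and the chosen outgroup.
    for species in CORE_SPECIES + [outgroup]:
        if counts.get(species, 0) != 1:
            return False
    # Single-copy in at least `min_single` of the 12 monocots and eudicots.
    n_single = sum(
        1 for species in MONOCOTS_AND_EUDICOTS if counts.get(species, 0) == 1
    )
    return n_single >= min_single
```

Running the filter once per outgroup would give one LCN gene set per gymnosperm, which matches the three set sizes reported in the text only under the assumptions stated here.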
We further carefully selected five LCN gene sets (1,167, 834, 683, 602 and 445 genes) from 115 species and applied both a supermatrix method27–29 and the multispecies coalescent model to infer the phylogeny of angiosperms (Supplementary Note 4.2). The phylogeny inferred from 1,167 LCN genes is shown in Fig. 1d, with different support values from the multispecies coalescent analyses of the other four LCN gene sets.
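For readers unfamiliar with the supermatrix approach, the core step is concatenating per-gene alignments into one matrix, padding with gaps where a species lacks a gene. The sketch below illustrates only that step under simple assumptions: each alignment is a dictionary from taxon label to an aligned sequence of equal length, and the function name `concatenate` is hypothetical. It is not the software used in the study.

```python
from typing import Dict, List


def concatenate(alignments: List[Dict[str, str]], taxa: List[str]) -> Dict[str, str]:
    """Concatenate per-gene alignments into a supermatrix.

    Taxa missing from a gene are filled with gap characters so that
    every row of the supermatrix has the same length.
    """
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for alignment in alignments:
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must share a length")
        gene_length = lengths.pop()
        for taxon in taxa:
            # Use the aligned sequence if present, otherwise an all-gap block.
            parts[taxon].append(alignment.get(taxon, "-" * gene_length))
    return {taxon: "".join(blocks) for taxon, blocks in parts.items()}
```

The coalescent analysis, by contrast, works from the individual gene trees rather than from a concatenated matrix, which is why the two methods are applied side by side.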
To estimate the evolutionary timescale of angiosperms, we calibrated a relaxed molecular clock using 21 fossil-based age constraints7 throughout the tree, including the earliest fossil tricolpate pollen (approximately 125 Ma) associated with eudicots30. We concatenated 101 selected genes (205,185 sites) and fixed the tree topology to that inferred from our coalescent-based analysis of 1,167 genes from 115 taxa. We performed a Bayesian phylogenomic dating analysis of the 101 selected genes in MCMCtree, part of the PAML package31,32, and used approximate likelihood calculation for the branch lengths33. Molecular dating was performed using an auto-correlated model of among-lineage rate variation, the GTR substitution model, and a uniform prior on the relative node times. Posterior distributions of node ages were estimated using Markov chain Monte Carlo sampling, with samples drawn every 250 steps over 10 million steps following a burn-in of 500,000 steps. We checked for convergence by running the analysis in duplicate and checked for sufficient sampling.
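The sampling scheme above implies a fixed number of retained posterior samples, and duplicate runs can be compared node by node. The sketch below shows both calculations under stated assumptions: the helper names are hypothetical, the input layout (node label mapped to a list of sampled ages) is invented for illustration, and comparing posterior means is only one possible convergence check; the text does not specify which criterion was used.

```python
from statistics import mean
from typing import Dict, List


def retained_samples(total_steps: int, sample_every: int) -> int:
    """Number of samples kept when sampling every `sample_every` steps."""
    return total_steps // sample_every


def max_relative_difference(
    run_a: Dict[str, List[float]], run_b: Dict[str, List[float]]
) -> float:
    """Largest relative difference in posterior mean node age between two runs.

    Assumes both runs report the same node labels and positive ages.
    """
    worst = 0.0
    for node, ages_a in run_a.items():
        mean_a = mean(ages_a)
        mean_b = mean(run_b[node])
        difference = abs(mean_a - mean_b) / max(mean_a, mean_b)
        worst = max(worst, difference)
    return worst


# 10 million post-burn-in steps sampled every 250 steps.
n_samples = retained_samples(10_000_000, 250)
```

With the settings given in the text, `n_samples` is 40,000 per run. A small value from `max_relative_difference` suggests that the two runs sampled similar posteriors, although it does not by itself establish adequate sampling.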
We also implemented the penalized likelihood method under a variable substitution rate using TreePL34 and r8s35, as a constant substitution rate across the phylogenetic tree was rejected (P < 0.01) for all cases by likelihood-ratio tests in PAUP36. Three fossil calibrations, corresponding to the crown groups of Lamiales, Cornales and Laurales, were implemented as minimum age constraints in our penalized likelihood dating analysis, whereas the earliest appearance of tricolpate pollen grains (about 125 Ma)30 was used to fix the age of crown eudicots. We determined the best smoothing parameter value for the concatenated 101 LCN genes as 0.32 by performing cross-validations over a range of smoothing parameters from 0.01 to 10,000 (algorithm = TN; crossv = yes; cvstart = −2; cvinc = 0.5; cvnum = 15). We used 100 bootstrap trees with branch lengths generated by RAxML37 to infer the 95% confidence intervals of age estimates (Supplementary Note 4.2).
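As an illustration of the last step, a 95% interval for one node can be summarized from the ages estimated on the bootstrap trees. The sketch below assumes a simple percentile interval with linear interpolation; the text does not state how the interval was computed from the 100 bootstrap estimates, and the function names are hypothetical.

```python
from typing import List, Tuple


def percentile(sorted_values: List[float], q: float) -> float:
    """Linearly interpolated quantile (0 <= q <= 1) of pre-sorted values."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def bootstrap_interval(ages: List[float], level: float = 0.95) -> Tuple[float, float]:
    """Percentile interval of node ages estimated from bootstrap trees."""
    if not ages:
        raise ValueError("at least one bootstrap age estimate is required")
    values = sorted(ages)
    tail = (1.0 - level) / 2.0
    return percentile(values, tail), percentile(values, 1.0 - tail)
```

In use, each bootstrap tree would be dated separately and the resulting ages for a given node passed to `bootstrap_interval` as a list of 100 values.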