For coalescent analyses based on 1785 low-copy nuclear genes, the 94 newly assembled C. sinensis samples and two published genome datasets (CSA ‘Yunkang 10’ and CSS ‘Shuchazao’) (Xia et al., 2017; Xia et al., 2020b) were used for phylogeny construction. Amino acid sequences were aligned using MAFFT v7.487 (Katoh and Standley, 2013) with the “--auto” parameter. Poorly aligned regions were then trimmed using trimAl v1.2 (Capella-Gutiérrez et al., 2009) with the “-automated1” parameter. The multiple amino acid sequence alignments were converted to nucleotide alignments with PAL2NAL (Suyama et al., 2006). Single-gene ML trees were reconstructed using IQ-TREE v2.1.4-beta (Nguyen et al., 2015) under the GTR+G model with 1000 bootstrap replicates. The coalescent analysis was performed with ASTRAL v5.7.8 (Zhang et al., 2018).
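The per-gene workflow described above can be sketched as a small Python helper that only builds the command lines, without running them. This is a minimal illustration, not the authors' script: the file names, the `work` directory, and the function names are hypothetical, the executables `mafft`, `trimal`, `pal2nal.pl`, `iqtree2` and the jar `astral.5.7.8.jar` are assumed to be on the path or in the working directory, and `-B 1000` (ultrafast bootstrap) is an assumption, since the text does not state which bootstrap type was used (`-b 1000` would request standard nonparametric bootstrap instead).

```python
from pathlib import Path


def gene_tree_commands(gene: str, workdir: Path = Path("work")) -> list[list[str]]:
    """Build (but do not run) the per-gene commands in the order given in the text."""
    # Hypothetical file layout: one protein and one CDS FASTA file per gene.
    pep = workdir / f"{gene}.pep.fa"            # unaligned amino acid sequences
    cds = workdir / f"{gene}.cds.fa"            # matching nucleotide sequences
    aln = workdir / f"{gene}.pep.aln"           # MAFFT output (redirected stdout)
    trimmed = workdir / f"{gene}.pep.trim.aln"  # trimAl output
    codon = workdir / f"{gene}.codon.aln"       # PAL2NAL output (redirected stdout)
    return [
        # MAFFT prints the alignment to stdout; redirect it to `aln` when running.
        ["mafft", "--auto", str(pep)],
        # Automated trimming of poorly aligned regions.
        ["trimal", "-in", str(aln), "-out", str(trimmed), "-automated1"],
        # Protein alignment -> nucleotide alignment; redirect stdout to `codon`.
        ["pal2nal.pl", str(trimmed), str(cds), "-output", "fasta"],
        # Single-gene ML tree under GTR+G; "-B 1000" is an assumption (see lead-in).
        ["iqtree2", "-s", str(codon), "-m", "GTR+G", "-B", "1000"],
    ]


def astral_command(gene_trees: Path, species_tree: Path) -> list[str]:
    """Build the ASTRAL call that summarizes all gene trees into a species tree."""
    return ["java", "-jar", "astral.5.7.8.jar",
            "-i", str(gene_trees), "-o", str(species_tree)]


if __name__ == "__main__":
    for cmd in gene_tree_commands("gene0001"):
        print(" ".join(cmd))
    print(" ".join(astral_command(Path("all_genes.tre"), Path("species.tre"))))
```

In such a setup, the 1785 single-gene tree files would be concatenated into one Newick file (here the hypothetical `all_genes.tre`) before being passed to ASTRAL.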
For ML analyses of concatenated SNPs, a total of 108 samples were included: the 94 newly sequenced transcriptomes, the RNA-seq data of CSA ‘Yunkang 10’ and CSS ‘Biyun’, ‘Hangdan’, ‘Tieguanyin’, ‘Longjing43’ and ‘Shuchazao’ (Xia et al., 2017; Wang et al., 2020; Xia et al., 2020b; Zhang et al., 2020c; Wang et al., 2021b), and eight wild tea species (