The OrthoFinder (ref. 97) clustering method was used to classify the complete proteomes of 23 sequenced green plant genomes, including A. filiculoides and S. cucullata (Supplementary Table 5), into orthologous gene lineages (that is, orthogroups). We selected taxa that represent all of the major land plant and green algal lineages: six core eudicots (A. thaliana, Lotus japonicus, Populus trichocarpa, Solanum lycopersicum, Erythranthe guttata and Vitis vinifera), five monocots (O. sativa, Sorghum bicolor, Musa acuminata, Zostera marina and Spirodela polyrhiza), one basal angiosperm (A. trichopoda), two gymnosperms (Pinus taeda and Picea abies), two ferns (A. filiculoides and S. cucullata), one lycophyte (S. moellendorffii), four bryophytes (Sphagnum fallax, P. patens, Marchantia polymorpha and Jungermannia infusca) and two green algae (Klebsormidium flaccidum and C. reinhardtii). In total, 16,817 orthogroups containing at least two genes were circumscribed, 8,680 of which contain at least one gene from either A. filiculoides or S. cucullata. Of the 20,203 annotated A. filiculoides genes and the 19,780 annotated S. cucullata genes, 17,941 (89%) and 16,807 (84%), respectively, were classified into orthogroups. The details for each orthogroup, including gene counts, secondary clustering of orthogroups (that is, super-orthogroups; ref. 110) and functional annotations, are reported in Supplementary Table 5.
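The summary counts above (orthogroups with at least two genes, orthogroups containing a focal species, and the fraction of annotated genes classified) could be tallied along the following lines. This is a minimal sketch, not the authors' pipeline: the function name `summarize_orthogroups`, the nested-dictionary input layout, the species codes and the toy gene identifiers are all hypothetical, and it assumes that "classified" means membership in an orthogroup of two or more genes.

```python
def summarize_orthogroups(orthogroups, focal_species, annotated_counts):
    """Tally orthogroup statistics.

    orthogroups: {orthogroup_id: {species_code: [gene_id, ...]}}  (hypothetical layout)
    focal_species: species codes of interest
    annotated_counts: {species_code: total number of annotated genes}
    """
    # Keep only orthogroups with at least two genes in total.
    multi_gene = {
        og: members
        for og, members in orthogroups.items()
        if sum(len(genes) for genes in members.values()) >= 2
    }
    # Orthogroups containing at least one gene from any focal species.
    with_focal = [
        og for og, members in multi_gene.items()
        if any(members.get(sp) for sp in focal_species)
    ]
    # Fraction of each focal species' annotated genes placed in an orthogroup.
    fraction = {
        sp: sum(len(m.get(sp, [])) for m in multi_gene.values()) / annotated_counts[sp]
        for sp in focal_species
    }
    return len(multi_gene), len(with_focal), fraction


# Toy data only; identifiers and counts are invented for illustration.
toy = {
    "OG1": {"Azfi": ["a1", "a2"], "Sacu": ["s1"]},
    "OG2": {"Atha": ["t1", "t2"]},
    "OG3": {"Sacu": ["s2"]},  # single-gene group, excluded by the size filter
}
n_groups, n_focal, frac = summarize_orthogroups(
    toy, ["Azfi", "Sacu"], {"Azfi": 4, "Sacu": 4}
)
```

On real data, the same tallies would be computed from the orthogroup table produced by the clustering step rather than from a hand-built dictionary.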
We used Wagner parsimony, as implemented in the program Count (ref. 111), with a weighted gene gain penalty of 1.2 to reconstruct the ancestral gene content at key nodes in the phylogeny of the 23 land plant and green algal species (Supplementary Table 5). The ancestral gene content dynamics (gains, losses, expansions and contractions) are depicted in Supplementary Fig. 5. Complete details of orthogroup dynamics for the key ancestral nodes, including those of seed plants, Salviniaceae, euphyllophytes and vascular plants, are reported in Supplementary Table 5.
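To illustrate how a gain penalty shapes the reconstruction, the sketch below applies weighted Wagner parsimony to a single orthogroup on a small tree using a Sankoff-style dynamic programme. It is an illustration under stated assumptions, not the Count implementation: the cost model (each gained gene copy costs the gain penalty, each lost copy costs 1), the nested-tuple tree encoding, the function names and the toy leaf sizes are all assumptions made here.

```python
import math


def branch_cost(parent_size, child_size, gain_penalty):
    """Cost of a family-size change along one branch (assumed cost model)."""
    change = child_size - parent_size
    return gain_penalty * change if change > 0 else float(-change)


def wagner_parsimony(tree, leaf_sizes, gain_penalty=1.2):
    """Return (minimum cost, {node: inferred family size}).

    tree: nested tuples of leaf names, e.g. (("A", "B"), "C")
    leaf_sizes: {leaf name: observed gene count in this orthogroup}
    """
    states = range(max(leaf_sizes.values()) + 1)
    tables = {}  # node -> subtree cost for each candidate size at that node

    def score(node):
        # Bottom-up pass: cheapest subtree cost for every possible size.
        if isinstance(node, str):
            table = [0.0 if s == leaf_sizes[node] else math.inf for s in states]
        else:
            child_tables = [score(child) for child in node]
            table = [
                sum(
                    min(ct[c] + branch_cost(s, c, gain_penalty) for c in states)
                    for ct in child_tables
                )
                for s in states
            ]
        tables[node] = table
        return table

    root_table = score(tree)
    sizes = {}

    def assign(node, parent_size):
        # Top-down pass: pick the size that realises the minimum cost.
        table = tables[node]
        if parent_size is None:
            size = min(states, key=lambda s: table[s])
        else:
            size = min(
                states,
                key=lambda s: table[s] + branch_cost(parent_size, s, gain_penalty),
            )
        sizes[node] = size
        if not isinstance(node, str):
            for child in node:
                assign(child, size)

    assign(tree, None)
    return min(root_table), sizes


# Toy example: two copies in A and B, none in C.
toy_tree = (("A", "B"), "C")
cost, sizes = wagner_parsimony(toy_tree, {"A": 2, "B": 2, "C": 0}, gain_penalty=1.2)
```

With a gain penalty above 1, the toy example resolves to two ancestral copies and a loss in C rather than a gain on the branch leading to A and B; with a penalty of exactly 1, the two scenarios tie.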