The KS domain of iterative type I PKSs is considered evolutionarily conserved (55), and thus it can serve as a proxy for the similarity of the entire PKS. We identified a total of 242 PKSs in the genomes of the six Cladonia spp. and S. alpinum, four of which (Cgr01615, Cgr03964, Cgr08611, and Cmt10189) (Data Set S2) lacked a KS domain. KS domain sequences were extracted from the remaining 238 PKSs using the online tool NaPDoS (70) and aligned using MUSCLE (v3.8.31) (71). For clustering analysis, an all-versus-all similarity matrix for the KS domains of the 238 PKSs was computed using the AlignBuddy function in the BuddySuite program (72) with the optional argument “-pi”. A heat map showing the percent similarity of KS domains, clustered by k-means, was generated using the R package Superheat (73).
For the fungal NR-PKS phylogeny, we used concatenated protein sequences of the KS and PT domains of 103 NR-PKSs found in the six Cladonia spp. and S. alpinum and of 82 NR-PKSs that have been linked to known compounds in nonlichenized fungi (7, 8, 44) (Data Set S2). We initially identified 106 NR-PKSs in the six Cladonia spp. and S. alpinum, including seven NR-PKSs whose full sequences could not be reliably defined from the current genome assemblies (likely pseudogenes). Among these partial PKSs, three NR-PKSs (Cbo04702, Cma06590, and Cmt06606) (Data Set S2) lacked either the KS or the PT domain and were excluded from the phylogenetic analysis, leaving 103. PT domains of lichen NR-PKSs were identified by alignment with those of previously characterized PKSs (8). A 6-methylsalicylic acid synthase (6MSAS) from Penicillium expansum that is responsible for the biosynthesis of patulin (UniProtKB accession no. A0A075TRC0) was used as the outgroup for the fungal NR-PKS phylogeny. Four 6MSASs found in four Cladonia spp. (Cbo07291, Cgr05254, Cmt10005, and Cuc03485) (Data Set S2) were also included in the analysis.
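The all-versus-all matrix used for the KS clustering can be illustrated with a minimal Python sketch. It is not the AlignBuddy implementation: the function names are hypothetical, and the choice to ignore any column with a gap in either sequence is an assumption (tools differ in how they define the denominator for percent identity).

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # assumption: columns with a gap in either sequence are skipped
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(alignment: dict[str, str]) -> dict[str, dict[str, float]]:
    """Symmetric all-versus-all percent identity matrix as nested dicts."""
    names = list(alignment)
    matrix = {name: {name: 100.0} for name in names}
    for p, q in combinations(names, 2):
        value = percent_identity(alignment[p], alignment[q])
        matrix[p][q] = value
        matrix[q][p] = value
    return matrix


if __name__ == "__main__":
    # Toy aligned sequences with made-up identifiers, for illustration only.
    toy = {"seqA": "MKTA", "seqB": "MRTA", "seqC": "MKT-"}
    for name, row in identity_matrix(toy).items():
        print(name, row)
```

A matrix of this shape is what a clustering or heat map routine takes as input; rows and columns are indexed by the same sequence identifiers.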
Protein sequences of the KS and PT domains were aligned separately using MAFFT (v7.310) (65) with the “auto” setting, and spurious sequences and poorly aligned regions were trimmed from each domain alignment using trimAl (v1.2) (74) with the “gappyout” parameter. The resulting multiple-sequence alignments for the KS and PT domains were concatenated with FASconCAT-G (v1.04) (75). From the concatenated alignment, maximum likelihood trees were computed with RAxML (v8.2) (67) under the WAG substitution model with gamma-distributed rate variation across sites (parameter setting “-m PROTGAMMAWAG”). Nodal support was evaluated with 1,000 bootstrap replicates. The final tree was rooted on the 6MSAS outgroup and annotated with iTOL (v5.7) (76).
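The concatenation step can be sketched in a few lines of Python. This is an illustration of the idea only, not FASconCAT-G itself: the function names and sequence identifiers are hypothetical, and keeping only sequences present in both alignments is an assumption that mirrors the exclusion of NR-PKSs lacking a KS or PT domain.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {identifier: sequence}, joining wrapped lines."""
    records: dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            # Use the first whitespace-delimited token as the identifier.
            name = header.split()[0] if header else ""
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def check_aligned(alignment: dict[str, str]) -> None:
    """Raise if the sequences in an alignment do not all have the same length."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        raise ValueError("sequences in an alignment must have equal length")


def concatenate(ks: dict[str, str], pt: dict[str, str]) -> dict[str, str]:
    """Join KS and PT alignment rows for identifiers present in both inputs."""
    check_aligned(ks)
    check_aligned(pt)
    # Assumption: sequences missing from either alignment are dropped.
    return {name: ks[name] + pt[name] for name in ks if name in pt}


if __name__ == "__main__":
    # Toy alignments with made-up identifiers, for illustration only.
    ks_aln = parse_fasta(">pks1\nMK-TAA\n>pks2\nMKLTAA\n")
    pt_aln = parse_fasta(">pks2\nGG-\n>pks1\nGGS\n>pks3\nGGT\n")
    for name, seq in concatenate(ks_aln, pt_aln).items():
        print(f">{name}\n{seq}")
```

Because each domain is aligned and trimmed on its own before joining, every row of the concatenated alignment has the same length (KS columns followed by PT columns), which is what a tree-building program expects as input.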