Signature enzymes for the major classes of secondary metabolites were identified using profile Hidden Markov Models (pHMMs) and the program HMMER [18]. The pHMMs were a mixture of published and newly built models. Those reported by Medema et al. [19] were used, with the cut-offs given therein, for PKS I, PKS II, PKS III, NRPS, indolocarbazoles, aerobactin-like siderophores, butyrolactones, aminoglycosides, and β-lactams, including a screen for fatty acid synthases that are hit by the PKS models. New pHMMs were built for terpene synthases, based on the sequences published in [20]; for lanthipeptides, based on the required cyclase domain (reviewed in [21]); and for thiazole/oxazole-modified microcins (TOMMs), based on the YcaO domain [22]. The new pHMMs and alignments are presented in a stand-alone website (see Additional file 1). Phosphonates were found using a BLAST search followed by screening for sequences containing the EDK-X(5)-NS motif present in all verified PepM sequences (see [23] for review). Gene clusters were defined by extending six genes to either side of a significant pHMM hit (one passing the specified cut-off), joining any additional hits within that window into the same cluster, and restarting the six-gene count at each additional hit. The six-gene extension was a practical choice: with longer extensions the comparisons included more noise (divergent genomic neighborhoods unrelated to the biosynthetic genes), whereas shorter extensions left too few genes in each cluster for meaningful comparison. This choice was made with future automation in mind. Similar gene clusters were identified using an array of tools, including phylogenetic comparisons and Mauve [24] alignments performed after concatenating all gene clusters in each strain into one sequence. A website showing all gene clusters is included as Additional file 1.
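The cluster-definition rule described above can be illustrated with a minimal sketch. This is not the code used in the study. The function name `define_clusters`, the representation of genes as 0-based indices along a single contig, and the clamping of windows at contig ends are all assumptions made for illustration. The sketch also assumes that a hit joins an existing cluster only when the hit itself falls inside the current window, so the flanks of two neighboring clusters may overlap.

```python
from typing import List, Tuple


def define_clusters(hit_positions: List[int], n_genes: int,
                    flank: int = 6) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) gene-index ranges, one per cluster.

    hit_positions: 0-based indices of genes with a significant pHMM hit
                   (hypothetical input format).
    n_genes:       number of genes on the contig.
    flank:         genes added on either side of a hit (six in the text).
    """
    clusters: List[Tuple[int, int]] = []
    for pos in sorted(set(hit_positions)):
        if not 0 <= pos < n_genes:
            raise ValueError(f"hit index {pos} is outside the contig")
        if clusters and pos <= clusters[-1][1]:
            # The hit lies inside the current window: join it to that
            # cluster and restart the flank count from this hit.
            start, _ = clusters[-1]
            clusters[-1] = (start, min(pos + flank, n_genes - 1))
        else:
            # The hit lies outside any open window: start a new cluster.
            clusters.append((max(pos - flank, 0),
                             min(pos + flank, n_genes - 1)))
    return clusters


if __name__ == "__main__":
    # Hits at genes 10 and 14 share a window; the hit at gene 40 does not.
    print(define_clusters([10, 14, 40], n_genes=100))
```

With hits at genes 10, 14, and 40 on a 100-gene contig, the first two hits merge into one cluster spanning genes 4 to 20, and the third forms a separate cluster spanning genes 34 to 46.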
Gene cluster diagrams also include domain annotations, but these are not manually curated and some domains are incorrectly split in half. Gene annotation and domain names are available on mouseover.
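The motif screen applied to candidate PepM sequences in the methods above can be expressed as a regular expression. The sketch below covers only the screening step, not the BLAST search, and is not the authors' implementation. It reads EDK-X(5)-NS in the PROSITE sense, where X(5) stands for any five residues. The function name `has_pepm_motif` and the example sequences are invented for illustration.

```python
import re

# EDK, then any five residues, then NS (PROSITE-style EDK-x(5)-NS).
PEPM_MOTIF = re.compile(r"EDK[A-Z]{5}NS")


def has_pepm_motif(protein_seq: str) -> bool:
    """Return True if the protein sequence contains the EDK-X(5)-NS motif."""
    return PEPM_MOTIF.search(protein_seq.upper()) is not None


if __name__ == "__main__":
    # Invented toy sequences, not real PepM proteins.
    print(has_pepm_motif("MAEDKAAAAANSG"))  # five residues between EDK and NS
    print(has_pepm_motif("MAEDKAAAANSG"))   # only four residues in between
```

The pattern is deliberately strict about the spacer length, so a sequence with four or six residues between EDK and NS is rejected.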