Comprehensive Microbial Secondary Metabolite Prediction
Cheminformatic metrics, including molecular weight, the number of hydrogen bond donors and acceptors, octanol-water partition coefficients, and Bertz topological complexity, were calculated in RDKit. Both platforms (PRISM 4 and antiSMASH 5) occasionally generated very small, non-specific structure predictions (for example, a single unspecified amino acid or a single malonyl unit) that provided no actionable information about the chemical structure of the encoded product; we therefore applied a molecular weight filter that removed structures under 100 Da output by either platform. To evaluate the internal structural diversity of each set of predicted structures, we computed the distribution of pairwise Tanimoto coefficients (Tcs) for each set [45], using the median pairwise Tc rather than the mean as the summary statistic for robustness against outliers. Structural similarity to known natural products was assessed using the RDKit implementation of the ‘natural product-likeness’ score [22], and by the median Tc between predicted structures and the known secondary metabolite structures deposited in the NP Atlas database [46].
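The two Tc-based summaries described above can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes each fingerprint is represented as a set of on-bit indices, which keeps the example free of dependencies; in practice the fingerprints would come from RDKit. The function names and toy fingerprints are hypothetical, and the cross-set median is taken over all predicted/reference pairs, which is only one possible reading of the text.

```python
from itertools import combinations
from statistics import median


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # assumed convention for two empty fingerprints
    return len(a & b) / union


def median_pairwise_tc(fps):
    """Median Tc over all unordered pairs within one set (internal diversity).

    Needs at least two fingerprints; statistics.median raises on an empty list.
    """
    return median(tanimoto(a, b) for a, b in combinations(fps, 2))


def median_cross_tc(predicted, reference):
    """Median Tc over all predicted/reference pairs (assumed interpretation)."""
    return median(tanimoto(p, r) for p in predicted for r in reference)


# Toy fingerprints (hypothetical), standing in for predicted and known structures.
predicted = [frozenset({1, 2, 3, 4}), frozenset({3, 4, 5, 6}), frozenset({1, 2, 3, 5})]
reference = [frozenset({1, 2, 3, 4}), frozenset({7, 8})]

internal = median_pairwise_tc(predicted)
to_reference = median_cross_tc(predicted, reference)
print(f"median pairwise Tc: {internal:.3f}")
print(f"median Tc to reference set: {to_reference:.3f}")
```

A lower median pairwise Tc indicates a more structurally diverse set. The median is used because a handful of near-identical pairs would pull a mean upward without reflecting the typical pair.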
Other organizations : McMaster University
Protocol cited in 13 other protocols
Variable analysis
- PRISM 4 and antiSMASH 5 tools used to predict chemical structures of secondary metabolites
- Complete bacterial genomes downloaded from NCBI Genome
- Dereplicated genomes retained to mitigate impact of highly similar genomes
- Dereplicated MAGs obtained from NCBI BioProject
- Chemical structures of secondary metabolites encoded within bacterial genomes and MAGs
- Cheminformatic metrics (molecular weight, hydrogen bond donors/acceptors, octanol-water partition coefficients, Bertz topological complexity)
- Internal structural diversity of predicted structures (median pairwise Tanimoto coefficient)
- Structural similarity to known natural products (natural product-likeness score, median Tanimoto coefficient to NP Atlas database)
- Molecular weight filter to remove structures under 100 Da output by either platform
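The molecular weight filter in the last item can be sketched as below. This is an illustrative example under stated assumptions: the record layout, identifiers, and weights are hypothetical, and molecular weights are taken as already computed (for instance with RDKit). Structures at exactly 100 Da are kept, since the text removes only those under 100 Da.

```python
MIN_MW_DA = 100.0  # threshold stated in the protocol


def filter_by_mw(records, min_mw=MIN_MW_DA):
    """Keep predicted structures whose molecular weight is at least min_mw."""
    return [r for r in records if r["mw"] >= min_mw]


# Hypothetical predictions with precomputed molecular weights (Da).
records = [
    {"id": "pred_1", "platform": "PRISM 4", "mw": 89.09},      # about one small amino acid
    {"id": "pred_2", "platform": "antiSMASH 5", "mw": 733.9},
    {"id": "pred_3", "platform": "PRISM 4", "mw": 100.0},      # boundary case, kept
]

kept = filter_by_mw(records)
print([r["id"] for r in kept])
```

Applying the same threshold to the output of both platforms keeps the downstream comparisons on an equal footing.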