Supervised Classification Methods for Metabolomics Data

Class prediction from metabolomics data is increasingly important in studies aimed at early diagnosis, prognosis or prediction of treatment outcomes. MetaboAnalyst offers three supervised classification methods: PLS-DA, random forest (22) and support vector machine (SVM). These methods have proved robust for high-dimensional data and are widely used in the analysis of other ‘omics’ data. They can also help prioritize the features that contribute most to classification performance. PLS-DA based feature selection and classification was discussed previously in the chemometrics path.

Random forest uses an ensemble of classification trees, each grown from a bootstrap sample of the data with a random subset of features considered at each branch. Class prediction is based on the majority vote of the ensemble. During the construction of each tree, about one-third of the instances are left out of the bootstrap sample. These out-of-bag (OOB) instances are then used as a test sample to obtain an unbiased estimate of the classification error (the OOB error). The importance of a variable is evaluated by measuring the increase in the OOB error when the values of that variable are permuted. Figure 2D shows the important features ranked by random forest.

The SVM classification algorithm aims to find a nonlinear decision function in the input space by mapping the data into a higher dimensional feature space and separating them there with a maximum margin hyperplane (23). MetaboAnalyst's SVM analysis is performed through recursive feature selection and sample classification using a linear kernel (24). Features are selected based on their relative contribution to the classification, as measured by cross-validation error rates. The least important features are eliminated in subsequent steps, which creates a series of SVM models. The features used by the best model are considered important and are ranked by how frequently they are selected in the model.
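The random forest mechanics described above (bootstrap sampling, out-of-bag error, majority voting and permutation importance) can be illustrated with a minimal sketch in standard-library Python. This is not MetaboAnalyst's implementation. It makes several simplifying assumptions: each "tree" is a single-split decision stump, the data are synthetic (one informative feature followed by three noise features), the permutation is applied to a whole column rather than per tree, and all function names and parameters (`fit_stump`, `fit_forest`, `mtry`, and so on) are hypothetical.

```python
import random
from collections import Counter
from statistics import mean


def predict_stump(stump, x):
    """Predict a 0/1 label from a single threshold split."""
    feature, threshold, above, below = stump
    return above if x[feature] > threshold else below


def fit_stump(X, y, rows, features):
    """Among the candidate features, keep the split with the fewest in-bag errors.

    The threshold is the midpoint between the two class means (a simplification
    of the exhaustive split search used by real classification trees).
    """
    best = None
    for f in features:
        v0 = [X[i][f] for i in rows if y[i] == 0]
        v1 = [X[i][f] for i in rows if y[i] == 1]
        if not v0 or not v1:
            continue  # bootstrap sample contains a single class
        m0, m1 = mean(v0), mean(v1)
        threshold = (m0 + m1) / 2
        stump = (f, threshold, 1, 0) if m1 > m0 else (f, threshold, 0, 1)
        errors = sum(predict_stump(stump, X[i]) != y[i] for i in rows)
        if best is None or errors < best[0]:
            best = (errors, stump)
    if best is None:  # degenerate sample: always predict the only class seen
        only = y[rows[0]]
        return (features[0], float("inf"), only, only)
    return best[1]


def fit_forest(X, y, n_trees, mtry, rng):
    """Grow stumps on bootstrap samples, remembering each stump's OOB rows."""
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        in_bag = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        oob = set(range(n)) - set(in_bag)              # rows never drawn
        features = rng.sample(range(p), mtry)          # random feature subset
        forest.append((fit_stump(X, y, in_bag, features), oob))
    return forest


def oob_error(forest, X, y):
    """Score each sample by majority vote of the stumps that did not see it."""
    wrong = scored = 0
    for i, (x, label) in enumerate(zip(X, y)):
        votes = [predict_stump(s, x) for s, oob in forest if i in oob]
        if votes:
            scored += 1
            if Counter(votes).most_common(1)[0][0] != label:
                wrong += 1
    return wrong / scored


def permutation_importance(forest, X, y, feature, rng):
    """Increase in OOB error after shuffling one feature's values."""
    column = [row[feature] for row in X]
    rng.shuffle(column)
    X_perm = [row[:feature] + [v] + row[feature + 1:] for row, v in zip(X, column)]
    return oob_error(forest, X_perm, y) - oob_error(forest, X, y)


# Synthetic two-class data: feature 0 separates the classes, features 1-3 are noise.
rng = random.Random(0)
y = [i % 2 for i in range(200)]
X = [[2.0 * label + rng.gauss(0, 0.5)] + [rng.gauss(0, 1) for _ in range(3)]
     for label in y]

forest = fit_forest(X, y, n_trees=200, mtry=3, rng=rng)
oob_fraction = mean(len(oob) for _, oob in forest) / len(X)
err = oob_error(forest, X, y)
importance = [permutation_importance(forest, X, y, j, rng) for j in range(4)]

print("mean OOB fraction per tree:", round(oob_fraction, 3))
print("OOB error:", round(err, 3))
print("permutation importance per feature:", [round(v, 3) for v in importance])
```

The mean OOB fraction should land near the "about one-third" figure quoted above, since the chance that a given instance is never drawn in n draws with replacement is (1 - 1/n)^n, which approaches 1/e (roughly 0.37). Shuffling the informative feature should raise the OOB error far more than shuffling a noise feature, which is the ranking criterion the text describes.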


Xia J., Psychogios N., Young N., & Wishart D.S. (2009). MetaboAnalyst: a web server for metabolomic data analysis and interpretation. Nucleic Acids Research, 37(Web Server issue), W652-W660.

Publication 2009



Other organizations : University of Alberta, National Institute for Nanotechnology


Protocol cited in 33 other protocols

Variable analysis

Independent variables: Metabolomics data
Dependent variables: Class prediction, early diagnosis, prognosis, treatment outcomes
Control variables: Not explicitly mentioned
Positive controls: Not specified
Negative controls: Not specified
