The experimental evaluation comprises five main steps: 1) cross-validation analysis on the six disease-association datasets, to evaluate how well metagenomic data support disease classification; 2) cross-stage studies on the cirrhosis and T2D datasets, to test how well the model generalizes to independent collection batches from the same study; 3) for T2D, an extension of the analysis to samples from completely distinct cohorts; 4) cross-study analysis to model the features of the “healthy” gut microbiome, for use as a dysbiosis prediction model for syndromes where few or no training samples are available; 5) cross-validation and cross-study analysis applied to other classification problems, such as gender and body site discrimination. All the investigated classification problems, except body site discrimination, were binary. Moreover, most of the analysis concerned disease classification, in which the objective was to discriminate between “healthy” and “diseased” subjects.
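The difference between the cross-validation and the cross-stage or cross-study settings lies in how samples are assigned to training and test sets. The following minimal sketch illustrates that difference only; it is not the code used in this work. The function names, the toy labels, the batch identifiers, and the choice of five folds are all assumptions made for illustration.

```python
from collections import defaultdict
import random


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds, keeping class proportions roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the samples of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def cross_validation_splits(labels, k, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    folds = stratified_folds(labels, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield sorted(train), sorted(test)


def cross_study_split(study_ids, train_study, test_study):
    """Train on every sample of one study (or stage), test on every sample of another."""
    train = [i for i, s in enumerate(study_ids) if s == train_study]
    test = [i for i, s in enumerate(study_ids) if s == test_study]
    return train, test


# Toy data (hypothetical): ten samples, two classes, two collection batches.
labels = ["healthy", "diseased"] * 5
batches = ["stage1"] * 6 + ["stage2"] * 4

cv_splits = list(cross_validation_splits(labels, k=5))
stage_train, stage_test = cross_study_split(batches, "stage1", "stage2")
```

In the cross-validation setting every sample is used for testing exactly once and the folds are drawn from a single dataset. In the cross-stage and cross-study settings the test samples come from a batch or cohort that the model never saw during training, which is a stricter test of generalization.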