ANNs: ANNs are supervised algorithms that identify relations between drug targets and clinical elements of the network.43, 60 This strategy identifies relationships among regions of the network by inferring the probability that a specific relationship exists between two or more protein sets (here, between the Sacubitril/Valsartan protein targets and the cardiac remodeling pathway), and the predictive capacity of the model is validated against the truth table. The mathematical model that explains the behavior of the network is created, validated, refined and checked using known data (Known Input) about targets, the MoA of drugs (Hidden MoA), and their clinically observable effects contained in the truth table (Known Output) (Supplementary Fig.
The raw information fed into the network is known as the Input layer. The learning methodology consisted of an architecture of stratified ensembles of neural networks, trained with a gradient descent algorithm to approximate the values of the given truth table. To predict the effect of a drug correctly regardless of its number of targets, a different ensemble of neural networks is trained for each subset of drugs, grouped by number of targets (drugs with 1 target, 2 targets, 3 targets…). The predictions for a query drug are then calculated by all the ensembles and weighted according to the number of targets of the query drug.
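The stratification step described above can be illustrated with a minimal sketch. It only shows how drugs might be grouped by number of targets before one ensemble is trained per group; the function name `stratify_by_target_count`, the drug names and the protein identifiers are hypothetical placeholders and are not taken from the study.

```python
from collections import defaultdict


def stratify_by_target_count(drug_targets):
    """Group drug names by the number of distinct targets each drug has.

    drug_targets maps a drug name to a list of its protein targets.
    Returns a dict: number of targets -> list of drug names.
    """
    strata = defaultdict(list)
    for drug, targets in drug_targets.items():
        strata[len(set(targets))].append(drug)
    return dict(strata)


# Hypothetical toy data: placeholders only, not data from the study.
toy_drugs = {
    "drug_a": ["P1"],
    "drug_b": ["P2"],
    "drug_c": ["P1", "P3"],
    "drug_d": ["P2", "P4", "P5"],
}

strata = stratify_by_target_count(toy_drugs)
for n_targets in sorted(strata):
    # One ensemble would be trained on each of these subsets.
    print(n_targets, strata[n_targets])
```

Each key of `strata` would then correspond to one training subset, and therefore to one ensemble.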
Specifically, the neural network model used is a multilayer perceptron (MLP) neural network classifier.61–63 MLP gradient descent training depends on random initialization, so each training run, even with exactly the same truth table, can yield a slightly different model. To generate each ensemble, 1000 MLPs are trained with the training subset, and the best 100 are kept as the ensemble. When a new drug-indication pair has to be classified as probable or false, the features describing the topological relation between targets and indication effectors are classified with each of the ensembles. To obtain the most accurate prediction, the result of each ensemble is weighted by the difference between the number of targets of the query (the number of targets of Sacubitril and/or Valsartan) and the number of targets of the drugs used to build that ensemble: the larger the difference, the less weight that ensemble's result carries in the final prediction. The output is a single node that corresponds to the relationship between a given drug and an adverse effect (AE) or indication (Yes-1 or No-0).
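The selection and weighting steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the text does not give the weighting function, so the weight 1 / (1 + |difference in number of targets|) is an assumption, as is averaging the outputs of the models within an ensemble. Trained MLPs are replaced here by plain callables that return a probability, and the names `select_best`, `ensemble_probability` and `weighted_prediction` are hypothetical.

```python
def select_best(scored_models, k):
    """Keep the k models with the highest validation score.

    scored_models is a list of (model, score) pairs.
    """
    ranked = sorted(scored_models, key=lambda pair: pair[1], reverse=True)
    return [model for model, _ in ranked[:k]]


def ensemble_probability(models, features):
    """Average the positive-class outputs of one ensemble (assumed rule)."""
    return sum(model(features) for model in models) / len(models)


def weighted_prediction(ensembles, features, n_query_targets):
    """Combine all ensembles into one probability for the query drug.

    ensembles maps the number of targets of the training drugs to the
    list of models in that ensemble. Ensembles trained on drugs whose
    number of targets differs more from the query receive less weight.
    The weight 1 / (1 + |difference|) is an assumption for illustration.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for n_targets, models in ensembles.items():
        weight = 1.0 / (1.0 + abs(n_targets - n_query_targets))
        weighted_sum += weight * ensemble_probability(models, features)
        total_weight += weight
    return weighted_sum / total_weight


# Stand-ins for trained MLPs: each callable returns a fixed probability.
toy_ensembles = {
    1: [lambda x: 0.2, lambda x: 0.4],  # trained on 1-target drugs
    2: [lambda x: 0.9],                 # trained on 2-target drugs
}

# Query with 2 targets: the 2-target ensemble gets weight 1, the other 0.5.
probability = weighted_prediction(toy_ensembles, features=None, n_query_targets=2)
label = 1 if probability >= 0.5 else 0  # Yes-1 or No-0; 0.5 cutoff is assumed

# In the procedure described above, k would be 100 out of 1000 trained MLPs.
top_two = select_best([("a", 0.5), ("b", 0.9), ("c", 0.7)], k=2)
```

In this toy case the ensemble trained on drugs with the same number of targets as the query dominates the result, which is the behavior the weighting is meant to produce; any monotonically decreasing function of the difference would serve the same purpose.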
Sampling methods: This second strategy is used to describe all plausible relationships between the sets of proteins previously identified with ANNs, as suggested by experimental work, where each parameter corresponds to the relative weight of a link connecting nodes (genes/proteins) in a graph (protein map). This approach therefore does not provide a single solution, but rather identifies a universe of possible solutions that satisfy the biological restrictions of the truth table. However, not all solutions are used for the analysis. The accuracy of each mathematical solution is calculated by checking how well it complies with the truth table, and is defined as the percentage of true positives (predictions that agree with the knowledge stored in the truth table) with respect to the total number of parameters evaluated. The solutions used in subsequent analyses have an accuracy higher than 95%. That is, only MoAs that are plausible from the standpoint of currently accepted scientific understanding were considered in the analysis. Once a response (in this case cardiac remodeling) to a specific stimulus (Valsartan and/or Sacubitril) is identified, it is possible to analyze the molecular mechanisms that justify this association using the sampling methods strategy (Supplementary Fig.