We identified eight publicly available MHC class II prediction tools through a literature search and the IMGT link list at http://imgt.cines.fr/textes/IMGTbloc-notes/. For each tool, we mapped the MHC types for which predictions could be made to the four-digit HLA nomenclature (e.g., HLA-DRB1*0101). If this mapping could not be done exactly, we left that type/tool combination out of the evaluation. For example, HLA-DR4 could refer to HLA-DRB1*0401, HLA-DRB1*0402, etc., which have distinct binding specificities.
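The exact-mapping rule can be illustrated with a minimal sketch. The function name `to_four_digit` and the regular expression are assumptions made for illustration and are not taken from the original scripts; the sketch only shows the idea that a label is kept when it names a single four-digit allele and is dropped when it is ambiguous.

```python
import re
from typing import Optional

# Assumed pattern for a four-digit HLA class II allele name such as
# "HLA-DRB1*0101". The "HLA-" prefix and a colon between the two digit
# pairs are treated as optional (an assumption of this sketch).
FOUR_DIGIT = re.compile(r"^(?:HLA-)?(D[PQR][AB]\d)\*(\d{2}):?(\d{2})$")


def to_four_digit(label: str) -> Optional[str]:
    """Return the label in four-digit form, or None if it is not exact.

    Serotype-level labels such as "HLA-DR4" do not identify a single
    allele, so they yield None and would be left out of the evaluation.
    """
    match = FOUR_DIGIT.match(label.strip().upper())
    if match is None:
        return None
    gene, family, subtype = match.groups()
    return f"HLA-{gene}*{family}{subtype}"
```

Under these assumptions, `to_four_digit("DRB1*0401")` gives `"HLA-DRB1*0401"`, while `to_four_digit("HLA-DR4")` gives `None`.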
For the ARB evaluation, the 10-fold cross-validation results stored at IEDB were used to estimate performance, since ARB was trained on datasets overlapping with the one used in this study. For the other seven tools in the evaluation, we wrote Python script wrappers to automate prediction retrieval. For the SYFPEITHI predictions, we padded each testing peptide with three glycine residues at both ends before submitting it for prediction. This was recommended by the creators of the SYFPEITHI method to ensure that all potential binders are presented to the prediction algorithm. For all other methods, the original testing peptides were submitted directly for prediction. Peptide sequences were sent to the web servers one at a time, and predictions were extracted from each server's response. For tools that predict the affinity of 9-mer core binding regions, we assigned a single prediction to each peptide longer than nine amino acids by taking the highest-affinity prediction among all possible 9-mers within that peptide.
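The two peptide-handling steps described above (glycine padding and selection of the best 9-mer) can be sketched as follows. This is an illustration under stated assumptions, not the original wrapper code: the function names, the `score_core` callable standing in for a tool's 9-mer prediction, and the `higher_is_stronger` flag are all hypothetical. The flag is included because "highest affinity" corresponds to the largest value for score-based tools but to the smallest value for tools that report IC50.

```python
from typing import Callable, List

CORE_LENGTH = 9  # length of the core binding region scored by these tools


def pad_with_glycine(peptide: str, n: int = 3) -> str:
    """Add n glycine residues (G) to both ends of the peptide."""
    return "G" * n + peptide + "G" * n


def nine_mers(peptide: str) -> List[str]:
    """Return every contiguous 9-residue window of the peptide, in order."""
    return [
        peptide[i:i + CORE_LENGTH]
        for i in range(len(peptide) - CORE_LENGTH + 1)
    ]


def best_core_score(
    peptide: str,
    score_core: Callable[[str], float],  # hypothetical stand-in for a tool
    higher_is_stronger: bool = True,
) -> float:
    """Collapse per-9-mer predictions into one value for the whole peptide.

    The strongest predicted binder is the maximum when larger values mean
    stronger binding, and the minimum when the tool reports IC50-like
    values where smaller means stronger.
    """
    cores = nine_mers(peptide)
    if not cores:
        raise ValueError("peptide is shorter than nine residues")
    scores = [score_core(core) for core in cores]
    return max(scores) if higher_is_stronger else min(scores)
```

A 15-residue peptide yields seven overlapping 9-mers, and after padding with three glycines at each end a 21-residue sequence yields thirteen, so windows that would otherwise run off either end of the peptide are also scored.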