To carry out the prediction, users can either upload a PDB file or provide a PDB ID, in which case PrankWeb will download and store the corresponding PDB file from the PDB database (30). In addition to selecting which protein to analyze, users can also specify whether evolutionary conservation should be included in the prediction process, which in turn determines which of the two pre-trained models will be used.
Conservation scores are calculated using the Jensen-Shannon divergence method (31) from a multiple sequence alignment (MSA) file, which can come from three sources: (i) users can specify their own alignment file, (ii) if a protein’s PDB code is provided, PrankWeb uses the MSA from the HSSP (32) database or (iii) where no MSA is provided and no MSA is found in HSSP, the MSA is computed using PrankWeb’s own conservation pipeline, which utilizes UniProt (33), PSI-Blast (34), MUSCLE (35) and CD-HIT (36). This process is depicted in Figure
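To make the scoring step concrete, the following is a minimal sketch of a per-column Jensen-Shannon divergence score computed from an alignment. It is an illustration under stated assumptions, not PrankWeb's actual implementation: the uniform background distribution, the pseudocount, and the function names (`column_distribution`, `jensen_shannon_divergence`, `conservation_scores`) are all hypothetical, and refinements used by published scoring methods, such as sequence weighting, gap penalties, or window smoothing, are omitted.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(column, pseudocount=1e-6):
    """Estimate the amino-acid distribution of one alignment column.

    Gaps and unknown symbols are ignored; a small pseudocount (an
    assumption of this sketch) keeps every probability strictly positive.
    """
    residues = [c for c in column.upper() if c in AMINO_ACIDS]
    counts = Counter(residues)
    total = len(residues) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts.get(aa, 0) + pseudocount) / total for aa in AMINO_ACIDS}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions."""
    m = {k: 0.5 * (p[k] + q[k]) for k in p}

    def kl(a):
        # Kullback-Leibler divergence of `a` from the mixture `m`.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def conservation_scores(alignment, background=None):
    """Score every column of an MSA given as equal-length strings.

    The uniform background is an assumption made for simplicity.
    """
    if background is None:
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    return [
        jensen_shannon_divergence(column_distribution("".join(col)), background)
        for col in zip(*alignment)
    ]


if __name__ == "__main__":
    toy_msa = ["ACD", "ACE", "AGF"]  # hypothetical three-sequence alignment
    for position, score in enumerate(conservation_scores(toy_msa), start=1):
        print(position, round(score, 3))
```

With base-2 logarithms the score lies between 0 and 1, and a fully conserved column scores higher than a variable one, so the values can be compared directly across positions.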
Once the input is specified, the submitted data is sent via a REST API to the server, which then starts the prediction pipeline. The user is provided with a URL from which the progress of the prediction can be tracked and the results inspected once the process finishes.
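The submit-then-track workflow described here can be sketched as a simple polling loop. This is only an illustration of the general pattern: the status field `state`, its values (`finished`, `failed`), and the function name `wait_for_prediction` are hypothetical and are not taken from the PrankWeb API. The HTTP request is deliberately left to a caller-supplied `fetch_status` function so that no endpoint is assumed.

```python
import time
from typing import Callable, Dict


def wait_for_prediction(
    fetch_status: Callable[[], Dict],
    poll_seconds: float = 2.0,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll a task-status source until the prediction finishes or fails.

    `fetch_status` should return the decoded status document for the task,
    for example by requesting the tracking URL given to the user.
    The key "state" and its values are assumptions of this sketch.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("state") in ("finished", "failed"):
            return status
        sleep(poll_seconds)  # wait before asking the server again
    raise TimeoutError("prediction did not finish within the polling budget")


if __name__ == "__main__":
    # Simulated server responses stand in for real HTTP calls.
    responses = iter(
        [{"state": "queued"}, {"state": "running"}, {"state": "finished"}]
    )
    final = wait_for_prediction(lambda: next(responses), poll_seconds=0.0)
    print(final["state"])
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any particular HTTP library and makes it straightforward to exercise without a running server.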
On the results page, PrankWeb utilizes LiteMol for visualization of 3D structural information and Protael for sequence visualization. Figure
PrankWeb consists of a Java backend, a REST API and a TypeScript frontend. The backend is based on the WildFly (37) web server and the P2Rank application, while the frontend uses the Protael, LiteMol and Bootstrap.js libraries to provide an interactive user interface on top of the REST API. All source code is available under Apache License 2.0 at GitHub (