The Protein Homology/analogY Recognition Engine V 2.0 (Phyre2) tool was used for homology modeling of the PTB protein of interest [43]. The following two attributes were also considered as inclusion criteria when selecting the protein for homology modeling:
After applying the above-mentioned inclusion criteria, CNN1 was selected for 3D homology modeling. The target sequence of the Calponin1 (CNN1 isoform 2) protein was retrieved from the UniProt database (Entry No. P51911-2). The Phyre2 tool was used for homology modeling, which consisted of the following sub-algorithmic stages:
Stage 1—Gathering Homologous Sequences: The CNN1 isoform 2 sequence was scanned against the specially curated NR20 protein sequence database (no sequences with >20% mutual sequence identity) with HHblits [44]. The resulting Multiple Sequence Alignment (MSA) was used to predict the secondary structure with PSI-blast based secondary structure PREDiction (PSIPRED) [45], and the alignment and the secondary structure prediction were combined into a query Hidden Markov Model (HMM).
Stage 2—Fold Library Scanning: The query HMM was scanned against a database of HMMs [46] of proteins of known structure. The top-scoring alignments from this search were used to construct crude backbone-only models.
Stage 3—Loop Modeling: Indels in these models were corrected by loop modeling.
Stage 4—Side-chain Placement: Amino acid side chains were added to generate the final PHYRE2 models.
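The indels that Stage 3 corrects can be located directly from a query-template alignment. The following is a minimal sketch, not Phyre2's own code: it assumes a pairwise alignment supplied as two equal-length gapped strings with `-` as the gap character. The function name `find_indels` and the example sequences are hypothetical.

```python
def find_indels(query_aln, template_aln, gap="-"):
    """Locate indels in a pairwise query-template alignment.

    Returns a list of (kind, start, end) tuples, where start and end are
    0-based alignment columns and end is exclusive.
    kind is "insertion" when the query has residues opposite template gaps,
    and "deletion" when the template has residues opposite query gaps.
    """
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")

    indels = []
    current = None  # kind of the run being extended, or None outside a run
    start = 0
    for i, (q, t) in enumerate(zip(query_aln, template_aln)):
        if q != gap and t == gap:
            kind = "insertion"
        elif q == gap and t != gap:
            kind = "deletion"
        else:
            kind = None  # aligned residue pair, or a column with two gaps
        if kind != current:
            if current is not None:
                indels.append((current, start, i))
            current, start = kind, i
    if current is not None:
        indels.append((current, start, len(query_aln)))
    return indels


# Hypothetical alignment, for illustration only.
example = find_indels("ACDEFGH-KL", "AC--FGHIKL")
```

In this sketch an insertion marks query residues that have no template counterpart, so their backbone coordinates have to be built, while a deletion marks template residues absent from the query, so the chain has to be rejoined across the gap.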
The best homology models were selected from the top-ranked modeled structures (according to a similar template pattern). The alignment description of the templates, obtained through manual protein BLAST comparison and showing the highest percent identity at 100% confidence, was generated by the Phyre2 tool. The stereochemical quality of the 3D homology models was assessed with PROCHECK [47]. The residue properties of the constructed 3D model were also examined, together with its Ramachandran plot, which plots the backbone dihedral angles φ against ψ for the possible conformations of the amino acids in the protein structure [48]. The 3D protein models were further validated with the SAVES (Structure Validation Server: https://saves.mbi.ucla.edu/) web server [49] to identify probable structural errors and to obtain the z-score of the chosen model. The SMART-EMBL tool (http://smart.embl-heidelberg.de/) was used to identify confidently predicted domains, repeats, motifs and low-complexity regions (LCRs) in the protein. The SuperPose web server (http://superpose.wishartlab.com/) was used to calculate both the sequence alignment and the structural superposition of the template and the 3D homology model, using a modified quaternion eigenvalue approach, and to generate RMSD statistics for the superimposed molecules [50].
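The φ and ψ angles plotted in a Ramachandran plot are backbone dihedral angles, each defined by four consecutive atoms. As a minimal sketch (not how PROCHECK is implemented), the function below computes a signed dihedral angle in degrees from four 3D points. The helper names are illustrative assumptions, and coordinates are assumed to be plain (x, y, z) tuples. For residue i, φ is taken over C(i-1), N(i), Cα(i), C(i) and ψ over N(i), Cα(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond.

    A cis arrangement gives 0 and a trans arrangement gives +/-180.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = _dot(b2, _cross(n1, n2))
    x = math.sqrt(_dot(b2, b2)) * _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi_psi(c_prev, n, ca, c, n_next):
    """Return (phi, psi) in degrees for one residue from its backbone atoms."""
    return dihedral(c_prev, n, ca, c), dihedral(n, ca, c, n_next)
```

Evaluating `phi_psi` for every residue that has both neighbours yields the (φ, ψ) pairs that make up the points of a Ramachandran plot.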