Fifty-two crystal structures spanning the four FGFR isoforms were analyzed: 32 FGFR1, 9 FGFR2, 3 FGFR3, and 8 FGFR4. PDB entries with unresolved A-loops were excluded from the analysis because the A-loop was found to be an important part of the classification. Of the 208 kinase-domain residues conserved among all FGFR isoforms, 192 were used in the analysis because some regions (e.g., the P-loop and the kinase insert) were consistently unresolved in the crystal structures. Using these 192 residues, a distance matrix was calculated for all PDBs using the minimum residue distance (MRD). PCA was then performed on the MRDs of the 52 high-resolution crystal structures, and the structures were plotted along PC1 and PC2, which accounted for 66% and 7% of the total variance, respectively. The PDBs were then clustered on PC1 and PC2 using the DBSCAN clustering algorithm (59) as implemented in Scikit-learn (60).
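The projection and clustering step can be illustrated with a minimal sketch. It assumes that each structure contributes one symmetric residue-by-residue MRD matrix whose upper triangle serves as its feature vector; the synthetic two-state data, the helper names `flatten_mrd` and `project_and_cluster`, and the DBSCAN settings `eps=10.0` and `min_samples=3` are illustrative assumptions, since the parameters actually used are not stated here.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA


def flatten_mrd(mrd_matrices):
    """Convert (n_structures, n_res, n_res) symmetric MRD matrices into
    (n_structures, n_pairs) feature rows taken from the upper triangle."""
    n_res = mrd_matrices.shape[1]
    iu, ju = np.triu_indices(n_res, k=1)
    return mrd_matrices[:, iu, ju]


def project_and_cluster(features, eps, min_samples):
    """Project the features onto PC1/PC2, then cluster in that plane."""
    pca = PCA(n_components=2)
    pcs = pca.fit_transform(features)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(pcs)
    return pcs, pca.explained_variance_ratio_, labels


# Synthetic stand-in for 52 structures x 192 residues (not real PDB data).
rng = np.random.default_rng(0)
n_structures, n_res = 52, 192
state = np.array([0] * 30 + [1] * 22)  # two hypothetical conformational states

base = rng.uniform(3.0, 30.0, size=(n_res, n_res))
base = (base + base.T) / 2

# A block of residue pairs whose distance differs by 5 A between the states.
shift = np.zeros((n_res, n_res))
shift[:20, 20:60] = 5.0
shift = shift + shift.T

noise = rng.normal(0.0, 0.05, size=(n_structures, n_res, n_res))
noise = (noise + noise.transpose(0, 2, 1)) / 2

mrd = base + state[:, None, None] * shift + noise

# eps and min_samples are illustrative choices for this synthetic data only.
pcs, explained, labels = project_and_cluster(flatten_mrd(mrd), eps=10.0, min_samples=3)
print("variance explained by PC1, PC2:", explained)
print("cluster labels:", labels)
```

In practice, `eps` would need to be tuned to the spread of the real structures in the PC1/PC2 plane; DBSCAN assigns the label -1 to any structure it treats as noise.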
To determine the distances that account for the variance between clusters, a t-score was calculated as shown in Eq. 4, where $d_{ij}^{k}$ and $s_{ij}^{k}$ represent the average distance of residue pair $(i, j)$ and the standard deviation of that distance within cluster $k$, respectively.

$$\text{t-score} = \frac{d_{ij}^{1} - d_{ij}^{2}}{s_{ij}^{1} + s_{ij}^{2}} \tag{4}$$
The t-score was then scaled (scaled t-score) by the minimum of the distance pair ($d_{ij}$) to enhance the t-score of smaller distances.

$$\text{scaled t-score} = \frac{\text{t-score}}{\min(d_{ij})}$$
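As a worked illustration of the two scores, the sketch below computes them for a pair of toy clusters. Two choices are assumptions rather than details given in the text: the standard deviation is the sample standard deviation (`ddof=1`), and the minimum of each distance pair is taken over all structures in both clusters. The function name `t_scores` and the toy distances are hypothetical.

```python
import numpy as np


def t_scores(dist_1, dist_2):
    """Per-pair t-score and scaled t-score between two clusters.

    dist_1, dist_2: arrays of shape (n_structures_in_cluster, n_pairs)
    holding the minimum residue distance of every residue pair.
    """
    mean_1, mean_2 = dist_1.mean(axis=0), dist_2.mean(axis=0)
    # Assumption: sample standard deviation (ddof=1).
    std_1, std_2 = dist_1.std(axis=0, ddof=1), dist_2.std(axis=0, ddof=1)
    t = (mean_1 - mean_2) / (std_1 + std_2)
    # Assumption: min(d_ij) is the smallest value over both clusters.
    min_d = np.minimum(dist_1.min(axis=0), dist_2.min(axis=0))
    return t, t / min_d


# Toy data: 3 structures per cluster, 2 residue pairs (columns), in angstroms.
cluster_1 = np.array([[3.0, 8.0], [4.0, 8.5], [5.0, 9.0]])
cluster_2 = np.array([[9.0, 8.0], [10.0, 9.0], [11.0, 10.0]])

t, scaled_t = t_scores(cluster_1, cluster_2)
print("t-score:", t)
print("scaled t-score:", scaled_t)
```

The first pair is short in cluster 1 and long in cluster 2, so it receives a much larger scaled magnitude than the second pair, which is long in both clusters. A pair with zero spread in both clusters would produce a division by zero and would need separate handling.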
All distance pairs with a scaled t-score greater than 3.5 Å in magnitude and a minimum distance less than 4.5 Å were extracted. This yielded 43 distances formed in the active state and 45 distances formed in the autoinhibited state. Next, the distance pairs were filtered to retain only contacts unique to the active or the autoinhibited state. The first filter removed long-range distances, i.e., those with an average PDB distance greater than 5 Å. The second filter removed distances shared by the active and autoinhibited states, i.e., those whose average distances differed between the clusters by 1 Å or less. Last, distances were filtered based on the stability of the contact in MD simulations. The stability of the active and autoinhibited contacts was determined from MD simulations starting from 2PVF and from the FGFR2K homology model of 3KY2, respectively. Distances were removed if the percent contact formed was less than 25%, where a contact is defined as formed if the minimum residue heavy-atom distance is less than 4.5 Å. Active distances removed include L647-L665, L647-P666, R625-L665, and K658-D677. Autoinhibited distances removed include T660-L665, R664-S702, R630-T660, R664-E695, R573-N662, and R573-R664. This yielded 20 active contacts formed in the active state but disrupted in the autoinhibited state, and 22 autoinhibited contacts formed in the autoinhibited state but disrupted in the active state (SI Appendix, Table S2). These contacts represented ~0.5% of the total number of conserved residue pairs.
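The selection and filtering steps can be sketched as a sequence of boolean masks. This is only an illustration under stated assumptions: the "average PDB distance" is taken to be the cluster average in the state where the contact is formed, a pair is assigned to the state with the smaller average distance, and the MD contact fraction is read from the simulation of that state. The function name `filter_contacts` and all input values are hypothetical.

```python
import numpy as np


def filter_contacts(scaled_t, min_dist, mean_act, mean_auto,
                    md_frac_act, md_frac_auto,
                    t_cut=3.5, min_cut=4.5, avg_cut=5.0,
                    diff_cut=1.0, md_cut=0.25):
    """Return boolean masks (active, autoinhibited) of retained contacts."""
    # Initial extraction: large scaled t-score and a short minimum distance.
    keep = (np.abs(scaled_t) > t_cut) & (min_dist < min_cut)

    # Assumption: a pair belongs to the state with the smaller average distance.
    is_active = mean_act < mean_auto
    formed_mean = np.where(is_active, mean_act, mean_auto)

    # Filter 1: drop long-range pairs (average distance above 5 A).
    keep &= formed_mean <= avg_cut
    # Filter 2: drop pairs shared by both states (averages within 1 A).
    keep &= np.abs(mean_act - mean_auto) > diff_cut
    # Filter 3: drop pairs formed in less than 25% of the MD simulation.
    md_frac = np.where(is_active, md_frac_act, md_frac_auto)
    keep &= md_frac >= md_cut

    return keep & is_active, keep & ~is_active


# Hypothetical values for six residue pairs (distances in angstroms).
scaled_t = np.array([-5.0, 4.0, -2.0, -4.0, 4.0, -6.0])
min_dist = np.array([3.0, 3.2, 3.0, 4.0, 3.0, 3.0])
mean_act = np.array([3.5, 8.0, 3.5, 5.5, 4.6, 3.4])
mean_auto = np.array([9.0, 3.8, 9.0, 9.0, 3.9, 8.0])
md_frac_act = np.array([0.9, 0.0, 0.9, 0.9, 0.5, 0.1])
md_frac_auto = np.array([0.0, 0.6, 0.0, 0.0, 0.5, 0.0])

active, autoinhibited = filter_contacts(
    scaled_t, min_dist, mean_act, mean_auto, md_frac_act, md_frac_auto
)
print("active contacts kept:", np.flatnonzero(active))
print("autoinhibited contacts kept:", np.flatnonzero(autoinhibited))
```

Keeping each criterion as its own mask makes it straightforward to report how many pairs each filter removes, which mirrors the stepwise counts given in the text.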