To maximize transferability of the parameters, multidimensional structure scans were employed to generate conformational diversity. For smaller side chains, grid scans in dihedral space were used to generate side chain variety, including both α and β backbone conformations for each side chain rotamer. Grid scans were generated for Val in one dimension, as it only has χ1, at an interval of 10°. Grids were generated for Asp, Asn, Cys, Phe, His (δ-, ɛ-, and doubly-protonated), Ile, Leu, Ser, Thr, and Trp in two dimensions, as they have χ1 and χ2, at intervals of 20°, yielding 324 structures per amino acid.
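The grid sizes quoted above follow directly from the spacing: 360°/20° gives 18 values per dihedral, and 18 × 18 = 324 grid points for two dihedrals. A minimal sketch of such an enumeration is shown below. The function name `dihedral_grid` and the choice to start each axis at −180° are illustrative assumptions, not details taken from the scan protocol itself.

```python
from itertools import product


def dihedral_grid(n_dihedrals, step_deg):
    """Enumerate every grid point for n_dihedrals angles spaced step_deg apart."""
    if 360 % step_deg != 0:
        raise ValueError("step_deg must divide 360 evenly")
    # Assumption: each axis starts at -180 and stops before +180, so the
    # periodic endpoint is counted only once.
    axis = range(-180, 180, step_deg)
    return list(product(axis, repeat=n_dihedrals))


val_grid = dihedral_grid(1, 10)      # chi1 only, 10 degree spacing
two_chi_grid = dihedral_grid(2, 20)  # chi1 and chi2, 20 degree spacing
print(len(val_grid), len(two_chi_grid))  # 36 324
```

Each tuple would then serve as the set of restraint targets for one structure in the scan.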
We were unable to exhaustively explore conformational space for side chains with more than two rotatable bonds. Tyrosine has three rotatable χ bonds, but its dihedral space is reduced because a 180° rotation of either the phenol (χ2) or the hydroxyl produces the same effect when accounting for the symmetry of the ring. We therefore fully scanned each tyrosine dihedral while the other two were held at stable rotamers, defined as any value of that dihedral found in the rotamer library for this amino acid, rounded to the nearest 10°, with χ2 limited to (−90°, 90°] to account for symmetry. Stable rotamers for the hydroxyl, which is not in the rotamer library, were inferred from the QM energy profiles discussed above. Stable rotamers were 180° or ±60° for χ1, ±30° or 90° for χ2, and 0° or 180° for the hydroxyl. Conformations were generated using a full scan of each dihedral (at 20° increments), repeated for every combination of stable rotamer values for the other two dihedrals. As protonated aspartate has nearly the same dihedrals as Tyr (χ1, χ2, and hydroxyl), it was scanned in the same manner, but without the χ2 restriction because aspartate does not have the same symmetry properties.
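The tyrosine procedure amounts to a set of one-dimensional scans through every combination of stable rotamers. The sketch below enumerates the resulting conformations under two assumptions that the text does not specify: the 20° grid is offset so that the stable rotamer values fall on grid points, and the scanned χ2 axis covers (−90°, 90°] as −70°, −50°, ..., 90°. The names `SCAN`, `STABLE`, and `line_scans` are hypothetical.

```python
from itertools import product

# Assumed grid origins: 18 values covering (-180, 180] and 9 values
# covering (-90, 90], both at 20 degree spacing.
FULL = tuple(range(-160, 181, 20))
HALF = tuple(range(-70, 91, 20))

ORDER = ("chi1", "chi2", "oh")
SCAN = {"chi1": FULL, "chi2": HALF, "oh": FULL}
# Stable rotamer values quoted in the text (degrees).
STABLE = {"chi1": (180, 60, -60), "chi2": (-30, 30, 90), "oh": (0, 180)}


def line_scans(scan=SCAN, stable=STABLE, order=ORDER):
    """Scan one dihedral fully while the others sit at stable rotamers.

    Returns the unique (chi1, chi2, oh) tuples from all such scans.
    """
    conformations = set()
    for scanned in order:
        axes = [scan[name] if name == scanned else stable[name]
                for name in order]
        conformations.update(product(*axes))
    return sorted(conformations)


tyr_conformations = line_scans()
```

A set is used because points where all three dihedrals sit at stable rotamers are visited by more than one scan. For protonated aspartate, the same function could be called with the full range for χ2, given an appropriate set of stable rotamers.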
Cysteine presents a special case, as it can form disulfide bonds that bridge two amino acids. In addition to developing parameters for reduced Cys (no disulfide), a pair of Cys dipeptides with a disulfide bond was employed to scan the S-S energy profile. However, a disulfide between CysA and CysB has a total of five dihedrals: χ1A, χ2A, χSS, χ2B, and χ1B. As full sampling across five dihedrals is clearly intractable, conformational space was reduced by applying the same χ1 / χ2 values to both dipeptides. Using this symmetry, a two-dimensional scan was performed for all χ1 / χ2 combinations using 20° spacing; this scan was repeated with χSS restrained to 180°, ±60°, or ±90° (five 2D scans). Separately, the χSS profile was scanned with 20° spacing using χ1 of 180° or ±60° and χ2 of 180° or ±60° (nine 1D scans total). As with the other amino acids, the entire procedure was repeated with the backbone in α and β conformations; here, both dipeptides adopted the same backbone conformation.
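The symmetry reduction can be made concrete by noting that tying χ1A to χ1B and χ2A to χ2B leaves only three free variables (χ1, χ2, χSS). The following sketch lists the targets for the five 2D scans and nine 1D scans for a single backbone conformation; the grid origin and the names `AXIS`, `conf`, and `disulfide_scans` are assumptions made for illustration.

```python
from itertools import product

AXIS = tuple(range(-160, 181, 20))        # assumed origin; 18 values
CHI_SS_FIXED = (180, 60, -60, 90, -90)    # chiSS values for the 2D scans
CHI_STABLE = (180, 60, -60)               # chi1 / chi2 values for 1D scans


def conf(chi1, chi2, chi_ss):
    """Apply the same chi1 / chi2 to both Cys residues of the pair."""
    return {"chi1A": chi1, "chi2A": chi2, "chiSS": chi_ss,
            "chi2B": chi2, "chi1B": chi1}


def disulfide_scans():
    """Return (2D scan points, 1D scan points) as lists of dihedral targets."""
    two_d = [conf(c1, c2, ss)
             for ss in CHI_SS_FIXED
             for c1, c2 in product(AXIS, repeat=2)]
    one_d = [conf(c1, c2, ss)
             for c1, c2 in product(CHI_STABLE, repeat=2)
             for ss in AXIS]
    return two_d, one_d


two_d, one_d = disulfide_scans()
print(len(two_d), len(one_d))  # 1620 162
```

Under these assumptions the five 2D scans contribute 5 × 324 points and the nine 1D scans contribute 9 × 18 points, before any removal of duplicates.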
The remaining side chains, Arg+, Gln, Glu (protonated), Glu, Lys+, and Met, have at least three side chain dihedrals (Table S1). Rather than performing a grid search, MD simulations were used to generate diverse conformations of these side chains. Each dipeptide was simulated twice, with α or β backbone restraints, for 100 ns each. To overcome kinetic traps, these simulations were performed at 500 K and the dielectric was set to 4r. Next, a diverse subset was generated by mapping each conformation to a multidimensional grid spaced 10° in each χ. The five lowest energy conformations at each grid point were saved. From each simulation grid, five hundred structures were randomly selected (comparable to the number generated by the grid procedure described above for Tyr). Because the longer, more flexible side chains of these amino acids can adopt conformations with strong interactions between backbone and side chain, conformations for which we suspected that the in vacuo MM description might produce fitting artifacts were excluded, using electrostatic and distance cutoffs defined in the Supporting Information.
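The subset selection described above (bin by χ on a 10° grid, keep the five lowest energy members per bin, then draw five hundred at random) might be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the input format of `(chis, energy)` pairs, the function names, the bin origin at −180°, and the fixed random seed are all hypothetical, and the electrostatic and distance exclusion filters are not implemented here.

```python
import heapq
import random
from collections import defaultdict


def bin_index(chis, spacing=10.0):
    """Map chi angles (degrees) to integer grid indices, wrapping at 360."""
    n_bins = round(360.0 / spacing)
    return tuple(int(((chi + 180.0) % 360.0) // spacing) % n_bins
                 for chi in chis)


def select_diverse(frames, per_bin=5, n_select=500, spacing=10.0, seed=0):
    """Pick a diverse subset from an iterable of (chis, energy) pairs."""
    bins = defaultdict(list)
    for chis, energy in frames:
        bins[bin_index(chis, spacing)].append((energy, tuple(chis)))

    # Keep the lowest-energy members of every occupied grid point.
    pool = []
    for members in bins.values():
        pool.extend(heapq.nsmallest(per_bin, members))

    # Assumption: if fewer than n_select structures survive, keep them all.
    if len(pool) <= n_select:
        return pool
    return random.Random(seed).sample(pool, n_select)
```

The function would be applied once per simulation, so the α and β trajectories of each dipeptide each yield their own subset. Binning before the random draw keeps sparsely visited regions of χ space from being swamped by the most populated rotamers.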