The current study focuses on obtaining parameters of phosphates and phosphate derivatives (test molecules are shown in 1). Methyl and p-tolyl phosphate served as surrogates for the biologically important phosphoserine and phosphotyrosine residues, respectively. The diesters serve as model compounds for the phosphate groups in DNA and RNA. The parameters for all-atom simulations of the phosphoric acid esters were determined and refined using a thermodynamic cycle (2), sometimes called a “Pearson cycle”.16 Eight energy terms contribute : (1) the gas phase basicities; (2) the energy that arises (if needed) from changing units from standard concentration to the units of liquid state measurements; (3) and (4) are the polarization energies arising from changing the charge distribution (in the gas phase) to values used in the force field, which are aimed at being appropriate to the condensed phase17 –19 ; (5) and (6) are the solvation free energies that can be obtained from fixed charge free-energy simulations (we use thermodynamic integration here); (7) is the standard proton free energy of hydration; and (8) is the free energy calculated from the experimental pKa value. Details of how the various terms were estimated are given next.

The gas phase basicities of the model compounds were computed using the Gaussian03 software package.20 For each structure of the model compounds studied the optimal gas phase geometry was obtained using density functional theory (DFT) approximating the exchange-correlation energy functional with the hybrid functional (B3LYP) at an aug-cc-pVTZ basis set level. This pre-optimized structure was used as input geometry to compute gas phase basicities. Molecular energies were computed using Gaussian-3 (G3) theory.21 The general equation of basicity is given by ΔGBasicity=ΔG(G3)+GH+RTΔlnmn, namely the sum of the free energy difference for the acid and base, the proton free energy and a multiplicity term that stems from the indistinguishable microscopic protonation states possible when placing r protons at s possible sites of each test compound. Each number (n,m) can be computed from the binomial coefficient, i.e. n=s!(rs)!r!. The symbol “Δ” denotes the difference between the acid and its conjugate base. The proton free energy in the gas-phase is given by the sum of ideal enthalpy (52RT) including the pV term and the proton entropy S=26.02 kcal/mol K at 298 K, obtained from the Sackur-Tetrode equation,22 leading to a proton free energy GH+ = –6.28 kcal/mol.23 (link)

Gas-phase polarization energies (3) and (4) were obtained using Gaussian03 and a polarizable continuum solvent model, using B3LYP, the aug-cc-pVTZ basis set and an external dielectric of 78.4, representing aqueous solution. Standard additional parameters of ε = 1.78 for the fast-response dielectric constant of water and Rsolv = 1.385 Å for the probe radius have been used. The total energy of a compound was computed with the PCM term, additionally the energy was calculated without PCM but using the fixed orbital coefficients of the previous calculation. Comparing the second energy to the result of a pure vacuum calculation provides an estimate of the polarization energy a compound builds up when its charge density adapts to an aqueous environment. This procedure employs equation 2 of Ref.18 (link) instead of the tensorial approach described there, so polarization energies were calculated here as EPol.=ESCF[ρPCM]ESCF[ρgas] where ρPCM is the electron density obtained from a calculation including the PCM solvation model and ρgas is the gas phase electron density. ESCF is the total energy of the compound calculated for a given density but without using the PCM model. Since the second term in Eq. 1 is always more negative than the first (since ρgas is the minimum energy density), a polarization energy is a positive energy contribution. We believe this approach to be a more robust calculation than the procedure of Swope, Horn and Rice18 (link) for the case of these small highly symmetrical compounds, since multipoles higher than quadrupoles may be required for a good description of their electrostatic properties. We note that the RESP charges used in our study are not guaranteed to match the charge distribution from PCM calculations which may degrade the quality of the polarization energy estimates. However, only differences of polarization energies enter the thermodynamic cycle calculations and these differences are small compared to the solvation energies of ions under consideration here. For all of the systems in 3 and 4, with the exception of of the dianion of the phosphotyrosine model, the differences in polarization energies are less than 4.5 kcal/mol. Note that the polarization term for the p-tolyl phosphate is significantly higher than for the other compounds. This can be understood in terms of the high resonance stability an aromatic ring substituent provides to a phosphate anion, indicating that the charge distribution is easiest to shift and therefore polarization plays a larger role for this one compound. Using different estimates for this term, or even ignoring it entirely, would not lead to significant changes in the resulting target values. Still, the optimized Lennard-Jones parameters would change somewhat, since they are strongly affected by small changes in energy. We have repeated the analysis presented in 4, 3 and 4 below without a polarization correction in the target values and obtained radii changes that were consistently higher, by about 0.07 Å. Clearly, a representation of polarization effects is necessary for accurate results here.

Solvation free energies (5) and (6) were obtained from all-atom simulations using the AM-BER10 molecular modeling suite10 using thermodynamic integration,24 (link) as follows. For MD simulations, charges for each molecule were taken from RESP fitting calculations.12 (link) We term the parameter set with these charges and all other parameters (like van-der-Waals radii) taken from the AMBER ff99 set the ”standard parameter“ set. The test molecules were solvated in an octahedal box of approximately 750 TIP3P water molecules so that 12.0 Å distance or more lay between every solute atom and the simulation box boundary. The systems were heated to a temperature of 300 K over 5 ps and equilibrated for 1 ns. Bond lengths including hydrogen atoms were constrained using the SHAKE algorithm.25 The electrostatic potential was evaluated using the Particle mesh Ewald (PME) method.26 ,27 The cutoff between short- and long-range interactions was taken to be 9.0 Å. The length of the time step was 2 fs. Free energies were evaluated using thermodynamic integration with linear (trapezoidal) interpolation of the free energy curve between 9 λ-values (0.1, 0.2, ..., 0.9), with a total simulation length of 3 ns per window, collecting data from the last 2 ns only. The total free energy for each compound was computed from three consecutive TI transformation steps, first a removal of all partial charges in solution, then removal of solute-solvent vdW interactions followed by reintroducing all partial charges in vacuum. Such a breakdown of TI calculations into individual substeps is common and helps overcome simulation instabilities. Error estimation for the resulting TI solvation free energies was conducted as in previous work,28 (link) by computing the ∂V/∂λ-autocorrelation time τ to estimate the standard error of the mean from the standard deviation σ∂V/∂λ for each simulation window as σSEM = σ∂V/∂λ/√tsim/2τ and combining individual λ-window results via Gaussian error propagation.

Values in the range of –249.5 to –264.0 kcal/mol based on various theoretical and experimental data have been suggested for the hydration free energy of the proton, (7).29 –31 (link) Basically, relative solvation free energies of cations (and of anions) can be determined from experimental measurements, but the absolute scale of cation values relative to anions requires some extra-thermodynamic assumptions. The situation is further complicated by the fact that in real systems, the reversible work required to move an ion from solution to the gas phase includes a contribution from the “phase potential” associated with the vacuum liquid interface.31 (link) The thermodynamic integration calculations used here correspond to an “intrinsic” or “absolute” transfer, which is independent of any phase potential. The cycle in 2 can be consistent with this if an “intrinsic” value is chosen for energy (7). Here, we use the result from polarizable force field ion simulations of Grossfield et al.,30 (link) –252.5 kcal/mol. This result is within 0.5 kcal/mol of the consensus value determined by Tissandier et al. , –252 kcal/mol (based on an estimated phase potential contribution), and is close to the value of -250 kcal/mol used for the current AMBER parameterization for alkali halide anions.32 (link) Lamoureux and Roux31 (link) have carried out polarizable ion simulations similar in spirit to those of Grossfield et al., arriving at a value of –247 kcal/mol. With this reference, cations are less favorably solvated, by about 5 kcal/mol, than with our choice, and anions such as phosphates would have solvation free energies that are more negative by the same amount. This change would have an effect on the optimized radii that can be estimated from 4 below. Our choice of –252.5 kcal/mol for energy (7) has the advantage of being consistent with the way other parts of the AMBER force fields have been developed.

With this thermodynamic cycle, we generated a refined parameter set by changing the phosphate oxygen van der Waals radii. Other refinement schemes are possible as well, such as charge redistribution within the phosphate group and changes to the hydrogen and phosphorus van der Waals parameters. Several such approaches were studied in preliminary work but did not yield closed thermodynamic cycles for chemically reasonable parameter values (data not shown). We introduce three new AMBER atom types for our test molecules, for the different types of oxygen atoms in phosphate groups, ’OP’ for a deprotonated phosphate oxygen, ’OQ’ for a protonated phosphate oxygen (a P-OH group) and ’OR’ for a bridging oxygen (e.g. a P-O-Me group). The types correspond to the ’O2’, ’OH’ and ’OS’ types in the AMBER forcefield. The different oxygen radii are summarized in 1.