Geometry optimization and MEP computation were performed using the Gaussian (versions 98 and 03),76 , 77 GAMESS-US [versions 24 Mar 2007 (R3) and January 2009 (R1)],54 and the PC-GAMESS/Firefly (versions 7.1) programs,55 on a 1.67 GHz SGI Altix running the SUSE Linux Enterprise Server 10 operating system, an IBM RS6000 based cluster (AIX 5.2), R5000 and R12000 SGI workstations (IRIX 6.5.22), and/or PC Linux based workstations (Fedora 6.0, 8.0 and CentOS 5.2). RESP and ESP charge fitting was carried out using the RESP program.28 The latter program was modified and recompiled to slightly increase the charge accuracy as well as the maximum number of charges, Lagrange constraints and molecules allowed during the fitting step (the convergence criteria “qtol”, the maximum number of charge values “maxq”, the maximum number of lagrange constraints “maxlgr” and the maximum number of molecules “maxmol” were adjusted to 1.0d-5, 5000, 500 and 200, respectively). The HF method and the 6-31G* basis set were used to optimize molecular geometries.33 -35 MEP were computed based on two different approaches: using either (i) the HF/6-31G* theory level in the gas phase,28 , 29 or (ii) the density functional theory (DFT) method, the B3LYP exchange and correlation functionals, the IEFPCM continuum solvent model (ε = 4) to mimic organic solvent environment, and the cc-pVTZ basis set.48 -50 (
link) The HF/STO-3G theory level.34 , 36 , 37 was also tested to calculate MEP since it was used in ESP charge derivation for the Weiner
et al. force field.38 , 39 Both the CHELPG and Connolly surface algorithms used in MEP calculation were considered in this work.10 , 15 , 16 Charge derivation and building force field library reported here were carried out by the R.E.D. Tools. Initial structures were constructed using the LEaP or InsightII program.74 , 78 The corresponding optimized geometries and charge values were displayed using the LEaP or VMD program.74 , 79
More than fifty molecular systems have been considered in this work in order to demonstrate the different capabilities of the R.E.D. Tools. Considering the large amount of data generated only few characteristic results will be presented below. The entire set of data is summarized in the
Table S4 of the supplementary material, and is available in R.E.DD.B. It includes well-studied structures for which atomic charge values are known allowing for comparisons with published data and creating a benchmark. Several new molecular systems are also reported. The first group of studied structures includes organic molecules such as ethanol (
anti and
gauche+ conformations),29 , 43 , 47 (
link), 80 , 81 dimethylsulfoxide,81 -83 dimethylphosphate (
gauche+,
gauche+ conformation),40 , 59 trifluoroethanol (
anti and
gauche+ conformations),84 -86 (
link) methoxyethane (
anti and
gauche+ conformations),40 , 43 , 47 (
link), 80 , 87
N-methylacetamide (
cis and
trans conformations),28 , 29 , 40 , 43 , 80 , 87 , 88 1-4-dioxane (
chair and
twist-boat conformations),43 , 89 , 90 ethane-1,2-diol (
anti anti anti, anti gauche+
anti, gauche+
anti gauche-, gauche+
gauche- gauche+ and
gauche+
gauche+
gauche+ conformations),29 , 47 (
link), 80 methanol,25 , 28 , 29 , 40 , 47 (
link), 80 propanone, ethanoic acid,43 , 80 acetonitrile,25 formamide,25 , 87 methanal,87 furane,87 pyrrole, benzene,40 , 80 toluene,80 chloroforme,81 cyclohexane (
chair and
twist-boat conformations).43 , 80 , 90 These molecules were involved in explicit solvent MD simulations and/or force field development in the past. The second group of structures studied consists of bio-molecules such as alanine dipeptide (
C5, C7ax and
C7eq conformations),21 , 25 , 40 , 80 , 91 as well as standard deoxyribonucleosides (
i. e. deoxyadenosine, deoxycytidine, deoxyguanosine and thymidine in the
C2’-endo and
C3’-endo conformations) and ribonucleosides (
i. e. adenosine, cytidine, guanosine and uridine in the
C3’-endo conformation).40 , 42 (
link) Finally, following the strategy proposed by Cieplak
et al.,59 charge derivation and force field library building were carried out for various molecular fragments of unusual amino acids as well as for standard nucleic acid nucleotides. The central, H3N(+)-terminal, (-)O2C-terminal molecular fragments (as well as terminal neutral fragments) of alpha-aminoisobutyric acid92 (
link), 93 (
link) and
O-methyl-
L-tyrosine residues94 (
link) were generated using the corresponding
N-acetyl
N’-methylamide amino acid (with φ, ψ dihedral angles characteristic for the α-helix and/or β-sheet secondary structures), methylammonium and acetate. The central,
5’-terminal and
3’-terminal fragments of standard nucleic acid nucleotides for DNA and RNA were obtained using dimethylphosphate (
gauche+,
gauche+ conformation), the four deoxyribonucleosides (
C2’-endo and
C3’-endo conformations) and/or four ribonucleosides (
C3’-endo conformation).
In the following section of the article we will discuss reproducibility of the RESP or ESP charge models. We will first compare the charge values of single conformation molecular systems determined by the same QM program,
i. e. using either Gaussian or GAMESS-US. For every molecule, geometry optimization was performed using four different sets of initial Cartesian coordinates selected randomly. Computation of MEP and derivation of charge values were carried out using each optimized Cartesian coordinate set. We will compare results obtained using the Gaussian and GAMESS-US programs. In this context, the role of
ab initio threshold criteria during geometry optimization, and the impact of different optimized geometry re-orientation procedures, available in both programs, on the charge values will be addressed. Finally, we will discuss a rigid-body re-orientation algorithm based on the selection of any three atoms, which has been implemented in the R.E.D. source code to provide a general method for reorienting optimized geometries before MEP computation. This approach is independent of the QM program used for calculations. According to this strategy, the first selected atom is translated to the origin of axes, the first two atoms define the (O, X) axis while the third one is used to define the (O, X, Y) plane. The (O, Z) axis is automatically set as the cross-product between the (O, X) and (O, Y) axes.71 , 72 This approach can be used for every optimized molecular geometry, and is the basis for multiple orientation charge fitting.
In the last section, we will demonstrate how multiple orientation and multiple conformation can be combined together during charge derivation. The R.E.D. program provides easy setup for handling MEP computation (using either the Connolly surface or the CHELPG algorithm), single ESP stage, single RESP stage as well as two RESP stage fitting, which makes it an efficient tool for comparing various charge models. In addition, the introduction of intra-molecular charge constraint(s) during charge fit extends the number of charge models and allows building force field libraries of molecular fragments in a similar way as it is done for the central fragment of an amino acid employed for building polypeptide chains.59 Examples of charge derivation involving multiple orientations, multiple conformations and multiple molecules will be then described. Including more than one molecule in charge derivation and introducing inter-molecular charge constraints and inter-molecular charge equivalencing during the charge fitting allows determining atomic charges for a large variety of molecular fragments.47 (
link), 59 Thus, inter-molecular charge constraints can be used for defining molecular fusion between two molecules by eliminating groups of atoms with zero sum of charge. This approach is applied in the process of an automatic generation of the force field libraries of molecular fragments, from which larger systems can be built, and is a standard strategy for creating libraries of the central and terminal amino acid and nucleotide fragments. By analogy, this method can be directly extended to other biomolecular systems such as oligosaccharides, glycoconjugates as well as bio-inorganic complexes. In addition to the above features, R.E.D. is capable of generating all-atom or united-carbon atom charge models, and create appropriate force field libraries, which can be readily used for validation in MD simulations.38 , 40 , 59 , 73 (
link) The simultaneous formation of an ensemble of force field libraries for a family of structures or FFTopDB is then presented and discussed using standard nucleic acids as an example.
Dupradeau F.Y., Pigache A., Zaffran T., Savineau C., Lelong R., Grivel N., Lelong D., Rosanski W, & Cieplak P. (2010). The R.E.D. Tools: Advances in RESP and ESP charge derivation and force field library building. Physical chemistry chemical physics : PCCP, 12(28), 7821-7839.