Input PDB files can contain numerous errors and format inconsistencies, such as missing heavy atoms, suboptimal residue conformations and non-standard atom names. H++ attempts to make automatic, albeit conservative, corrections for many of these problems when possible; otherwise, the errors are identified for possible manual correction. For example, the N and O atoms in the amide groups of ASN and GLN, and the N and C atoms in the imidazole ring of HIS, cannot be easily distinguished from electron density maps. Thus, the assignment of these atoms in the PDB file may (optionally) be ‘flipped’ using the reduce algorithm, which is based on an analysis of van der Waals contacts and H-bonding (26). One example of an error that is identified for manual correction is missing residues in the middle of a protein chain. Input PDB files may also contain HETATM entries for solvent and ligand molecules, which H++ removes by default. Solvent molecules are removed because they are treated implicitly by the continuum solvent methodology used. Non-protein ligands are also removed by default, but an option is now available to manually include many ligands and specific buried water molecules for processing, as described on the H++ site. Inclusion of buried waters has been shown to improve the accuracy of the computed pK of nearby groups (27). For peptide, protein, DNA and RNA ligands, current AMBER force field parameters are used to add H atoms and assign atomic partial charges. For other organic ligands, H++ uses OpenBabel (28) to add H atoms, and atomic partial charges are assigned using the antechamber module from AmberTools (29) and the generalized AMBER force field (GAFF) parameter set. PDB structures may also contain residues with partial occupancy, representing multiple possible conformations. Without manual intervention from the user, H++ selects the ‘A’ conformation and ignores all others.
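The default handling of HETATM entries and alternate conformations described above can be illustrated with a minimal Python sketch. This is not the H++ implementation; the function name `filter_pdb_lines` and its `keep_altloc` parameter are hypothetical, and the sketch assumes standard fixed-column PDB records in which the alternate location indicator occupies column 17.

```python
def filter_pdb_lines(lines, keep_altloc="A"):
    """Drop HETATM records and keep only one alternate conformation.

    Illustrative only: mimics the default behaviour described in the text
    (remove solvent/ligand HETATM entries, keep the 'A' conformation).
    """
    kept = []
    for line in lines:
        record = line[:6].strip()
        if record == "HETATM":
            # Solvent and ligand entries are removed by default.
            continue
        if record == "ATOM" and len(line) > 16:
            altloc = line[16]  # column 17 in 1-based PDB numbering
            if altloc not in (" ", keep_altloc):
                # Skip every alternate conformation except the chosen one.
                continue
            # Blank the indicator so the output has a single conformation.
            line = line[:16] + " " + line[17:]
        kept.append(line)
    return kept
```

A typical use would be to read the file with `open(path).readlines()`, pass the list to `filter_pdb_lines`, and write the result to a new file, so that the original structure is left untouched for manual inspection.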
An input PQR file, on the other hand, is assumed to have already been validated (e.g. in order to compute the atomic charges and radii included in the PQR file). Therefore, most error and consistency checks are bypassed for input PQR files. In addition, H++ requires AMBER-compatible atom and residue names in the input PQR file.
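Because most checks are bypassed for PQR input, a user may wish to inspect the file before submission. The following Python sketch is one possible pre-check and is not part of H++; the function names `read_pqr_atoms` and `net_charge` are hypothetical. It assumes a whitespace-delimited PQR file in which the last five fields of each ATOM or HETATM record are x, y, z, charge and radius, and the third and fourth fields are the atom and residue names.

```python
def read_pqr_atoms(lines):
    """Parse ATOM/HETATM records from a whitespace-delimited PQR file."""
    atoms = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] not in ("ATOM", "HETATM"):
            continue
        if len(fields) < 10:
            raise ValueError(f"Too few fields in PQR record: {line!r}")
        # Reading from the right tolerates an optional chain identifier.
        x, y, z, charge, radius = (float(v) for v in fields[-5:])
        atoms.append({
            "atom": fields[2],
            "residue": fields[3],
            "xyz": (x, y, z),
            "charge": charge,
            "radius": radius,
        })
    return atoms


def net_charge(atoms):
    """Sum the atomic partial charges of all parsed atoms."""
    return sum(a["charge"] for a in atoms)
```

A net charge that is far from an integer, or residue names that look unfamiliar when listed with `sorted({a["residue"] for a in atoms})`, would suggest that the file deserves a closer look before it is uploaded.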