Calculations. Both S&R and L&R are straightforward to implement. Both first determine which atoms are in contact and then calculate the overlap between each atom and its neighbors. Contacts are found using cell lists, which makes the contact calculation an O(N) operation. Both algorithms then treat each atom independently, so the second part of the calculation is also O(N). In addition, this second part is trivially parallelizable.
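The contact search can be illustrated with a minimal cell-list sketch. This is not the FreeSASA implementation; the function name `find_contacts`, the choice of cell size (twice the largest probe-extended radius) and the default probe radius of 1.4 Å are assumptions made for the example.

```python
import math
from collections import defaultdict
from itertools import product


def find_contacts(coords, radii, probe=1.4):
    """Return one list of neighbor indices per atom, found via a cell list.

    Two atoms are in contact when their probe-extended spheres overlap.
    """
    if not coords:
        return []
    ext = [r + probe for r in radii]
    # Assumed cell size: atoms in contact are always closer than this,
    # so only the 27 surrounding cells need to be searched.
    cell = 2.0 * max(ext)

    grid = defaultdict(list)
    for i, (x, y, z) in enumerate(coords):
        key = (math.floor(x / cell), math.floor(y / cell), math.floor(z / cell))
        grid[key].append(i)

    neighbors = [[] for _ in coords]
    offsets = list(product((-1, 0, 1), repeat=3))
    for (cx, cy, cz), members in grid.items():
        for dx, dy, dz in offsets:
            # .get() avoids creating empty cells while iterating over the grid
            other = grid.get((cx + dx, cy + dy, cz + dz))
            if not other:
                continue
            for i in members:
                for j in other:
                    # i < j makes sure every pair is tested exactly once
                    if i < j and math.dist(coords[i], coords[j]) < ext[i] + ext[j]:
                        neighbors[i].append(j)
                        neighbors[j].append(i)
    return neighbors
```

For a roughly uniform atom density, each cell holds a bounded number of atoms, so the total work grows linearly with N.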
For L&R, each atom is sliced individually instead of slicing the whole protein in one go. The L&R calculation is thus parameterized by the number of slices per atom, which means that small atoms have thinner slices than large atoms.
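The effect of per-atom slicing can be sketched as follows. The helper names `atom_slices` and `free_atom_area`, and the placement of each slice at the midpoint of its slab, are assumptions for illustration and not necessarily what FreeSASA does. The radius is taken to be the probe-extended radius. The area term per slice, R/r · δ · L for an exposed arc length L on a circle of radius r, is the standard Lee & Richards contribution.

```python
import math


def atom_slices(radius, n_slices):
    """Slice one atom into n_slices slabs of equal thickness.

    Returns the slab thickness and a list of (z, circle_radius) pairs,
    with z measured from the atom center at the middle of each slab.
    """
    delta = 2.0 * radius / n_slices
    slices = []
    for k in range(n_slices):
        z = -radius + (k + 0.5) * delta
        slices.append((z, math.sqrt(radius * radius - z * z)))
    return delta, slices


def free_atom_area(radius, n_slices):
    """L&R-style area sum for an isolated atom (every arc fully exposed)."""
    delta, slices = atom_slices(radius, n_slices)
    total = 0.0
    for _z, r_circle in slices:
        exposed_arc = 2.0 * math.pi * r_circle  # no neighbors bury any arc
        total += radius / r_circle * delta * exposed_arc
    return total
```

With a fixed number of slices, `atom_slices(1.0, 20)` gives a thickness of 0.1 while `atom_slices(2.0, 20)` gives 0.2, and for an isolated atom the sum reproduces the full sphere area 4πR².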
The Fibonacci spiral gives a good approximation to an even distribution of points on the sphere (Swinbank & Purser, 2006), allowing efficient generation of an arbitrary number of S&R test points. The cell lists provide the first of the two lattices in the double cubic lattice optimization for this algorithm (Eisenhaber et al., 1995); the second lattice (for the test points) is not currently implemented in FreeSASA.
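A minimal sketch of S&R with spiral test points is given here. It uses the common golden-angle form of the Fibonacci spiral, which may differ in detail from the construction of Swinbank & Purser, and it checks every test point against all other atoms by brute force rather than through cell lists. The function names and the default probe radius of 1.4 Å are assumptions for the example.

```python
import math


def fibonacci_sphere(n):
    """n approximately evenly distributed points on the unit sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n      # equal steps in z give equal-area bands
        r = math.sqrt(1.0 - z * z)         # radius of the circle at height z
        phi = k * golden_angle             # rotate by the golden angle each step
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def shrake_rupley(coords, radii, probe=1.4, n_points=1000):
    """Return the SASA of each atom as the exposed fraction of its test points."""
    unit = fibonacci_sphere(n_points)
    ext = [r + probe for r in radii]       # probe-extended radii
    areas = []
    for i, (center, r_i) in enumerate(zip(coords, ext)):
        exposed = 0
        for ux, uy, uz in unit:
            p = (center[0] + r_i * ux, center[1] + r_i * uy, center[2] + r_i * uz)
            # A test point is exposed if it lies inside no other extended sphere.
            if all(math.dist(p, coords[j]) >= ext[j]
                   for j in range(len(coords)) if j != i):
                exposed += 1
        areas.append(4.0 * math.pi * r_i * r_i * exposed / n_points)
    return areas
```

In a production implementation the inner loop would run only over the neighbors found in the contact search, which is what keeps this step O(N).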
The correctness of the implementations was tested by first inspecting the surfaces visually. In the two-atom case, results were verified against analytical calculations. Another verification came from comparing the results of high-precision SASA calculations using the two independent algorithms. In addition, the L&R algorithm gives results identical to those of NACCESS when the same resolution and atomic radii are used.
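The two-atom reference value follows from the area of a spherical cap. For probe-extended radii R1 and R2 and a center distance d, with |R1 - R2| < d < R1 + R2, the cap of sphere 1 buried by sphere 2 has height h1 = R1 - (d² + R1² - R2²)/(2d) and area 2πR1·h1. A minimal sketch of such a check is shown here; the function name `two_atom_sasa` and the default probe radius are assumptions, and d > 0 is assumed.

```python
import math


def two_atom_sasa(r1, r2, d, probe=1.4):
    """Analytical SASA of two atoms with radii r1, r2 at center distance d > 0."""
    R1, R2 = r1 + probe, r2 + probe
    full1 = 4.0 * math.pi * R1 * R1
    full2 = 4.0 * math.pi * R2 * R2
    if d >= R1 + R2:
        return full1, full2                # extended spheres do not overlap
    if d <= abs(R1 - R2):
        # the smaller extended sphere lies entirely inside the larger one
        return (0.0, full2) if R1 < R2 else (full1, 0.0)
    # distance from each center to the plane of the intersection circle
    a1 = (d * d + R1 * R1 - R2 * R2) / (2.0 * d)
    a2 = d - a1
    buried1 = 2.0 * math.pi * R1 * (R1 - a1)   # cap area = 2*pi*R*h
    buried2 = 2.0 * math.pi * R2 * (R2 - a2)
    return full1 - buried1, full2 - buried2
```

For example, two extended spheres of radius 2 Å at a distance of 2 Å each lose a cap of height 1 Å, leaving 16π - 4π = 12π Å² exposed per atom.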
Radius assignment. An important step of the calculation is assigning a radius to each atom. The default in FreeSASA is to use the ProtOr radii by Tsai et al. (1999). The library recognizes the 20 standard amino acids (plus Sec and Pyl), and the standard nucleotides (plus a few nonstandard ones). Tsai et al. do not mention phosphorus and selenium; these atoms are assigned radii of 1.8 and 1.9 Å, respectively.
By default, hydrogen atoms and HETATM records in Protein Data Bank (PDB) files are ignored. If HETATM records are included, the library recognizes three common entries (the acetyl and NH2 capping groups, and water) and assigns ProtOr radii to these. Otherwise the van der Waals radius of the element is used, taken from the paper by Mantina et al. (2009). For elements outside of the 44 main group elements treated by Mantina et al., or if completely different radii are desired, users can provide their own configuration.
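The fallback order described above can be sketched as a simple lookup. This is not the FreeSASA API: the function `assign_radius`, the table names, and the `overrides` argument (including its precedence over the defaults) are hypothetical. The tables contain only a handful of illustrative entries; the complete values should be taken from Tsai et al. (1999) and Mantina et al. (2009).

```python
# Illustrative excerpt of residue-specific radii in Å, keyed by
# (residue name, atom name). Hypothetical table, not the full ProtOr set.
PROTOR = {
    ("ALA", "N"): 1.64,
    ("ALA", "CA"): 1.88,
    ("ALA", "C"): 1.61,
    ("ALA", "O"): 1.42,
    ("ALA", "CB"): 1.88,
}

# Illustrative excerpt of element radii in Å used as fallback. P and Se carry
# the values stated in the text; the table is deliberately incomplete.
ELEMENT_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80, "P": 1.8, "Se": 1.9}


def assign_radius(resname, atomname, element, overrides=None):
    """Pick a radius: user overrides, then residue table, then element table."""
    key = (resname.strip().upper(), atomname.strip().upper())
    if overrides and key in overrides:      # assumed precedence: user first
        return overrides[key]
    if key in PROTOR:
        return PROTOR[key]
    symbol = element.strip().capitalize()   # "SE" -> "Se"
    if symbol in ELEMENT_RADII:
        return ELEMENT_RADII[symbol]
    raise KeyError(f"no radius for {key} (element {symbol!r}); "
                   "a user-supplied configuration is needed")
```

An atom of an element that appears in neither table, such as iron in a heme group, raises an error in this sketch, which corresponds to the case where users have to supply their own configuration.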
Users can specify their own atomic radii either through the API or by providing a configuration file. The library ships with a few sample configuration files, including one that provides a subset of the NACCESS parameterization, and one with the default ProtOr parameters. In addition, scripts are provided to automatically generate ProtOr configurations from PDB CONECT entries, such as those in the Chemical Component Dictionary (Westbrook et al., 2015). These can then be appended to the default configuration.