This study applied the fuzzy oil drop (FOD) model (refs 14, 15). A short description is presented here to facilitate the interpretation of the results obtained with the model. The basic assumption treats a protein structure as the outcome of a micellization process. Amino
acids are bipolar molecules with a diverse polarity–hydrophobicity
relationship. Bipolar molecules in an aqueous environment form ordered
arrangements with a distribution characterized by a hydrophilic surface,
with hydrophobic residues concentrated in the center. The hydrophobicity
distribution in such a system can be described by a three-dimensional
(3D) Gaussian function spanning the protein molecule. The values of
the parameters (σX, σY, and σZ; eq 1) are adapted
to the size and shape of the molecule, thereby allowing the representation
of any globular form of the protein molecule.
The value of the 3D Gaussian function at the points representing consecutive amino acids (i.e., the positions of the effective atoms, which are the averaged positions of the atoms comprising a given amino acid) expresses the idealized hydrophobicity level, referred to in the model as the theoretical hydrophobicity, Ti (Figure 1A).
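The evaluation of Ti described above can be illustrated with a minimal sketch. It assumes that the effective-atom coordinates are already oriented along the X, Y, and Z axes, that the Gaussian is centered on their geometric center, and that the σ values are supplied by the caller; the function name `theoretical_hydrophobicity` and the toy coordinates are hypothetical and are not taken from the original work.

```python
import numpy as np

def theoretical_hydrophobicity(coords, sigmas):
    """Normalized 3D Gaussian value T_i at each effective atom.

    coords: (N, 3) effective-atom positions, assumed to be pre-oriented
            so that the molecule's extents lie along X, Y, Z.
    sigmas: (sigma_x, sigma_y, sigma_z), assumed to be given.
    """
    coords = np.asarray(coords, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # Assumption: the Gaussian is centered on the geometric center.
    centered = coords - coords.mean(axis=0)
    exponent = -0.5 * np.sum((centered / sigmas) ** 2, axis=1)
    t = np.exp(exponent)
    # Normalization by 1/HsumT so that the T_i values sum to 1.
    return t / t.sum()

# Toy effective-atom positions (hypothetical, not a real structure).
coords = np.array([[0.0, 0.0, 0.0],
                   [3.0, 0.0, 0.0], [-3.0, 0.0, 0.0],
                   [0.0, 6.0, 0.0], [0.0, -6.0, 0.0]])
T = theoretical_hydrophobicity(coords, sigmas=(3.0, 3.0, 3.0))
```

In this toy case the central point receives the highest Ti and the most distant points the lowest, which is the hydrophobic-core pattern that the model idealizes.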
Only in exceptional cases does the structure of a protein exactly reproduce a distribution described by a 3D Gaussian function. The actual hydrophobicity
level of a given amino acid results from the intrinsic hydrophobicity
of each amino acid and its interactions with its neighbors (rij: distance between the effective atoms of residues i and j). The Levitt function (ref 16) was used to calculate the observed hydrophobicity level (O) (eq 2; Figure 1A).
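A minimal sketch of the observed hydrophobicity follows. It assumes the polynomial form of the Levitt function commonly quoted in the FOD literature, a 9 Å cutoff, non-negative intrinsic hydrophobicity values, and exclusion of the self term (j = i). These are assumptions rather than details confirmed by the text above, and the function name and toy data are hypothetical.

```python
import numpy as np

def observed_hydrophobicity(coords, intrinsic, cutoff=9.0):
    """Normalized observed hydrophobicity O_i from pairwise interactions.

    coords:    (N, 3) effective-atom positions.
    intrinsic: (N,) intrinsic hydrophobicity per residue (assumed >= 0).
    cutoff:    interaction cutoff in angstroms (assumed 9.0).
    """
    coords = np.asarray(coords, dtype=float)
    h = np.asarray(intrinsic, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    r = np.sqrt((diff ** 2).sum(axis=-1))          # r_ij matrix
    x2 = (r / cutoff) ** 2
    # Assumed Levitt-type distance weight: 1 at r = 0, 0 at r = cutoff.
    weight = 1.0 - 0.5 * (7 * x2 - 9 * x2**2 + 5 * x2**3 - x2**4)
    weight[r > cutoff] = 0.0                       # no interaction beyond cutoff
    np.fill_diagonal(weight, 0.0)                  # assumption: skip j = i
    pair_sum = h[:, None] + h[None, :]             # H_i + H_j
    o = (pair_sum * weight).sum(axis=1)
    # Normalization by 1/HsumO so that the O_i values sum to 1.
    return o / o.sum()

# Toy data (hypothetical): two close residues and one isolated residue.
coords = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [20.0, 0.0, 0.0]])
O = observed_hydrophobicity(coords, intrinsic=[0.5, 0.8, 0.3])
```

Here the isolated residue has no neighbor within the cutoff and therefore receives Oi = 0, while the two interacting residues share the total equally because the pair term Hi + Hj is symmetric.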
The T and O distributions can be compared after normalization. The normalization factors are 1/HsumT for the T distribution and 1/HsumO for the O distribution: Ti and Oi are calculated for all residues, and each component is then divided by the sum of all Ti values (HsumT) or all Oi values (HsumO), respectively, so that each distribution sums to 1. Quantitatively, the compatibility/incompatibility
of the O distribution with respect to the T distribution (the reference distribution) is expressed by the divergence entropy introduced by Kullback and Leibler (ref 17; eq 3), in which Pi is the distribution analyzed (the O distribution in our model) and Qi is the reference distribution (T in the FOD model).
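The Kullback–Leibler divergence can be computed as in the sketch below. The base-2 logarithm is an assumption (a different base only rescales the value), as is the requirement that Qi > 0 wherever Pi > 0; the function name is hypothetical.

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(P|Q) = sum_i P_i * log2(P_i / Q_i) for normalized P and Q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with P_i = 0 contribute 0 by convention
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

d_same = kl_divergence([0.5, 0.5], [0.5, 0.5])    # identical distributions
d_diff = kl_divergence([0.5, 0.5], [0.25, 0.75])  # differing distributions
```

Identical distributions give a divergence of zero, and any mismatch gives a positive value. The measure is not symmetric, so DKL(O|T) and DKL(T|O) generally differ.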
However, a single DKL value (an entropy) cannot be interpreted on its own. Therefore, a second reference distribution (R) (Figure 1A) was introduced in which every amino acid is assigned the same level of hydrophobicity, equal to 1/N, where N is the number of amino acids in the structural unit under consideration.
A second DKL value is determined for the relationship of the O distribution to the R distribution (the O|R relationship). Comparing the two DKL values, DKL(O|T) and DKL(O|R), shows which of the two reference distributions (T or R) the O distribution resembles more closely: the smaller the DKL value, the more similar the compared distributions.
The relative distance (RD) of a protein is calculated from these two DKL values; RD < 0.5 is interpreted as a protein with a hydrophobic core (Figure 1B).
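As a sketch of how RD might be evaluated, the example below assumes the form usually given in the FOD literature, RD = DKL(O|T) / (DKL(O|T) + DKL(O|R)), together with a base-2 logarithm. The display equation is not reproduced in the text above, so this form should be read as an assumption; the function names and toy distributions are hypothetical.

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(P|Q) with a base-2 logarithm (assumed)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def relative_distance(o, t):
    """Assumed form: RD = D_KL(O|T) / (D_KL(O|T) + D_KL(O|R))."""
    o = np.asarray(o, dtype=float)
    r = np.full(len(o), 1.0 / len(o))  # uniform reference R, each value 1/N
    d_ot = kl_divergence(o, t)
    d_or = kl_divergence(o, r)
    # Undefined if O equals both T and R (both divergences are zero).
    return d_ot / (d_ot + d_or)

# Toy distributions (hypothetical).
t = [0.1, 0.2, 0.4, 0.2, 0.1]            # idealized, centered core
o_core = [0.12, 0.2, 0.36, 0.2, 0.12]    # close to T
o_flat = [0.2, 0.2, 0.2, 0.2, 0.2]       # identical to R

rd_core = relative_distance(o_core, t)
rd_flat = relative_distance(o_flat, t)
```

With this definition, an O distribution close to T yields RD well below 0.5, while an O distribution identical to R yields RD = 1.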
A protein is composed of amino acids linked by covalent (peptide) bonds and therefore has limited freedom (limited mobility) to reproduce the micelle structure. The degree to which the O distribution adapts to the T distribution consequently varies. Downhill, fast-folding, and ultrafast-folding proteins show very low RD values (ref 18); enzymes, whose structure requires a substrate-binding cavity, show a local mismatch in the form of local hydrophobicity (refs 19–21); and the complexation area can be recognized as a local hydrophobicity excess (refs 22, 23). Eliminating the residues with the highest discrepancy between the Oi and Ti values enables the identification of a part of the molecule that fulfills the condition RD < 0.5, indicating that this part is responsible for solubility. The local discrepancies may also be recognized as potential drug-binding loci (ref 24).

Water is not the only milieu
for protein activity. A membrane-anchored protein requires hydrophobic residues exposed on its surface for stability, which is the opposite of the arrangement found in a water environment. This opposite distribution of hydrophobicity in membrane proteins can be expressed as a function complementing the T distribution to 1. In practice, the opposite distribution is determined as illustrated in Figure 2.
However, the omnipresence of water still influences the hydrophobicity distribution in membrane proteins, which therefore takes a modified form governed by the K parameter. The K parameter expresses the strength with which the field originating from the water environment is modified by factors other than water, including hydrophobic factors (Figures 1C and 2).
The opposite function and the calculation of the K parameter are presented graphically in Figures 1D and 2.
The M distribution is the one that gives the minimal value of DKL(O|M) (Figures 1C and 2); that is, it is the modified T distribution closest to the O distribution. The M distribution plays the role of the T distribution in a nonaqueous environment modified by other factors. The description of a protein according to the final form of the FOD-modified (FOD-M) model is thus expressed in terms of the RD and K parameter values. The RD value expresses the degree of similarity/dissimilarity of the O distribution to the T distribution (and thereby also to the R distribution). The K parameter, in turn, expresses the extent to which the nonaqueous environment affects protein structure formation.
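The search for K can be sketched as a one-dimensional grid search. The example assumes that the opposite distribution is taken as Tmax − Ti (then normalized) and that M is the normalized sum of T and K times this opposite distribution. Both forms follow the usual FOD-M formulation but are not spelled out in the text above, so they are assumptions here. The grid range, the function names, and the toy data are hypothetical.

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(P|Q) with a base-2 logarithm (assumed)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def modified_distribution(t, k):
    """Assumed form: M = normalize(T + K * normalize(T_max - T))."""
    t = np.asarray(t, dtype=float)
    t = t / t.sum()
    opposite = t.max() - t                 # assumed opposite distribution
    opposite = opposite / opposite.sum()   # undefined if T is uniform
    m = t + k * opposite
    return m / m.sum()

def fit_k(o, t, k_grid=None):
    """Return the grid value of K that minimizes D_KL(O|M), and that minimum."""
    if k_grid is None:
        k_grid = np.linspace(0.0, 5.0, 51)  # hypothetical range, step 0.1
    scores = [kl_divergence(o, modified_distribution(t, k)) for k in k_grid]
    best = int(np.argmin(scores))
    return float(k_grid[best]), scores[best]

# Toy check (hypothetical data): O built with K = 0.5 should be recovered.
t = [0.1, 0.2, 0.4, 0.2, 0.1]
o = modified_distribution(t, 0.5)
k_best, d_min = fit_k(o, t)
k_water, _ = fit_k(t, t)  # O identical to T corresponds to K = 0
```

A grid search is used only for transparency; any one-dimensional minimizer could replace it. K = 0 corresponds to an O distribution that matches the unmodified T distribution.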
Proteins characterized by low K values are listed in refs 25 and 26. A protein functioning in the periplasmic space exhibits a structure described by K = 0.6 (ref 27). Membrane proteins (e.g., rhodopsin, described by K = 0.9) are described by K values > 1.0 (refs 28–32). The present paper discusses a protein, chaperonin, which can represent a status with K > 3. A high value of the K parameter is also shown by a protein folded in the external field environment provided by chaperonin.
Roterman I., Stapor K., Dułak D, & Konieczny L. (2024). External Force Field for Protein Folding in Chaperonins—Potential Application in In Silico Protein Folding. ACS Omega, 9(16), 18412-18428.