Using a typical minimization algorithm such as the
simplex algorithm (ref 56) to find a “best
fit” orientation, penetration, and Γ that minimize χ² can be
a useful analysis technique, but it gives no information about other potential
solutions or confidence in the fitted parameters. Calculating how
the χ² value varies with orientation would give qualitative
information about which orientations better fit the data, but the results cannot
easily be compared in terms of confidence. To solve this issue, Bayesian
analysis techniques such as Markov Chain Monte Carlo (MCMC) analysis
are useful for investigating the probability distribution of the parameters.
In brief, MCMC uses an iterative process to explore the parameter
space such that each step samples the posterior probability distribution
of the parameters (ref 57). With a sufficiently long chain,
the distribution of the chain steps should approximate the posterior
probability distribution of the parameters and so the density of the
resulting sample points provides an estimate of the probability density
function (PDF) for the parameters. Thus, by analyzing the distribution
of the accepted θ values against ϕ values, the probability
of the orientation can be investigated, with regions of high probability
corresponding to regions of high likelihood, that is, the best fitting
regions. The PDF can be estimated by binning the points into a histogram
or calculating the kernel density estimate (KDE). A bin width of 5
degrees was used for the histograms, while KDEs were calculated using
an adaptive bandwidth diffusion technique (ref 58). Credible intervals
for orientation were then found by calculating
the highest posterior density regions (HPDR) of the aforementioned
KDE, that is, the smallest bounded area that contains the desired probability,
for example, 65% of the probability volume. As Γ and penetration
are univariate parameters with roughly Gaussian distributions, their confidence
intervals were calculated by taking the standard deviation.
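
As an illustration of the HPDR calculation described above, the following minimal Python sketch finds the smallest set of histogram cells that contains a chosen fraction of the probability. It is not the code used in this study: the function name `hpd_mask`, the synthetic Gaussian samples, and the assumed ±180° range for both angles are hypothetical, and a plain 5° histogram stands in for the adaptive-bandwidth KDE.

```python
import numpy as np

def hpd_mask(density, mass=0.65):
    """Boolean mask of the smallest set of cells holding `mass` of the probability."""
    p = density / density.sum()                # normalize to cell probabilities
    order = np.argsort(p, axis=None)[::-1]     # flat cell indices, densest first
    cum = np.cumsum(p.ravel()[order])          # probability accumulated so far
    n_keep = np.searchsorted(cum, mass) + 1    # cells needed to reach `mass`
    mask = np.zeros(p.size, dtype=bool)
    mask[order[:n_keep]] = True
    return mask.reshape(p.shape)

# Synthetic stand-in for post-burn-in MCMC samples (assumption, not real data)
rng = np.random.default_rng(0)
theta = rng.normal(40.0, 10.0, 50_000)
phi = rng.normal(-60.0, 20.0, 50_000)

# 5-degree bins, as used for the histograms in the text
edges = np.arange(-180, 181, 5)
hist, _, _ = np.histogram2d(theta, phi, bins=[edges, edges])

mask = hpd_mask(hist, mass=0.65)
print("cells in 65% region:", int(mask.sum()))
print("enclosed probability:", hist[mask].sum() / hist.sum())
```

Sorting the cells by density and accumulating from the densest downward yields the smallest region for a given probability, which is why the HPDR can be disjoint when the posterior is multimodal. The same function could be applied to a KDE evaluated on a regular grid.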
Bayesian probability analysis methods use a prior distribution,
which represents what is believed to be the most likely distribution
of the parameters. In this study, MCMC analyses were performed using
a uniform prior, to avoid making strong assumptions about the distribution
of the orientation, which was likely to be non-Gaussian. A delayed
rejection adaptive Metropolis (DRAM)
algorithm (refs 59 and 60) was used
to improve chain convergence and exploration of the parameter space.
Due to the rotational symmetry of the problem, the algorithm was modified
to include periodic boundary conditions for θ and ϕ, allowing
the Markov chain to wrap around to the opposite limit; for example, a
step of 3° could travel from θ = −179°
to θ = +178°. This change greatly improves chain mixing and
reduces autocorrelation in the results. For models of the Fab and
Fc fragments, each MCMC simulation was run for 5 repeats of 2,000,000
steps with 200,000 burn-in steps, while for COE-3, owing to the high
proportion of rejected steps, each MCMC simulation was run for 5 repeats
of 16,000,000 steps with 400,000 burn-in steps.
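
The periodic boundary conditions and burn-in described above can be sketched as follows. This is a plain random-walk Metropolis sampler, not the delayed rejection adaptive Metropolis algorithm used in this study, and the names `wrap_deg`, `metropolis_periodic`, and `toy_log_post`, the toy target, and the assumed [−180°, 180°) range for both angles are hypothetical. With a uniform prior, the log posterior would be −χ²/2 up to a constant; a simple wrapped Gaussian-like peak stands in for it here.

```python
import numpy as np

def wrap_deg(angle):
    """Map an angle in degrees onto [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def metropolis_periodic(log_post, start, step_sd, n_steps, burn_in, seed=0):
    """Random-walk Metropolis on (theta, phi) with wrap-around proposals."""
    rng = np.random.default_rng(seed)
    current = wrap_deg(np.asarray(start, dtype=float))
    current_lp = log_post(current)
    chain = np.empty((n_steps, current.size))
    for i in range(n_steps):
        # Propose a Gaussian step, then wrap it back into the periodic range
        proposal = wrap_deg(current + rng.normal(0.0, step_sd, current.size))
        proposal_lp = log_post(proposal)
        # Symmetric proposal: accept with probability min(1, posterior ratio)
        if np.log(rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain[i] = current                     # a rejected step repeats the current point
    return chain[burn_in:]                     # discard the burn-in steps

def toy_log_post(x):
    # Hypothetical target peaked at theta = 178, phi = 0, using wrapped differences
    d = wrap_deg(x - np.array([178.0, 0.0]))
    return -0.5 * np.sum((d / 10.0) ** 2)

chain = metropolis_periodic(toy_log_post, start=[-170.0, 5.0],
                            step_sd=5.0, n_steps=20_000, burn_in=2_000)
print(wrap_deg(-179.0 - 3.0))                  # a 3 degree step across the limit
print(chain.shape)
```

Because the toy peak sits next to the ±180° limit, a sampler without wrapping would see the posterior split across two opposite edges of the parameter space; wrapping the proposals lets the chain cross the limit freely, which is the motivation given in the text for the improved mixing.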
Figure S5 shows the MCMC traces for the chains, while
Figure S6 shows plots demonstrating the autocorrelation
of the same parameters.
Figure S7 shows
the traces and autocorrelation for the measurement of 50 ppm Fc adsorbed
at the air/water interface, shown separately for reasons discussed
below in
Section 3.2 and the
Supporting Information.
While NR data is fitted to a single protein orientation in this
work, it is of course possible that there are multiple preferred orientations.
With sufficient neutron contrasts involving deuterated proteins or
with intelligent experimental design, it is possible to distinguish
such states. However, due to the vastly increased complexity of such
a model, this is beyond the scope of this study.
Ruane S., Li Z., Hollowell P., Hughes A., Warwicker J., Webster J.R., van der Walle C.F., Kalonia C., & Lu J.R. (2023). Investigating the Orientation of an Interfacially Adsorbed Monoclonal Antibody and Its Fragments Using Neutron Reflection. Molecular Pharmaceutics, 20(3), 1643-1656.