We construct a local coordinate system at the center of the six-membered ring of each nucleobase, as shown in Figure 1a. Following this definition, the relative position and orientation between two nucleobases is described by a vector r, which is conveniently expressed in cylindrical coordinates ρ, θ and z (Figure 1b). Note that r is invariant under rotations around the axis connecting the six-membered rings. We highlight that this definition is similar to the local referentials introduced by Gendron and Major (25). The use of a nucleotide-independent centroid makes it straightforward to compare and combine collections of position vectors deriving from different combinations of nucleobases. This is of particular importance for constructing the knowledge-based scoring function (see below).
The position vector r has an intuitive interpretation in terms of base-stacking and base-pairing interactions. This is illustrated in Figure 1c, which shows the distribution of r vectors for all neighboring bases in the crystal structure of the Haloarcula marismortui large ribosomal subunit (PDB code 1S72) (2), projected on the ρ and z coordinates. In the figure, different colors correspond to different types of interactions detected by MC-annotate. Due to steric hindrance, no points are observed in a forbidden ellipsoidal region. Furthermore, almost all the base-stacking and base-pairing interactions (≈99.6%) belong to a well-defined ellipsoidal shell. It is therefore useful to introduce the anisotropic position vector
$$\tilde{\mathbf{r}} = \left(\frac{r_x}{a}, \frac{r_y}{a}, \frac{r_z}{b}\right)$$
with a = 5 Å and b = 3 Å, so that the ellipsoidal interaction shell becomes a spherical shell in the scaled coordinates, that is, pairs of bases in the interaction shell have |r̃| within a fixed range. The majority of base–base contacts lying in this interaction shell are annotated either as Watson-Crick/non-Watson-Crick or as base stacking, as detailed in Table 1. Within this region we distinguish a pairing zone and a stacking zone, according to the type of interactions found there. The tri-modal histogram in Figure 1c shows that these two zones can be defined without ambiguity by considering pairs for which the projection of r along the z axis is, in absolute value, larger (stacking) or smaller (pairing) than 2 Å.
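The quantities defined above can be illustrated with a minimal Python sketch. It assumes that the local frame of the reference base is already available as an origin and three orthonormal axes (the construction of Figure 1a is not reproduced here), and that θ is measured in the local xy plane from the x axis. The function names are hypothetical. Membership in the interaction shell is not checked, since this sketch does not assume any particular bounds on |r̃|.

```python
import math

A = 5.0      # Å, in-plane scaling a (value from the text)
B = 3.0      # Å, scaling b along z (value from the text)
Z_CUT = 2.0  # Å, threshold on |z| separating stacking from pairing (from the text)


def to_local(origin, axes, point):
    """Express `point` in a local frame given by `origin` and three
    orthonormal axis vectors (assumed to be precomputed)."""
    d = [p - o for p, o in zip(point, origin)]
    return tuple(sum(di * ai for di, ai in zip(d, axis)) for axis in axes)


def cylindrical(r):
    """Return (rho, theta, z) with theta in degrees in [0, 360).
    Assumption: theta is measured from the local x axis."""
    x, y, z = r
    rho = math.hypot(x, y)
    theta = math.degrees(math.atan2(y, x)) % 360.0
    return rho, theta, z


def anisotropic(r, a=A, b=B):
    """Scale the in-plane components by a and the z component by b."""
    x, y, z = r
    return (x / a, y / a, z / b)


def zone(r, z_cut=Z_CUT):
    """Classify a vector (assumed to lie in the interaction shell)
    as 'stacking' or 'pairing' from the magnitude of its z component."""
    return "stacking" if abs(r[2]) > z_cut else "pairing"
```

For example, a vector with ρ = 5.6 Å, θ = 60° and z = 0 is assigned to the pairing zone, while a vector with z = ±3.4 Å is assigned to the stacking zone.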
It is well known that the strength and nature of pairing and stacking interactions depend on the base–base distance, on the angle θ, and on other angular parameters (e.g. twist, roll, tilt) in a non-trivial manner (24,33). This dependence can be observed in Figure 2, where the points belonging to the pairing and stacking zones of Figure 1c are projected on two separate ρ–θ planes. These distributions give an average picture containing contributions from different base pair types (purine–purine, purine–pyrimidine and pyrimidine–pyrimidine), with weights dictated by the data set employed. Nevertheless, the observations below also hold when considering the 16 possible combinations of base pairs individually, and for different data sets (see Supporting Data (SD) Figures S1–S3).
In the pairing zone (Figure 2, left panel) we first observe a dominant peak centered around (ρ = 5.6 Å, θ = 60°), corresponding to the position of canonical Watson-Crick base pairs as well as wobble (GU) base pairs. The two other peaks correspond instead to base pairs interacting through the Hoogsteen or sugar edge (13). One can also appreciate the absence of bases in the region occupied by the sugar (190° < θ < 290°). The probability distribution in the stacking zone (Figure 2, right panel) shows a broad peak close to the origin that extends up to ρ ≈ 4 Å, which can be compared to the typical radius of the six-membered ring (≈1.4 Å). This means that partial or negligible ring overlap is very frequent in RNA structures, as also observed in a seminal paper by Bugg et al. (34). This feature is more evident in pyrimidine–pyrimidine and purine–purine pairs, for which high overlap is the exception rather than the rule (see Supplementary Figure S3), whereas overlap is systematically observed in pyrimidine–purine pairs.
The fact that bases in the stacking zone are very often ‘imbricated,’ similarly to roof tiles, rather than literally stacked one on top of the other, does not imply that they are not interacting. Indeed, base–base interaction is not limited to π–π stacking but also includes electrostatic effects, London dispersion attraction, short range repulsion as well as backbone-induced effects (35).
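The ρ–θ projections discussed above can be sketched as a pair of two-dimensional histograms, one per zone. This is a minimal illustration and not the procedure used to produce Figure 2: the function name, the bin counts and the `rho_max` cutoff are hypothetical choices, θ is assumed to be measured from the local x axis, and the input vectors are assumed to have been restricted to the interaction shell beforehand. The grids hold raw counts, without any normalization.

```python
import math


def rho_theta_histogram(vectors, rho_max=10.0, n_rho=20, n_theta=36, z_cut=2.0):
    """Bin local-frame vectors (x, y, z), in Å, on the rho-theta plane.

    Returns two n_rho x n_theta grids of counts: (pairing, stacking).
    Vectors with rho >= rho_max are skipped.
    """
    grids = {
        "pairing": [[0] * n_theta for _ in range(n_rho)],
        "stacking": [[0] * n_theta for _ in range(n_rho)],
    }
    for x, y, z in vectors:
        rho = math.hypot(x, y)
        if rho >= rho_max:
            continue
        theta = math.degrees(math.atan2(y, x)) % 360.0
        # Clamp indices to guard against floating-point edge cases.
        i = min(int(rho / rho_max * n_rho), n_rho - 1)
        j = min(int(theta / 360.0 * n_theta), n_theta - 1)
        # 2 Å threshold on |z| separates the two zones (value from the text).
        key = "stacking" if abs(z) > z_cut else "pairing"
        grids[key][i][j] += 1
    return grids["pairing"], grids["stacking"]
```

With the default bins (0.5 Å in ρ, 10° in θ), a Watson-Crick-like vector near (ρ = 5.6 Å, θ = 60°) with small |z| contributes to the pairing grid, while a vector with |z| ≈ 3.4 Å contributes to the stacking grid.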