So far, the determined residues consist solely of Cα atoms. A complete protein backbone also contains carbon, nitrogen, and oxygen atoms. Previous research has introduced various methods for reconstructing a protein backbone from a reduced representation, such as one that contains only Cα atoms (28). Instead of employing these theoretical methods, we chose to implement our own backbone reconstruction method so that we could make use of the information captured from the 3D cryo-EM maps. This section presents our all-atom backbone reconstruction method, which is required for the next step in the pipeline: resolving the side-chain atoms.
In addition to the Cα prediction, the U-Net provides information about carbon and nitrogen atoms in its predicted confidence maps. We can use this information in combination with the previously determined Cα atom positions to place the carbon and nitrogen atoms. Between the Cα atoms of two connected amino acids, there is always one nitrogen and one carbon atom. Therefore, we can estimate the initial positions of these atoms by calculating the vector from one Cα atom to the other and then placing the nitrogen and carbon atoms at one-third and two-thirds of the distance along this vector. To refine these initial positions, we calculated the center of mass of the carbon and nitrogen confidence maps in the neighborhood of each initial position. Fig. 7A shows an example of the initial and refined placement of the carbon and nitrogen atoms.
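The placement and refinement described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function names, the dictionary-based confidence map, and the `radius` parameter are hypothetical. It also assumes the vector runs from Cα(i) to Cα(i+1) in the N-to-C chain direction, so that C(i) lands at one-third and N(i+1) at two-thirds; with the opposite direction the two roles swap.

```python
import math


def initial_positions(ca_i, ca_next):
    """Place C(i) and N(i+1) at 1/3 and 2/3 of the CA(i) -> CA(i+1) vector.

    Assumption: the chain is traversed from N-terminus to C-terminus.
    """
    step = [(b - a) / 3.0 for a, b in zip(ca_i, ca_next)]
    carbon = [a + s for a, s in zip(ca_i, step)]
    nitrogen = [a + 2.0 * s for a, s in zip(ca_i, step)]
    return carbon, nitrogen


def center_of_mass(conf_map, center, radius):
    """Confidence-weighted centroid of the voxels within `radius` of `center`.

    `conf_map` is a hypothetical sparse map: {(x, y, z): confidence}.
    A real pipeline would more likely slice a dense 3D array.
    """
    total = 0.0
    acc = [0.0, 0.0, 0.0]
    for voxel, weight in conf_map.items():
        if weight > 0.0 and math.dist(voxel, center) <= radius:
            total += weight
            for k in range(3):
                acc[k] += weight * voxel[k]
    if total == 0.0:
        # No signal in the neighborhood: keep the initial guess.
        return list(center)
    return [a / total for a in acc]
```

A typical call would first compute `carbon, nitrogen = initial_positions(ca_i, ca_next)` and then refine each atom against its own map, for example `center_of_mass(carbon_map, carbon, radius)`. The choice of neighborhood radius is not stated in the text and would need to be tuned.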
After the initial refinement, we can further refine the positions of the carbon and nitrogen atoms by applying well-known molecular mechanics of a peptide chain. We made several assumptions about the positions of carbon, nitrogen, and oxygen atoms relative to the Cα atoms, as seen in Fig. 7B. First, we assumed the planar peptide geometry in which the Cα atom and the carbon atom in the carbonyl group of an amino acid are in the same plane as the next amino acid's nitrogen and Cα atom (29). Second, we constructed a virtual bond between the neighboring Cα atoms. The angles between this bond and the Cα(i)–C(i) bond (θ2) and between this bond and the Cα(i+1)–N(i+1) bond (ϕ2) are 20.9° and 14.9°, respectively (29). Third, the peptide bonds in a protein are in the stable trans configuration (30).
To refine the position of the carbon atoms, we relied on the previous refinement. Let $v_1$ be the unit vector pointing from Cα(i) to the refined C(i), $v_2$ the unit vector pointing from Cα(i) to the current C(i), and $v_3$ the unit vector pointing from Cα(i) to Cα(i+1):

$$v_1 = \langle a_1, a_2, a_3 \rangle, \quad v_2 = \langle b_1, b_2, b_3 \rangle, \quad v_3 = \langle c_1, c_2, c_3 \rangle.$$

The goal is to solve for the components of $v_1$. Due to the planar peptide geometry, $v_1$, $v_2$, and $v_3$ lie in the same plane. Thus, their scalar triple product equals zero:

$$v_1 \cdot (v_2 \times v_3) = 0, \quad \text{or} \quad a_1(b_2 c_3 - b_3 c_2) - a_2(b_1 c_3 - b_3 c_1) + a_3(b_1 c_2 - b_2 c_1) = 0.$$

From this relation and the dot products of $v_1$ with $v_2$ and of $v_1$ with $v_3$, we can construct a system of equations:

$$\begin{aligned}
a_1 b_1 + a_2 b_2 + a_3 b_3 &= \cos(\theta_2 - \theta_1), \\
a_1 c_1 + a_2 c_2 + a_3 c_3 &= \cos(\theta_2), \\
a_1(b_2 c_3 - b_3 c_2) - a_2(b_1 c_3 - b_3 c_1) + a_3(b_1 c_2 - b_2 c_1) &= 0.
\end{aligned}$$

All three equations are linear in $a_1$, $a_2$, and $a_3$, so solving this system yields the components of $v_1$ directly. Next, the vector $v_1$ is scaled appropriately to resolve the new position of the carbon atom. The position of the nitrogen atom is refined in a similar manner.
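Because the system is linear, it can be solved with any 3×3 solver. The sketch below uses Cramer's rule in plain Python to keep it self-contained; it is an illustration under stated assumptions, not the authors' implementation. The function names are hypothetical, and since θ1 is not defined in the text, it is simply taken as an input here (the usage note treats it as the angle between v2 and v3, which is an assumption).

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def det3(r0, r1, r2):
    """Determinant of the 3x3 matrix with rows r0, r1, r2."""
    return dot(r0, cross(r1, r2))


def refine_direction(v2, v3, theta1, theta2):
    """Solve the three linear equations for v1 = <a1, a2, a3>.

    Rows of the system: v1.v2 = cos(theta2 - theta1),
                        v1.v3 = cos(theta2),
                        v1.(v2 x v3) = 0.
    Angles are in radians; theta1 is an assumed input (see lead-in).
    """
    rows = [list(v2), list(v3), cross(v2, v3)]
    rhs = [math.cos(theta2 - theta1), math.cos(theta2), 0.0]
    d = det3(*rows)
    if abs(d) < 1e-12:
        raise ValueError("v2 and v3 are (nearly) parallel; plane undefined")
    solution = []
    for k in range(3):
        # Cramer's rule: replace column k with the right-hand side.
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][k] = rhs[r]
        solution.append(det3(*m) / d)
    return solution
```

With `v3` along the x-axis, `v2` rotated 30° away from it in the xy-plane, and θ2 = 20.9°, the solution is the unit vector rotated 20.9° away from `v3` in the same plane. The refined atom position is then Cα(i) plus this direction scaled by the Cα–C bond length.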
To determine the location of the oxygen atom in the carbonyl group, we assumed that the oxygen, Cα, carbon, and nitrogen atoms are coplanar (29) and that the angles ∠CαCO and ∠OCN (Fig. 7C) are approximately identical. We then derived a unit vector pointing in the direction of the C–O bond and scaled it by the C–O bond length to obtain the position of the oxygen atom.
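One way to realize these two assumptions is to point the C–O bond along the negative bisector of the Cα–C–N angle: that direction lies in the plane of the three atoms and makes equal angles with the C–Cα and C–N bonds. The sketch below follows this reading; the function names and the default bond length of 1.23 Å (a typical carbonyl C=O value) are assumptions and are not taken from the text.

```python
import math


def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def place_oxygen(ca, c, n, bond_length=1.23):
    """Place the carbonyl oxygen from the CA, C, and N positions.

    The C-O direction is the negative bisector of the unit vectors
    C->CA and C->N, so O is coplanar with the three atoms and the
    angles CA-C-O and O-C-N are equal. `bond_length` is an assumed
    typical value in angstroms. Not defined for collinear CA, C, N.
    """
    to_ca = unit([a - b for a, b in zip(ca, c)])
    to_n = unit([a - b for a, b in zip(n, c)])
    direction = unit([-(p + q) for p, q in zip(to_ca, to_n)])
    return [ci + bond_length * d for ci, d in zip(c, direction)]
```

For example, with the carbon at the origin, Cα at (1, 1, 0), and the nitrogen at (-1, 1, 0), the oxygen is placed on the negative y-axis at a distance of 1.23 Å from the carbon.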