Fitting of atomic models into cryoEM maps was performed using UCSF Chimera45 and Coot18,46. We initially docked the MHV domain A structure (PDB 3R4D) and used a crystal structure of a bovine coronavirus domain A (PDB 4H14) to model the three-stranded β-sheet and the α-helix present on the viral membrane proximal side of the galectin-like domain. Next, the MERS-CoV domain B crystal structure (PDB 4KQZ) was fit into the density, then rebuilt and refined using RosettaCM47. Although we could accurately align the sequences corresponding to the core β-sheet of the MHV and MERS-CoV B domains, the ~100 residues forming the β-motif extension (residues 453-535, MERS-CoV/SARS-CoV receptor-binding moiety) could not be aligned with confidence. We therefore used RosettaCM to build models of each of the 945 possible disulfide patterns into the density for domain B. For each disulfide arrangement, 50 models were generated, and a very clear energy signal favored a single arrangement (ED Fig.3k). We then sampled 1000 models with this disulfide arrangement and selected the lowest-energy model, scored with the Rosetta force field augmented with a fit-to-density term. Because the reconstruction is of poor quality at the apex of the S trimer, confidence in the model is lowest for the segment corresponding to residues 453-535, where homology modeling was used to fill in details missing from the map.
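The count of 945 disulfide patterns is consistent with pairing ten cysteines in every possible way, since (2n-1)!! = 9 × 7 × 5 × 3 × 1 = 945 for n = 5 bonds. The text does not state the number of cysteines, so the value of ten used below is an assumption inferred from that count, and the residue labels are hypothetical placeholders. The following is a minimal sketch of the enumeration only; it is not the RosettaCM protocol and does not build or score any models.

```python
from typing import Iterator, List, Sequence, Tuple

Pairing = List[Tuple[int, int]]


def disulfide_pairings(cysteines: Sequence[int]) -> Iterator[Pairing]:
    """Yield every way to pair all cysteines into disulfide bonds."""
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    # Pair the first cysteine with each possible partner, then recurse
    # on whatever is left so that no pairing is produced twice.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in disulfide_pairings(remaining):
            yield [(first, partner)] + tail


def count_pairings(n_cysteines: int) -> int:
    """Closed-form count of complete pairings: (n - 1)!! for even n."""
    if n_cysteines < 0 or n_cysteines % 2:
        raise ValueError("need a non-negative, even number of cysteines")
    count = 1
    for k in range(n_cysteines - 1, 0, -2):
        count *= k
    return count


# Hypothetical labels 1..10 standing in for the real cysteine positions.
# ASSUMPTION: ten fully paired cysteines, inferred from the count of 945.
cysteine_ids = tuple(range(1, 11))
all_patterns = list(disulfide_pairings(cysteine_ids))
print(len(all_patterns), count_pairings(len(cysteine_ids)))
```

Each element of `all_patterns` is one candidate arrangement. In a workflow like the one described, each arrangement would be modeled many times and the arrangements compared by energy.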
A backbone model was then manually built for the rest of the S polypeptide using Coot. Sequence register was assigned by visual inspection where side chain density was clearly visible. This hand-built model was used as the starting model for Rosetta de novo20. The Rosetta-derived model largely agreed with the hand-built model. Rosetta de novo successfully identified fragments that allowed us to anchor the sequence register for domains C and D as well as for helices α2125. Given these anchoring positions, RosettaCM47 augmented with a novel density-guided model-growing protocol was able to rebuild domains C and D in full. The final model was refined by applying strict non-crystallographic symmetry constraints using Rosetta19. Model refinement was performed against a training map corresponding to one of the two maps generated by the gold-standard refinement procedure in Relion. The second map (testing map) was used only to calculate the FSC against the atomic model and thereby to prevent overfitting48. The quality of the final model was analyzed with Molprobity49. Structure analysis was assisted by the PISA50 and DALI51 servers. The sequence alignment was generated using MultAlin52 and colored with ESPript53. All figures were generated with UCSF Chimera45.
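The training/testing-map check relies on the Fourier shell correlation (FSC) between two volumes. The following is a minimal NumPy sketch of that quantity for two cubic maps on the same grid. It is an illustration under stated assumptions, not the Relion or Rosetta implementation: the function name, the integer-radius shell binning, and the absence of masking are all choices made here for brevity.

```python
import numpy as np


def fourier_shell_correlation(map_a: np.ndarray, map_b: np.ndarray) -> np.ndarray:
    """Return the FSC per integer-radius shell, up to the Nyquist shell."""
    if map_a.shape != map_b.shape or map_a.ndim != 3 or len(set(map_a.shape)) != 1:
        raise ValueError("both maps must be cubic 3D arrays of equal shape")
    n = map_a.shape[0]
    fa = np.fft.fftn(map_a)
    fb = np.fft.fftn(map_b)

    # Integer frequency index along each axis, then the radius of each voxel.
    freq = np.fft.fftfreq(n) * n
    kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)

    n_shells = n // 2
    inside = shell < n_shells  # discard the corners beyond Nyquist
    idx = shell[inside]

    # Accumulate the cross term and both power terms shell by shell.
    cross = np.bincount(idx, weights=(fa * np.conj(fb)).real[inside], minlength=n_shells)
    power_a = np.bincount(idx, weights=(np.abs(fa) ** 2)[inside], minlength=n_shells)
    power_b = np.bincount(idx, weights=(np.abs(fb) ** 2)[inside], minlength=n_shells)

    denom = np.sqrt(power_a * power_b)
    return np.divide(cross, denom, out=np.zeros_like(cross), where=denom > 0)


# Synthetic stand-in volumes; real use would load a map simulated from the
# atomic model and the testing half map sampled on the same grid.
rng = np.random.default_rng(0)
volume = rng.normal(size=(16, 16, 16))
fsc_self = fourier_shell_correlation(volume, volume)
fsc_inverted = fourier_shell_correlation(volume, -volume)
```

A map compared with itself gives an FSC of 1 in every shell, and an inverted copy gives -1. Comparing the model-to-training-map curve with the model-to-testing-map curve is one common way to judge overfitting, since a large gap between the two suggests the model has fit noise in the training map.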