CYP97 sequences retrieved from UniProt were aligned using ClustalW [16,17]. 3D models of CYP97H1 from Euglena gracilis were generated using SWISS-MODEL [18], YASARA [19] and AlphaFold2 [20] through ColabFold [21]; the AlphaFold2 model is the one used in this article. AlphaFold2 was initially used to calculate five models of the full-length sequence, but the first 184 and the last 20 residues were modelled with high disorder, due to a lack of sequence data for these regions. Given this result, a truncated CYP97H1 sequence starting at G185 and ending at P716 was remodelled using the pdb70 template option, with 48 recycles and Amber relaxation. The best model was superposed onto the structure of A. thaliana CYP97A3 (PDB 6J95) (RMSD 0.85 Å across 341 residues), and the heme coordinates were then copied directly from that structure to give the final model for the study.
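The truncation step can be illustrated with a minimal Python sketch. It is not the authors' script: the function name `truncate_sequence` is hypothetical, and the sequence used here is a synthetic placeholder rather than the real CYP97H1 sequence. It only shows how a 1-indexed, inclusive residue range such as G185 to P716 maps onto a string slice, and how the expected terminal residues can be checked before the fragment is submitted for remodelling.

```python
def truncate_sequence(seq, start, end, expect_first=None, expect_last=None):
    """Return residues start..end of seq (1-indexed, inclusive).

    expect_first / expect_last are optional one-letter codes used as a
    sanity check on the residue numbering (e.g. "G" for G185).
    """
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"range {start}-{end} is outside sequence of length {len(seq)}")
    # Python slices are 0-indexed and end-exclusive, hence start - 1.
    fragment = seq[start - 1:end]
    if expect_first is not None and fragment[0] != expect_first:
        raise ValueError(f"residue {start} is {fragment[0]}, expected {expect_first}")
    if expect_last is not None and fragment[-1] != expect_last:
        raise ValueError(f"residue {end} is {fragment[-1]}, expected {expect_last}")
    return fragment


# Synthetic placeholder (assumption): 184 disordered residues, a core that
# starts with G and ends with P, then 20 trailing residues.
full_length = "A" * 184 + "G" + "S" * 530 + "P" + "A" * 20

truncated = truncate_sequence(full_length, 185, 716, expect_first="G", expect_last="P")
print(len(truncated))  # 716 - 185 + 1 = 532 residues
```

Checking the terminal residues guards against off-by-one errors, which are easy to make when converting between residue numbering and string indices.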
Idealised 3D coordinates of β-carotene, β-cryptoxanthin, rubixanthin and zeaxanthin were obtained in SDF format from the NCBI PubChem website (https://pubchem.ncbi.nlm.nih.gov, accessed on 9 June 2022). Both protein and substrate coordinates were processed with AutoDockTools [22] to add partial charges and polar hydrogens. Docking was performed with AutoDock Vina [23,24] using a docking grid of x = 28 Å, y = 40 Å and z = 50 Å centred on the binding pocket, with a search exhaustiveness of 10. All results were viewed in PyMOL [25].
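The docking parameters above could be written to an AutoDock Vina configuration file along the following lines. This is a sketch under stated assumptions, not the authors' actual setup: the helper name `vina_config`, the file names and the grid centre are hypothetical placeholders, since the text does not report the centre coordinates. Only the box dimensions (28 × 40 × 50 Å) and the exhaustiveness of 10 come from the text.

```python
def vina_config(receptor, ligand, center, size=(28.0, 40.0, 50.0),
                exhaustiveness=10, out="docked.pdbqt"):
    """Build the text of an AutoDock Vina config file.

    center and size are (x, y, z) tuples in angstroms. The default size
    and exhaustiveness follow the values reported in the text.
    """
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx:.3f}",
        f"center_y = {cy:.3f}",
        f"center_z = {cz:.3f}",
        f"size_x = {sx:.3f}",
        f"size_y = {sy:.3f}",
        f"size_z = {sz:.3f}",
        f"exhaustiveness = {exhaustiveness}",
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical file names and a placeholder centre (assumption): the real
# centre would be taken from the binding pocket of the final model.
config_text = vina_config(
    receptor="cyp97h1_model.pdbqt",
    ligand="beta_carotene.pdbqt",
    center=(0.0, 0.0, 0.0),
)
print(config_text)
```

The resulting text can be saved to a file and passed to Vina through its `--config` option, with one file per substrate.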