BLASTp (ncbi-blast software ver. 2.2.24) was used to identify homologous proteins among the three yeast species. Protein homologs were identified using a stringent E-value cutoff (E-value < 10⁻⁴⁰) and the ratio of score to sequence length, according to David et al. (2008) [31]. KEGG Orthology (KO) identifiers were also used, following the RAVEN Toolbox pipeline, to infer from the genome sequences of the two Pichia species those reactions that could not be found in S. cerevisiae. Finally, the metabolic network of S. cerevisiae iIN800 was used to map genes from P. pastoris and P. stipitis that have homologs in S. cerevisiae.
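The two homology criteria above can be illustrated with a minimal sketch that filters BLASTp hits in the standard 12-column tabular format (`-outfmt 6`). The E-value cutoff is the one stated in the text. The rest is assumed for illustration only: the score-to-length threshold (`MIN_SCORE_PER_RESIDUE`) is a placeholder, since the value used by David et al. is not given here; the ratio is taken as bit score divided by query length, which may differ from their definition; and the function name, gene identifiers, and hit values are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1e-40         # cutoff stated in the text
MIN_SCORE_PER_RESIDUE = 1.0   # hypothetical placeholder, NOT the published value


def filter_homologs(tabular_text, query_lengths,
                    evalue_cutoff=EVALUE_CUTOFF,
                    min_ratio=MIN_SCORE_PER_RESIDUE):
    """Return the best passing hit per query from BLAST tabular output.

    Assumes the default 12-column layout: qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        # Assumed definition of the ratio: bit score per query residue.
        ratio = bitscore / query_lengths[query]
        if evalue < evalue_cutoff and ratio >= min_ratio:
            # Keep only the highest-scoring hit that passes both criteria.
            if query not in best or bitscore > best[query][2]:
                best[query] = (subject, evalue, bitscore, ratio)
    return best


# Made-up hits for two hypothetical query proteins.
example_hits = "\n".join([
    "pp_gene1\tsc_geneA\t62.0\t400\t150\t2\t1\t400\t1\t400\t1e-120\t520.0",
    "pp_gene1\tsc_geneB\t35.0\t200\t130\t5\t1\t200\t1\t200\t1e-45\t150.0",
    "pp_gene2\tsc_geneC\t30.0\t100\t70\t3\t1\t100\t1\t100\t1e-10\t60.0",
])
example_lengths = {"pp_gene1": 410, "pp_gene2": 300}

homologs = filter_homologs(example_hits, example_lengths)
for query, (subject, evalue, bitscore, ratio) in homologs.items():
    print(query, subject, evalue, bitscore, round(ratio, 2))
```

In this toy input only the first hit passes both criteria: the second fails the assumed ratio threshold and the third fails the E-value cutoff.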
Subcellular compartmentalization of reactions was determined using F-LocA (Fully-connected Localization Assignment), which is part of the RAVEN Toolbox. F-LocA combines subcellular localization predictors (CELLO and WoLF PSORT) [32] with a constraint on network connectivity. Reactions without associated genes were compartmentalized according to biochemical evidence when available. It is important to note that these automated approaches were used only as an aid in the reconstructions, and that biochemical and physiological evidence was always used to validate reaction localizations and gene associations. This was of particular importance for peroxisomal metabolism, where predictive capability is lower because of the low quality of data from subcellular localization predictors (e.g. CELLO predicts that AOX is in the cytosol, whereas it is in the peroxisome). In cases where information about P. pastoris or P. stipitis was lacking, data from other closely related yeasts were used instead (e.g. S. cerevisiae, Hansenula polymorpha, Candida tropicalis, C. shehatae, and C. boidinii [33,34]). Both GEMs are available in the BioMet Toolbox [