Secreted proteins were predicted using a custom bioinformatic pipeline (Figure 1) that combined the following sequence characteristics: (a) signal peptide and transmembrane topology: proteins were predicted as secreted if a signal peptide was detected with SignalP, with D-cutoff values set to “sensitive” (version 4.1; option eukaryotic; Petersen et al., 2011), and if TMHMM with default parameters (version 2.0; Melén et al., 2003) found either no transmembrane helix or one helix overlapping the signal peptide; and (b) protein subcellular localization: proteins were considered secreted if they were assigned to the secretory pathway by TargetP with the -N option to exclude plants (version 1.1; Emanuelsson et al., 2000) and to the extracellular compartment by WoLF PSORT with the option “fungi” (version 0.2; Horton et al., 2007). To filter out proteins that permanently reside in the endoplasmic reticulum (ER) lumen, we scanned the proteins for the KDEL motif (Lys-Asp-Glu-Leu) in the C-terminal region (PROSITE accession “PS00014”) with PS-SCAN (version 1.79). Annotation of the secreted proteins was completed with BLASTP queries (E-value cutoff of 1e-5, retaining the best hit) against the following specialized databases: (1) CAZyme (http://www.cazy.org/), (2) MEROPS (http://merops.sanger.ac.uk/), and (3) Lipase Engineering Database (http://www.led.uni-stuttgart.de/), and against the following international sequence databases: (1) UniProt Swiss-Prot and (2) JGI MycoCosm. We also performed domain searches with the HMMER package (version 3.0, default parameters; Finn et al., 2011) for PFAM domains. To predict whether the secreted proteins targeted nuclei, we used PredictNLS (default parameters, version 1.0.20; https://rostlab.org/owiki/index.php/PredictNLS) to determine the presence of a nuclear localization signal. We also estimated the percentage of cysteine residues and the KR-rich regions of the secreted proteins.
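The decision rule described above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: it assumes the outputs of SignalP, TMHMM, TargetP, and WoLF PSORT have already been parsed into a per-protein record, and every field and function name is hypothetical. It also assumes that criteria (a) and (b) must both hold, that "S" and "extr" are the labels used for the secretory pathway and the extracellular compartment, and that ER retention can be approximated by a literal C-terminal KDEL (the PROSITE pattern scanned by PS-SCAN may accept more variants).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Predictions:
    """Parsed per-protein tool calls (all field names are hypothetical)."""
    sequence: str
    has_signal_peptide: bool                 # SignalP call
    signal_peptide_end: int = 0              # last residue of the signal peptide, 1-based
    tm_helices: List[Tuple[int, int]] = field(default_factory=list)  # TMHMM (start, end)
    targetp_location: str = "_"              # assumed label: "S" = secretory pathway
    wolfpsort_location: str = ""             # assumed label: "extr" = extracellular


def tm_topology_ok(p: Predictions) -> bool:
    """Accept no helix, or exactly one helix that overlaps the signal peptide."""
    if not p.tm_helices:
        return True
    if len(p.tm_helices) == 1:
        start, _end = p.tm_helices[0]
        return start <= p.signal_peptide_end
    return False


def has_er_retention_motif(sequence: str) -> bool:
    """Simplified check: literal KDEL at the C-terminus (assumption, see lead-in)."""
    return sequence.upper().endswith("KDEL")


def is_secreted(p: Predictions) -> bool:
    """Combine criteria (a) and (b), then drop ER-resident candidates."""
    return (
        p.has_signal_peptide
        and tm_topology_ok(p)
        and p.targetp_location == "S"
        and p.wolfpsort_location == "extr"
        and not has_er_retention_motif(p.sequence)
    )
```

Keeping each criterion in its own function makes it easy to report which filter removed a given protein, which is useful when comparing secretome sizes across genomes.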
We considered secretome proteins smaller than 300 amino acids to be small secreted proteins (SSPs). Data mining, comparison, and figure plotting were performed using R (R Core Team, 2014; http://www.R-project.org/) and an in-house Python script.
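The size threshold and the cysteine percentage can be computed directly from the protein sequences. The sketch below is only an illustration of those two calculations and does not reproduce the in-house script; the function names and the constant are hypothetical, and "smaller than 300 amino acids" is read as a strict inequality.

```python
SSP_MAX_LENGTH = 300  # threshold from the text; proteins must be strictly shorter


def cysteine_percent(sequence: str) -> float:
    """Percentage of residues that are cysteine (0.0 for an empty sequence)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return 100.0 * sequence.count("C") / len(sequence)


def is_ssp(sequence: str) -> bool:
    """True if a secreted protein is shorter than the SSP length threshold."""
    return len(sequence) < SSP_MAX_LENGTH
```

For example, `cysteine_percent("ACCA")` gives 50.0, and a 300-residue protein is not classified as an SSP under this reading of the threshold.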