Peak lists (38 058 spectra) were searched with Mascot 2.2 using the following parameters: enzyme = trypsin (allowing for cleavage before proline; ref 27); maximum missed cleavages = 2; variable modifications = carbamidomethylation of cysteine, oxidation of methionine; product mass tolerance = 0.5 Da. The International Protein Index (IPI) database version 337 (Mus musculus) was used as the protein sequence database. Common external contaminants from cRAP (a maintained list of contaminants, laboratory proteins and protein standards provided through the Global Proteome Machine Organisation, http://www.thegpm.org/crap/index.html) were appended. The combined database contained 51 355 sequences and 23 635 027 residues. For FDR assessment, a separate decoy database was generated from the protein sequence database using the decoy.pl Perl script provided by Matrix Science. This script randomizes each entry but retains the average amino acid composition and the length of the entries.
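The decoy construction described above can be illustrated with a minimal Python sketch. It is not the decoy.pl script itself: it assumes that "average amino acid composition" refers to residue frequencies pooled over the whole database, and the function names, the `DECOY_` accession prefix and the fixed random seed are hypothetical choices made only for this illustration.

```python
import random
from collections import Counter


def residue_frequencies(entries):
    """Return residues and their relative frequencies pooled over all entries."""
    counts = Counter()
    for sequence in entries.values():
        counts.update(sequence)
    total = sum(counts.values())
    residues = sorted(counts)
    weights = [counts[r] / total for r in residues]
    return residues, weights


def make_decoys(entries, seed=0):
    """Build one random decoy per entry with the same length as the target.

    Residues are drawn from the database-wide composition (an assumption),
    so the average composition is retained while each sequence is randomized.
    """
    rng = random.Random(seed)
    residues, weights = residue_frequencies(entries)
    decoys = {}
    for accession, sequence in entries.items():
        drawn = rng.choices(residues, weights=weights, k=len(sequence))
        decoys["DECOY_" + accession] = "".join(drawn)
    return decoys


if __name__ == "__main__":
    # Toy sequences, not taken from IPI.
    targets = {"P1": "MKTAYIAKQR", "P2": "GGSLLKDEW"}
    for accession, sequence in make_decoys(targets).items():
        print(accession, sequence)
```

Because the decoy database is kept separate from the target database, each spectrum is searched against both, and the decoy hits above a given score threshold serve as the estimate of false target hits.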
Data were searched at 100 ppm peptide mass tolerance to evaluate the mass accuracy of the data set. After correction (ref 25) of a systematic mass deviation of 3 ppm, 90% and 99% of all PSMs with a Mascot score greater than 30 fell within ±5 and ±20 ppm mass windows, respectively. For the most stringent mass tolerance setting, where Mascot thresholds are most sensitive, the data were searched at 20 ppm. In addition, data were searched at 500 ppm peptide mass tolerance to enable mass accuracy filtering combined with the adjusted MHT (Adjusted Mascot Threshold, AMT; ref 25). The mass deviation filter was set to 5 ppm, which was shown to be the most effective filter setting in combination with the AMT (Supporting Information Figure 1).
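The mass deviation filter can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of ref 25: the systematic shift is estimated here as the median ppm error when no value is supplied, PSMs are represented as hypothetical (observed mass, theoretical mass) pairs, and the function names are invented for this example.

```python
from statistics import median


def ppm_error(observed_mass, theoretical_mass):
    """Relative mass deviation in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def filter_by_mass_deviation(psms, tolerance_ppm=5.0, systematic_shift_ppm=None):
    """Keep PSMs whose shift-corrected deviation lies within +/- tolerance_ppm.

    psms: list of (observed_mass, theoretical_mass) pairs.
    systematic_shift_ppm: known systematic deviation; if None, the median
    ppm error of the input is used (an assumption of this sketch).
    Returns the retained PSMs and the shift that was applied.
    """
    if not psms:
        return [], 0.0
    errors = [ppm_error(obs, theo) for obs, theo in psms]
    if systematic_shift_ppm is None:
        systematic_shift_ppm = median(errors)
    kept = [
        psm
        for psm, err in zip(psms, errors)
        if abs(err - systematic_shift_ppm) <= tolerance_ppm
    ]
    return kept, systematic_shift_ppm


if __name__ == "__main__":
    # Toy values: deviations of roughly 3, 5, 1 and 30 ppm at 1000 Da.
    toy = [(1000.003, 1000.0), (1000.005, 1000.0),
           (1000.001, 1000.0), (1000.030, 1000.0)]
    kept, shift = filter_by_mass_deviation(toy, tolerance_ppm=5.0,
                                           systematic_shift_ppm=3.0)
    print(len(kept), "of", len(toy), "PSMs retained after a", shift, "ppm correction")
```

Searching with a wide 500 ppm tolerance and filtering afterwards keeps the filter width independent of the search engine setting, so different windows can be compared without repeating the search.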