data acquisition, Thermo RAW files were processed using
a series of software tools that were developed in-house. First, the
RAW files were converted to mzXML using a custom version of ReAdW.exe that
had been modified to export ion accumulation times and FT peak noise.
During this initial processing we also corrected any erroneous assignments
of monoisotopic m/z. Using Sequest,24 MS2 spectra were searched against the human
UniProt database (downloaded on 08/02/2011), supplemented with the
sequences of common contaminating proteins such as trypsin. This forward
database was followed by a decoy component, which included all target
protein sequences in reversed order.
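
A minimal sketch of how such a concatenated target-decoy database could be assembled. The function name, the `REV_` accession prefix, and the in-memory dictionary input are illustrative assumptions, not details of the in-house tools.

```python
def build_target_decoy(targets, decoy_prefix="REV_"):
    """Return target entries followed by reversed-sequence decoy entries.

    targets: dict mapping accession -> protein sequence.
    decoy_prefix: hypothetical tag used to mark decoy accessions.
    """
    forward = list(targets.items())
    # Each decoy is the corresponding target sequence read back to front.
    decoys = [(decoy_prefix + acc, seq[::-1]) for acc, seq in targets.items()]
    # Forward (target) component first, decoy component appended after it.
    return forward + decoys
```
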
Searches were performed
using a 50 ppm precursor ion tolerance.25 When searching Orbitrap MS2 data, we used a 0.02 Th fragment ion tolerance.
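
The two kinds of tolerance differ in that the precursor window is relative (ppm) while the fragment window is absolute (Th). The sketch below illustrates that distinction; the function names are hypothetical and the search engine's own matching logic may differ.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_precursor_tolerance(observed_mz, theoretical_mz, tol_ppm=50.0):
    """Relative tolerance: the allowed window scales with m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def within_fragment_tolerance(observed_mz, theoretical_mz, tol_th):
    """Absolute tolerance in Th: the allowed window is the same at any m/z."""
    return abs(observed_mz - theoretical_mz) <= tol_th
```
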
The fragment ion tolerance was set to 1.0 Th when searching ITMS2
data. Only peptide sequences with both termini consistent with the
protease specificity of LysC were considered in the database search,
and up to two missed cleavages were accepted. TMT tags on lysine residues
and peptide N-termini (+229.162932 Da) and carbamidomethylation of
cysteine residues (+57.02146 Da) were set as static modifications,
while oxidation of methionine residues (+15.99492 Da) was treated
as a variable modification. An MS2 spectral assignment false discovery
rate of less than 1% was achieved by applying the target-decoy strategy.26 Filtering was performed using linear discriminant
analysis as described previously27 to create
one composite score from the following peptide ion and MS2 spectra
properties: Sequest parameters XCorr and unique ΔCn, peptide
length and charge state, and precursor ion mass accuracy. The resulting
discriminant scores were used to sort peptides prior to filtering
to a 1% FDR, and the probability that each peptide-spectral-match
was correct was calculated using the posterior error histogram.
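
The score-based filtering step can be sketched as follows. This is an illustration only: it assumes the FDR at a given score threshold is estimated as the number of decoy hits divided by the number of target hits, it takes the discriminant scores as already computed, and it does not reproduce the posterior error calculation. The function name and the tuple layout are hypothetical.

```python
def filter_to_fdr(psms, max_fdr=0.01):
    """Keep the largest top-scoring set of PSMs whose estimated FDR <= max_fdr.

    psms: list of (discriminant_score, is_decoy) tuples.
    Returns the target PSMs in the accepted set, best score first.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs to accept
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Assumed estimator: FDR = decoy hits / target hits above threshold.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = i
    return [p for p in ranked[:cutoff] if not p[1]]
```
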
Following spectral assignment, peptides were assembled into proteins
and proteins were further filtered based on the combined probabilities
of their constituent peptides to a final FDR of 1%. In cases of redundancy,
shared peptides were assigned to the protein sequence with the most
matching peptides, thus adhering to principles of parsimony.28
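
As an illustration of the shared-peptide rule, the sketch below assigns each peptide to the candidate protein with the most matching peptides. The function name and input layout are hypothetical, and the alphabetical tie-break is an assumption, since the text does not say how ties were resolved.

```python
from collections import defaultdict


def assign_shared_peptides(peptide_to_proteins):
    """Map each peptide to the candidate protein with the most matching peptides.

    peptide_to_proteins: dict mapping peptide sequence -> set of accessions.
    """
    # Count how many identified peptides match each protein.
    counts = defaultdict(int)
    for proteins in peptide_to_proteins.values():
        for prot in proteins:
            counts[prot] += 1
    # Pick the best-supported protein; ties fall back to alphabetical
    # order of the accession (an assumption made for determinism).
    return {
        pep: min(proteins, key=lambda p: (-counts[p], p))
        for pep, proteins in peptide_to_proteins.items()
    }
```
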