AP-MS Data Preprocessing and Differential Analysis

The data were first filtered based on the label-free quantification intensities (LFQi) in five steps: (i) removal of proteins labeled as “only identified by site”, “potential contaminant”, or “reverse”; (ii) removal of all observations with an LFQi equal to 0; (iii) removal of outlier samples (based on low overall LFQi; see Fig S3); (iv) removal of proteins not present in at least 60% of the samples of a group, applied to each group (a group is the collection of three biological replicates with two technical replicates each for one condition, giving a maximum group size of 6); and (v) filtering against the negative control sample, which consists only of the beads used for the AP-MS sample preparations, by retaining for further analysis only proteins found at significantly higher levels in the samples than in the negative control.

In MS-based proteomics data, there are typically two types of missing values: missing not at random (MNAR) and missing at random (MAR) (Lazar et al, 2016). A mixed imputation strategy was chosen, with kNN imputation for MAR values (Gatto & Lilley, 2012; Gatto et al, 2021; Rainer et al, 2022). All other missing values were considered MNAR and imputed with the value 0. After imputation, differential interaction analysis was performed for each group against the bead control. P-values were adjusted using the FDR correction described by Benjamini and Hochberg (1995). Afterward, the proteins significantly enriched in the sample were extracted for each group (cutoffs: adjusted P-value < 0.01, log fold change > 1).

After filtering, the data were transformed to have consistent protein and gene name annotations. MaxQuant reports proteins as UniProt IDs, which were mapped to HGNC gene names using the HGNC database (retrieved 12/2021). However, one UniProt ID can correspond to multiple HGNC gene names. In these cases, the gene names of interest were selected manually. Finally, the HGNC names were mapped to gene IDs of the SysGO database (Luthert & Kiel, 2020). A couple of proteins could not be found in the SysGO database, and one protein was renamed (i.e., HGNC name: PHB1, which was renamed PHD for SysGO).

Then, the technical replicates were merged using the median. In summary, we obtained a dataset of biological triplicates, available as raw LFQi (Table S2) or log₂-transformed values (Table S3). Data preparation was performed in R (http://www.r-project.org/index.html) using the following packages: dplyr (Beckerman et al, 2017), tidyr (Wickham et al, 2019), stringr (Wickham, 2010), tidyxl, purrr (Mailund, 2019), DEP (Zhang et al, 2018), and limma (Ritchie et al, 2015; Phipson et al, 2016). The script file for the data preparation and the data before and after preparation are available on Zenodo (Camille et al, 2022).
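The merging of technical replicates by the median can be illustrated with a minimal Python sketch. This is not the authors' R script (which is available on Zenodo); the record layout, the function name `merge_technical_replicates`, and the protein and condition labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from math import log2
from statistics import median


def merge_technical_replicates(records):
    """Collapse technical replicates to one value per biological replicate.

    `records` is an iterable of (protein, condition, bio_rep, lfq) tuples,
    with one tuple per technical replicate (hypothetical layout).
    """
    grouped = defaultdict(list)
    for protein, condition, bio_rep, lfq in records:
        grouped[(protein, condition, bio_rep)].append(lfq)
    # One median value per (protein, condition, biological replicate).
    return {key: median(values) for key, values in grouped.items()}


# Hypothetical intensities: two technical replicates for biological
# replicate 1, a single surviving technical replicate for replicate 2.
records = [
    ("PROT_A", "condition_1", 1, 4.0e6),
    ("PROT_A", "condition_1", 1, 6.0e6),
    ("PROT_A", "condition_1", 2, 8.0e6),
]
merged = merge_technical_replicates(records)

# log2 transformation of the merged values (only defined for values > 0).
log2_merged = {key: log2(value) for key, value in merged.items()}
```

With two technical replicates the median equals their mean; a single surviving replicate is passed through unchanged.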

Table S2. Raw AP-MS LFQ intensity data with biological triplicates.

Table S3. Log₂-transformed AP-MS data with biological triplicates.

Ternet C., Junk P., Sevrin T., Catozzi S., Wåhlén E., Heldin J., Oliviero G., Wynne K, & Kiel C. (2023). Analysis of context-specific KRAS–effector (sub)complexes in Caco-2 cells. Life Science Alliance, 6(5), e202201670.

Publication 2023

Corresponding organization: University of Pavia

Other organizations: Uppsala University

Variable analysis

independent variables

Label-free quantification intensities (LFQi)

dependent variables

Proteins significantly enriched in the samples compared to the negative control
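Enrichment calls of this kind combine Benjamini-Hochberg adjusted P-values with the reported cutoffs (adjusted P-value < 0.01, log fold change > 1). The following Python sketch shows the arithmetic only; the original analysis was run in R with DEP and limma, and the function names, protein labels, and numbers below are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enriched_proteins(results, alpha=0.01, min_log_fc=1.0):
    """Select proteins passing both cutoffs.

    `results` is a list of (protein, log_fc, p_value) tuples for one
    group tested against the bead control (hypothetical layout).
    """
    adjusted = benjamini_hochberg([p for _, _, p in results])
    return [
        protein
        for (protein, log_fc, _), adj_p in zip(results, adjusted)
        if adj_p < alpha and log_fc > min_log_fc
    ]


# Hypothetical test results for four proteins.
results = [
    ("PROT_A", 2.5, 0.001),
    ("PROT_B", 1.8, 0.04),
    ("PROT_C", 0.4, 0.03),
    ("PROT_D", 0.1, 0.5),
]
adjusted = benjamini_hochberg([p for _, _, p in results])
hits = enriched_proteins(results)
```

Here only PROT_A passes both cutoffs: its adjusted P-value is 0.001 × 4 / 1 = 0.004, whereas PROT_B has a sufficient fold change but an adjusted P-value above 0.01.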

control variables

Removal of proteins labeled as 'only identified by site', 'potential contaminant', and 'reverse'
Removal of all observations with LFQi equal to 0
Removal of outlier samples based on low overall LFQi
Removal of proteins not present in at least 60% of the samples in a group
Filtering against the negative control sample (beads used for the AP-MS sample preparations)
Imputation of missing values: kNN imputation for MAR values, value 0 for MNAR values
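As an illustration of the 60% presence filter listed above, the following Python sketch checks one protein group by group. It assumes the filter is evaluated separately for each group, which is one reading of the description; the function name, the group labels, and the convention that 0 or None marks a missing intensity are assumptions made for this example.

```python
def groups_passing_presence_filter(intensities_by_group, min_percent=60):
    """Return the groups in which a protein is present often enough.

    `intensities_by_group` maps a group name to the LFQ intensities of
    one protein across that group's samples (at most 6 per group:
    3 biological x 2 technical replicates). A value of 0 or None is
    treated as "not present" (assumption).
    """
    passing = []
    for group, values in intensities_by_group.items():
        total = len(values)
        present = sum(1 for v in values if v is not None and v > 0)
        # Integer comparison avoids floating-point edge cases at exactly 60%.
        if total > 0 and present * 100 >= min_percent * total:
            passing.append(group)
    return passing


# Hypothetical intensities for one protein in two conditions.
protein_intensities = {
    "condition_1": [1.2e6, 2.0e6, 0, 3.1e6, 4.4e6, None],  # 4 of 6 present
    "condition_2": [0, 0, 1.0e6, None, 2.2e6, 3.0e6],      # 3 of 6 present
}
kept_groups = groups_passing_presence_filter(protein_intensities)
```

For a full group of six samples, the 60% threshold means a protein must be quantified in at least four samples; if an outlier sample was removed and five remain, three are enough.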

negative control

Beads used for the AP-MS sample preparations
