Deconvolving Cell Type Fractions from Bulk Transcriptomics

A number of computational methods have been proposed to infer cell type abundance, cell type-specific gene expression profiles (GEPs), or both from bulk tissue expression profiles^{2–8}. These methods generally assume that biological mixture samples can be modeled as a system of linear equations: a single mixture transcriptome m with n genes is represented as the product of H and f, where H is an n × c cell type expression matrix containing expression profiles for the same n genes across c distinct cell types, and f is a vector of size c containing the cell type mixing proportions.
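As a minimal sketch of this linear mixture model, the following Python snippet builds a mixture transcriptome m as the product of H and f. The helper name `mix` and all numeric values are hypothetical toy choices for illustration; they are not taken from the paper.

```python
def mix(H, f):
    """Return m = H f for an n x c matrix H (list of rows) and a length-c vector f."""
    return [sum(h_ij * f_j for h_ij, f_j in zip(row, f)) for row in H]


# Hypothetical toy data: n = 4 genes, c = 2 cell types (non-log linear space).
H = [
    [10.0, 0.0],  # gene expressed only in cell type 1
    [0.0, 8.0],   # gene expressed only in cell type 2
    [5.0, 5.0],   # gene expressed equally in both cell types
    [2.0, 6.0],   # gene expressed more strongly in cell type 2
]
f = [0.25, 0.75]  # assumed mixing proportions, summing to one

m = mix(H, f)  # bulk expression of each gene in the simulated mixture
```

Each entry of m is a proportion-weighted sum of the cell type-specific expression values for that gene, which is the assumption that deconvolution methods invert.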
To infer cell type abundance using this linear model within CIBERSORTx, let M be an n × k matrix with n genes and k mixture GEPs, let matrix B be a subset of H containing discriminatory marker genes for each of the c cell subsets (i.e., signature or basis matrix^{15,74,75}), and let M’ be the subset of M that contains the same marker genes as B. Given M’ and B, the following equation can then be used to impute F, a c × k fractional abundance matrix with columns [f₁,f₂,…,f_k]:

B \times F_{\bullet,j} = M'_{\bullet,j}, \quad 1 \leq j \leq k

where F_{i,j} ≥ 0 for all i, j, the system is overdetermined (i.e., n > c), and expression data in M’ and B are represented in non-log linear space^{76}. (Note that M_{i,•} and M_{•,j} denote row i and column j of matrix M, respectively.) Many methods either normalize F or impose an additional constraint on F such that, for each mixture sample, the inferred mixing coefficients sum to one, allowing F to be directly interpreted as cell type proportions (with respect to the cell subsets in B)^{3}. We previously introduced CIBERSORT as a method to estimate F using an implementation of ν-support vector regression, a machine learning technique that is robust to noise, unknown mixture content, and collinearity among cell type reference profiles^{15}. CIBERSORT was used to impute F in this work, and within this imputation workflow, the batch correction scheme described below was used for all cross-platform analyses, unless stated otherwise (Supplementary Table 1).
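To make the constrained system concrete, the sketch below estimates each column of F by non-negative least squares (solved with a simple projected gradient loop) and then normalizes each column to sum to one. This is a deliberately swapped-in, simpler technique for illustration: it is not ν-support vector regression and not the CIBERSORT implementation. The function names `nnls_projected_gradient` and `deconvolve`, the iteration count, and the toy matrices are all assumptions.

```python
def nnls_projected_gradient(B, m, iters=5000):
    """Approximately minimize ||B f - m||^2 subject to f >= 0 (illustrative only)."""
    n, c = len(B), len(B[0])
    # Gram matrix G = B^T B and vector b = B^T m.
    G = [[sum(B[g][i] * B[g][j] for g in range(n)) for j in range(c)]
         for i in range(c)]
    b = [sum(B[g][i] * m[g] for g in range(n)) for i in range(c)]
    # trace(G) bounds the largest eigenvalue of G, so 1 / trace(G) is a safe step.
    step = 1.0 / sum(G[i][i] for i in range(c))
    f = [0.0] * c
    for _ in range(iters):
        grad = [sum(G[i][j] * f[j] for j in range(c)) - b[i] for i in range(c)]
        # Gradient step followed by projection onto the non-negative orthant.
        f = [max(0.0, f[i] - step * grad[i]) for i in range(c)]
    return f


def deconvolve(B, M):
    """Return a c x k matrix F (list of rows) whose columns each sum to one."""
    n, k, c = len(M), len(M[0]), len(B[0])
    columns = []
    for j in range(k):
        m_j = [M[g][j] for g in range(n)]  # column j of M'
        f_j = nnls_projected_gradient(B, m_j)
        total = sum(f_j)
        # Normalize so coefficients can be read as proportions; skip if all zero.
        columns.append([x / total for x in f_j] if total > 0 else f_j)
    return [[columns[j][i] for j in range(k)] for i in range(c)]


# Hypothetical signature matrix B (4 marker genes x 2 cell types), linear space.
B = [[10.0, 0.0], [0.0, 8.0], [5.0, 5.0], [2.0, 6.0]]
# Hypothetical mixtures M' (4 genes x 2 samples), built as B times [0.25, 0.75]
# and B times [2, 0]; the second is unnormalized to show the rescaling step.
M_prime = [[2.5, 20.0], [6.0, 0.0], [5.0, 10.0], [5.0, 4.0]]

F = deconvolve(B, M_prime)
```

Because the toy mixtures are noise-free, the recovered columns of F should be close to the proportions used to build them. Real bulk data are noisy and reference profiles are often collinear, which is the setting where the source motivates ν-support vector regression instead.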


Newman A.M., Steen C.B., Liu C.L., Gentles A.J., Chaudhuri A.A., Scherer F., Khodadoust M.S., Esfahani M.S., Luca B.A., Steiner D., Diehn M., & Alizadeh A.A. (2019). Determining cell-type abundance and expression from bulk tissues with digital cytometry. Nature Biotechnology, 37(7), 773-782.

Publication 2019


Corresponding Organization: Stanford University


Protocol cited in 97 other protocols

Variable analysis

independent variables

Matrix B containing discriminatory marker genes for each of the c cell subsets (i.e., signature or basis matrix)

dependent variables

Fractional abundance matrix F, a c × k matrix with columns [f₁, f₂, …, f_k]

control variables

Expression data in M' and B are represented in non-log linear space
The system is overdetermined (i.e., n > c)
