In the following two sections, we describe how to create a custom leukocyte signature matrix and apply it to study cellular heterogeneity and TIL survival associations in melanoma tumors profiled by The Cancer Genome Atlas (TCGA). Readers can follow along by creating ‘LM6’, a leukocyte RNA-Seq signature matrix comprising six peripheral blood immune subsets (B cells, CD8 T cells, CD4 T cells, NK cells, monocytes/macrophages, neutrophils; GSE60424 [20]). Key input files are provided on the CIBERSORT website (‘Menu>Download’).
A custom signature file can be created by uploading the Reference sample file and the Phenotype classes file (section 3.3.2) to the online CIBERSORT application (see Figure 2) or by using the downloadable Java package. To build a custom gene signature matrix with the latter, the user should download the Java package from the CIBERSORT website and place all relevant files in the package folder. To link Java with R, run the following in R, then launch CIBERSORT from the command line:
Within R:
> library(Rserve)
> Rserve(args="--no-save")
Command line:
> java -Xmx3g -Xms3g -jar CIBERSORT.jar -M Mixture_file -P Reference_sample_file -c phenotype_class_file -f
The last argument (-f) will eliminate non-hematopoietic genes from the signature matrix and is generally recommended for signature matrices tailored to leukocyte deconvolution. The user can also run this step on the website by choosing the corresponding reference sample file and phenotype class file (see Figure 2). The CIBERSORT website will generate a gene signature matrix located under ‘Uploaded Files’ for future download.
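For scripted runs, the command line above can be assembled programmatically. The following is a minimal Python sketch, not part of the CIBERSORT package: the helper name `build_cibersort_command` and its keyword arguments are hypothetical, and only the flags shown in the text (-M, -P, -c, -f, and the Java heap options) are used.

```python
import shlex


def build_cibersort_command(mixture, reference, phenotype_classes,
                            jar="CIBERSORT.jar", heap="3g",
                            filter_non_hematopoietic=True):
    """Return the argument list for the signature-matrix build shown in the text.

    Hypothetical helper: it only mirrors the flags given in the protocol.
    """
    argv = [
        "java", f"-Xmx{heap}", f"-Xms{heap}",
        "-jar", jar,
        "-M", mixture,
        "-P", reference,
        "-c", phenotype_classes,
    ]
    if filter_non_hematopoietic:
        # -f removes non-hematopoietic genes from the signature matrix.
        argv.append("-f")
    return argv


cmd = build_cibersort_command(
    "Mixture_file", "Reference_sample_file", "phenotype_class_file"
)
# Print a copy-pastable command line (shlex.join requires Python 3.8+).
print(shlex.join(cmd))
```

The list could then be passed to `subprocess.run(cmd, check=True)` from the package folder, on the assumption that Rserve has already been started in R as described above.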
Following signature matrix creation, quality control measures should be taken to ensure robust performance (see ‘Calibration of in silico TIL profiling methods’ in Newman et al.) [17]. Factors that can adversely affect signature matrix performance include poor input data quality, significant deviations in gene expression between cell types that reside in different tissue compartments (e.g., blood versus tissue), and cell populations with statistically indistinguishable expression patterns. Manual filtering of poorly performing genes in the signature matrix (e.g., genes expressed highly in the tumor of interest) may improve performance.
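One possible way to implement the manual filtering step is sketched below in Python. This is an illustration under stated assumptions rather than the authors' procedure: the function name, the use of a median, the cutoff value, and the placeholder gene names and expression values are all hypothetical, and the tumor profiles are assumed to come from tumor cells (for example, purified tumor cells or cell lines).

```python
from statistics import median


def filter_signature_genes(signature, tumor_expression, max_tumor_median):
    """Drop signature genes whose median tumor expression exceeds a cutoff.

    signature:        dict mapping gene -> {cell type: expression}
    tumor_expression: dict mapping gene -> list of expression values
    max_tumor_median: hypothetical cutoff, in the same units as the data
    Returns (kept_signature, removed_genes).
    """
    kept, removed = {}, []
    for gene, profile in signature.items():
        values = tumor_expression.get(gene)
        if values and median(values) > max_tumor_median:
            removed.append(gene)   # highly expressed in tumor: filter out
        else:
            kept[gene] = profile   # low or unmeasured in tumor: keep
    return kept, removed


# Toy data with placeholder gene names and made-up values.
signature = {
    "GENE_A": {"B cells": 900.0, "CD8 T cells": 12.0},
    "GENE_B": {"B cells": 15.0, "CD8 T cells": 640.0},
    "GENE_C": {"B cells": 20.0, "CD8 T cells": 30.0},
}
tumor_expression = {
    "GENE_A": [5.0, 8.0, 3.0],
    "GENE_B": [700.0, 950.0, 820.0],
}

kept, removed = filter_signature_genes(signature, tumor_expression, 100.0)
print("removed:", removed)
print("kept:", sorted(kept))
```

Genes without a tumor measurement are retained here; a stricter policy could drop them instead. Any filtered matrix should be re-validated as described above.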
To benchmark our custom leukocyte matrix (LM6), we compared it to LM22 using a set of TCGA lung squamous cell carcinoma tumors profiled by RNA-Seq and microarray (n = 130 pairs). Deconvolution results were significantly correlated for all cell subsets shared between the two signature matrices (P < 0.0001). Notably, since LM6 was derived from leukocytes isolated from peripheral blood [20, 21], we restricted the CD4 T cell comparison to naïve and resting memory CD4 T cells in LM22. Once validation is complete, a CIBERSORT signature matrix can be broadly applied to mixture samples as described in section 3.3 (e.g., see Figure 4).
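The comparison between two signature matrices can be sketched as follows in Python. Several points are assumptions made for illustration: the text does not state which correlation statistic was used (Pearson is shown here), summing the naïve and resting memory CD4 fractions is one plausible way to restrict the LM22 comparison, the LM22 column labels are assumed, and the fractions are toy values rather than TCGA results.

```python
from math import sqrt

# Assumed LM22 column labels for the CD4 subsets compared with LM6 CD4 T cells.
LM22_CD4_COLUMNS = ["T cells CD4 naive", "T cells CD4 memory resting"]


def collapse(fractions, columns):
    """Sum the selected LM22 fractions for one sample."""
    return sum(fractions[name] for name in columns)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Toy per-sample fractions (made up): LM6 CD4 T cells versus LM22 subsets.
lm6_cd4 = [0.10, 0.25, 0.40]
lm22_samples = [
    {"T cells CD4 naive": 0.04, "T cells CD4 memory resting": 0.06,
     "T cells CD4 memory activated": 0.02},
    {"T cells CD4 naive": 0.10, "T cells CD4 memory resting": 0.15,
     "T cells CD4 memory activated": 0.01},
    {"T cells CD4 naive": 0.15, "T cells CD4 memory resting": 0.25,
     "T cells CD4 memory activated": 0.03},
]

lm22_cd4 = [collapse(sample, LM22_CD4_COLUMNS) for sample in lm22_samples]
r = pearson(lm6_cd4, lm22_cd4)
print(f"CD4 T cells: r = {r:.3f}")
```

In practice the same calculation would be repeated for each cell subset shared by the two matrices, using paired samples, and accompanied by a significance test.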
Chen B., Khodadoust M.S., Liu C.L., Newman A.M., & Alizadeh A.A. (2018). Profiling tumor infiltrating immune cells with CIBERSORT. Methods in Molecular Biology (Clifton, N.J.), 1711, 243-259.