Integrative Single-Cell Analysis of Tumor-Infiltrating Lymphocytes and LCMV Infection

Prior to dataset integration, single-cell data from individual studies were filtered using TILPRED-1.0 (https://github.com/carmonalab/TILPRED), which removes cells not enriched in T cell markers (e.g., Cd2, Cd3d, Cd3e, Cd3g, Cd4, Cd8a, Cd8b1) and cells enriched in non-T cell genes (e.g., Spi1, Fcer1g, Csf1r, Cd19). Dataset integration was performed using STACAS¹⁴ (https://github.com/carmonalab/STACAS), a batch-correction algorithm based on Seurat¹². For the TIL reference map, we specified 800 variable genes per dataset, excluding cell-cycle, mitochondrial, ribosomal, and non-coding genes, as well as genes expressed in <0.1% or >90% of the cells of a given dataset. For integration, a total of 800 variable genes were selected from the 800 variable genes of the individual datasets, prioritizing genes found in multiple datasets and, in case of ties, those derived from the largest datasets. We calculated pairwise dataset anchors using STACAS with default parameters and filtered anchors using an anchor score threshold of 0.8. Integration was performed using the IntegrateData function in Seurat, providing the anchor set identified by STACAS and a custom integration tree to initiate alignment from the largest and most heterogeneous datasets.
Similarly, to construct the LCMV reference map, we split the datasets into five batches that displayed strong technical differences and applied STACAS to mitigate their confounding effects. We computed 800 variable genes per batch, excluding cell-cycle, ribosomal, and mitochondrial genes, and computed pairwise anchors using 200 integration genes and otherwise default STACAS parameters. Anchors were filtered at the default threshold (0.8 percentile), and integration was performed with the Seurat IntegrateData function using the guide tree suggested by STACAS.
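
The selection of a shared set of integration genes described above can be illustrated with a minimal Python sketch. It is not the STACAS implementation. The function name `consensus_variable_genes`, the `blocklist` argument, and the toy datasets are hypothetical. Breaking ties by the size of the largest dataset that lists a gene is only one possible reading of "those derived from the largest datasets".

```python
def consensus_variable_genes(variable_genes, n_cells, n_features, blocklist=frozenset()):
    """Pick n_features genes from per-dataset variable-gene lists.

    variable_genes: dict mapping dataset name -> list of variable genes
    n_cells:        dict mapping dataset name -> number of cells
    blocklist:      genes to exclude (e.g. cell-cycle, ribosomal, mitochondrial)

    Genes are ranked by how many datasets list them; ties are broken by the
    size of the largest dataset listing the gene (an assumption), then by
    gene name so that the result is deterministic.
    """
    seen_in = {}   # gene -> number of datasets listing it
    largest = {}   # gene -> size of the largest dataset listing it
    for name, genes in variable_genes.items():
        for gene in set(genes) - set(blocklist):
            seen_in[gene] = seen_in.get(gene, 0) + 1
            largest[gene] = max(largest.get(gene, 0), n_cells[name])
    ranked = sorted(seen_in, key=lambda g: (-seen_in[g], -largest[g], g))
    return ranked[:n_features]


# Toy input: three hypothetical datasets of different sizes.
variable_genes = {
    "A": ["Cd8a", "Pdcd1", "Tox", "Mki67"],
    "B": ["Cd8a", "Tox", "Gzmb"],
    "C": ["Cd8a", "Il7r"],
}
n_cells = {"A": 5000, "B": 2000, "C": 800}

selected = consensus_variable_genes(
    variable_genes, n_cells, n_features=4, blocklist={"Mki67"}
)
print(selected)
```

In this toy input, Cd8a appears in all three datasets and Tox in two, so they rank first. The remaining genes each appear once and are ordered by the size of the dataset that contributed them.
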
For both the TIL and LCMV atlases, we performed unsupervised clustering of the integrated cell embeddings using the Shared Nearest Neighbor (SNN) clustering method⁶⁴ implemented in Seurat, with parameters {resolution = 0.6, reduction = "umap", k.param = 20} for the TIL atlas and {resolution = 0.4, reduction = "pca", k.param = 20} for the LCMV atlas. We then manually annotated individual clusters (merging clusters when necessary) based on several criteria: (i) average expression of key marker genes in individual clusters; (ii) gradients of gene expression over the UMAP representation of the reference map; (iii) gene-set enrichment analysis to identify over- and under-expressed genes per cluster using MAST⁶⁵. To have access to predictive methods for UMAP, we recomputed the PCA and UMAP embeddings independently of Seurat, using the prcomp function from the base R package "stats" and the "umap" R package (https://github.com/tkonopka/umap), respectively.
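
To illustrate the idea behind SNN clustering, the following Python sketch builds a shared-nearest-neighbor graph from a low-dimensional embedding: two cells are linked with a weight equal to the Jaccard overlap of their k-nearest-neighbor sets. This is a simplified illustration, not Seurat's code. The function names and the `min_jaccard` cutoff are hypothetical, and the community-detection step that the resolution parameter controls is not shown.

```python
import math


def knn_sets(points, k):
    """For each point, return the indices of its k nearest points (itself included)."""
    neighbours = []
    for p in points:
        order = sorted(range(len(points)), key=lambda j: (math.dist(p, points[j]), j))
        neighbours.append(set(order[:k]))
    return neighbours


def snn_graph(points, k, min_jaccard=0.0):
    """Return {(i, j): weight} where weight is the Jaccard overlap of kNN sets.

    Edges whose weight is not strictly above min_jaccard (a hypothetical
    cutoff) are dropped.
    """
    nn = knn_sets(points, k)
    edges = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            weight = len(nn[i] & nn[j]) / len(nn[i] | nn[j])
            if weight > min_jaccard:
                edges[(i, j)] = weight
    return edges


# Toy embedding: two well-separated groups of three cells each.
embedding = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
edges = snn_graph(embedding, k=3)
print(sorted(edges))
```

With k = 3, each cell's neighbor set is exactly its own group, so the graph contains only within-group edges. A community-detection algorithm run on such a graph would recover the two groups as clusters.
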

Andreatta M., Corria-Osorio J., Müller S., Cubas R., Coukos G, & Carmona S.J. (2021). Interpretation of T cell states from single-cell transcriptomics data using reference atlases. Nature Communications, 12, 2965.

Publication 2021

Other organizations : Ludwig Cancer Research, University of Lausanne, SIB Swiss Institute of Bioinformatics

Protocol cited in 19 other protocols

Variable analysis

independent variables

Batch correction algorithm STACAS
Integration of single-cell data using STACAS
Selection of variable genes for dataset integration
Unsupervised clustering of integrated cell embeddings using Shared Nearest Neighbor (SNN) method
Manual annotation of individual clusters based on marker gene expression, gene expression gradients, and gene-set enrichment analysis

dependent variables

Clustering of integrated single-cell data
Identification and annotation of cell types and states in the TIL and LCMV reference maps

control variables

Filtering of single-cell data using TILPRED-1.0 to remove non-T cells and enrich for T cell markers
Exclusion of cell cycling genes, mitochondrial, ribosomal, and non-coding genes, as well as genes expressed in <0.1% or >90% of cells in the dataset
Selection of 800 variable genes for dataset integration, prioritizing genes found in multiple datasets
