Robust Cell-Type Identification via scRNA-seq Alignment

Initial data exploration revealed that clustering was driven by individual of origin in addition to cell type identity, which makes it difficult to analyze changes in the relative abundance or gene expression of a given cell type across disease progression or brain regions. To recover clusters defined mainly by cell type identity, data were aligned across samples from each brain region using scAlign^{65} (version 1.0.0), which leverages a neural network to learn a low-dimensional alignment space in which cells from different datasets group by biological function independent of technical and experimental factors.
As noted by Johansen & Quon^{65}, scAlign converges faster with little loss of performance when the input data are represented by principal components or canonical correlation vectors. Therefore, prior to running scAlign, the top 2000 genes with the highest combined biological variance were used as the feature set for canonical correlation analysis (CCA), which was implemented using Seurat::RunMultiCCA with parameter num.cc = 15. The number of canonical coordinates to use for scAlign was determined by the elbow method using Seurat::MetageneBicorPlot. scAlign was then run on the cell loadings along the top 10 canonical correlation vectors with the parameters options = scAlignOptions(steps = 10000, log.every = 5000, architecture = 'large', num.dim = 64), encoder.data = 'cca', supervised = 'none', run.encoder = TRUE, run.decoder = FALSE, log.results = TRUE, and device = 'CPU'.
Clustering was then performed on the full dimensionality of the output from scAlign using Seurat::FindClusters with parameter resolution = 0.8 for the SFG and resolution = 0.6 for the EC. Clusters were visualized with tSNE using Seurat::RunTSNE on the full dimensionality of the output from scAlign with parameter do.fast = TRUE. Alignment using scAlign followed by clustering was also performed for all samples from both brain regions jointly.
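The elbow method mentioned above is typically applied by inspecting a plot by eye. As a rough illustration only, the Python sketch below locates an elbow numerically as the point farthest from the straight line joining the first and last values of a decreasing curve. This is a swapped-in heuristic, not the procedure implemented by Seurat::MetageneBicorPlot or used by the authors; the function name elbow_index and the toy values are hypothetical.

```python
def elbow_index(values):
    """Return the 1-based position of the elbow in a decreasing curve.

    Heuristic (an assumption of this sketch): the elbow is the point with
    the largest perpendicular distance from the chord joining the first
    and last points of the curve.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three points to locate an elbow")
    # Chord from (0, values[0]) to (n - 1, values[-1]).
    dx = float(n - 1)
    dy = float(values[-1]) - float(values[0])
    norm = (dx * dx + dy * dy) ** 0.5
    best_pos, best_dist = 0, -1.0
    for i, y in enumerate(values):
        # Perpendicular distance of point (i, y) from the chord.
        dist = abs(dy * i - dx * (float(y) - float(values[0]))) / norm
        if dist > best_dist:
            best_pos, best_dist = i, dist
    return best_pos + 1


# Toy, made-up curve standing in for a per-component correlation strength.
toy_curve = [1.0, 0.6, 0.35, 0.2, 0.18, 0.17, 0.16, 0.15]
n_components = elbow_index(toy_curve)
```

In practice the numeric result would be checked against the plot itself, since a curve with a shallow or ambiguous bend can place the farthest point somewhere a human reader would not call an elbow.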
To assign clusters identified in the aligned subspace generated by scAlign to major brain cell types, the following marker genes were used: SLC17A7 and CAMK2A for excitatory neurons, GAD1 and GAD2 for inhibitory neurons, SLC1A2 and AQP4 for astrocytes, MBP and MOG for oligodendrocytes, PDGFRA and SOX10 for oligodendrocyte precursor cells (OPCs), CD74 and CX3CR1 for microglia/myeloid cells, and CLDN5 and FLT1 for endothelial cells. Clusters expressing markers for more than one cell type, most likely reflecting doublets, were removed from downstream analyses.
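The marker-based assignment described above can be sketched in a few lines of Python. The marker table is taken from the text; everything else is an assumption made for illustration: the function name annotate_cluster, the use of per-cluster mean expression, the threshold of 1.0, and the rule that both markers of a cell type must pass the threshold. The source does not state how marker expression was scored, so this is not the authors' procedure.

```python
# Marker genes per major brain cell type, as listed in the text.
MARKERS = {
    "excitatory neuron": ("SLC17A7", "CAMK2A"),
    "inhibitory neuron": ("GAD1", "GAD2"),
    "astrocyte": ("SLC1A2", "AQP4"),
    "oligodendrocyte": ("MBP", "MOG"),
    "OPC": ("PDGFRA", "SOX10"),
    "microglia/myeloid": ("CD74", "CX3CR1"),
    "endothelial": ("CLDN5", "FLT1"),
}


def annotate_cluster(mean_expr, threshold=1.0):
    """Assign a cell type to one cluster from its mean marker expression.

    mean_expr maps gene symbol -> mean expression in the cluster.
    The threshold and the "both markers must pass" rule are hypothetical.
    """
    hits = [
        cell_type
        for cell_type, genes in MARKERS.items()
        if all(mean_expr.get(gene, 0.0) >= threshold for gene in genes)
    ]
    if len(hits) == 1:
        return hits[0]
    if len(hits) > 1:
        # Markers of several cell types in one cluster: likely doublets.
        return "putative doublet"
    return "unassigned"


# Made-up cluster profiles for demonstration only.
clusters = {
    "c0": {"GAD1": 2.0, "GAD2": 1.5},
    "c1": {"GAD1": 2.0, "GAD2": 2.0, "MBP": 3.0, "MOG": 2.0},
}
labels = {name: annotate_cluster(expr) for name, expr in clusters.items()}
# Drop clusters flagged as doublets before downstream analysis.
kept = [name for name, label in labels.items() if label != "putative doublet"]
```

Requiring both markers keeps the OPC label from firing on oligodendrocyte clusters that express SOX10 without PDGFRA, which is one reason this sketch uses pairs jointly and does not score each gene on its own.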


Leng K., Li E., Eser R., Piergies A., Sit R., Tan M., Neff N., Li S.H., Rodriguez R.D., Suemoto C.K., Leite R.E., Ehrenberg A.J., Pasqualucci C.A., Seeley W.W., Spina S., Heinsen H., Grinberg L.T, & Kampmann M. (2021). Molecular characterization of selectively vulnerable neurons in Alzheimer’s Disease. Nature neuroscience, 24(2), 276-287.

Publication 2020



Other organizations : Institute for Neurodegenerative Disorders, University of California, San Francisco, Chan Zuckerberg Initiative (United States), Universidade de São Paulo


Protocol cited in 11 other protocols

Variable analysis

independent variables

Alignment of data across samples from each brain region using scAlign

dependent variables

Cluster definitions by cell type identity
Relative abundance or gene expression of a given cell type across disease progression or brain regions

control variables

The number of canonical coordinates to use for scAlign was determined by the elbow method using Seurat::MetageneBicorPlot
scAlign was run with the parameters options = scAlignOptions(steps = 10000, log.every = 5000, architecture = 'large', num.dim = 64), encoder.data = 'cca', supervised = 'none', run.encoder = TRUE, run.decoder = FALSE, log.results = TRUE, and device = 'CPU'
Clustering was performed on the full dimensionality of the output from scAlign using Seurat::FindClusters with parameter resolution = 0.8 for the SFG and resolution = 0.6 for the EC
Alignment using scAlign followed by clustering was also performed for all samples from both brain regions jointly

