Standardized Single-Cell RNA-seq Normalization

We downloaded raw read or UMI count matrices for all datasets from their respective sources. The one exception was the 3pV1 dataset from the PBMC analysis, which was originally quantified with the hg19 reference, whereas the other two PBMC datasets were quantified with GRCh38. For this dataset, we therefore downloaded the FASTQ files from the 10X website (Supplementary Table 8) and quantified gene expression counts using Cell Ranger¹¹,⁴¹ v2.1.0 with GRCh38. Starting from the raw count matrices, we applied the standard normalization procedure laid out below for all analyses, unless otherwise specified. Except for the L₂ normalization and within-batch variable gene detection, this procedure follows the standard guidelines of the Seurat single-cell analysis platform.
We filtered out cells with fewer than 500 genes or more than 20% mitochondrial reads. In the pancreas datasets, we filtered cells with the same thresholds used in Butler et al.⁷: 1750 genes for CelSeq, 2500 genes for CelSeq2, no filter for Fluidigm C1, 2500 genes for SmartSeq2, and 500 genes for inDrop. We then library-normalized each cell to 10,000 reads by multiplicative scaling and log-transformed the normalized data. Next, we identified the top 1000 variable genes, ranked by coefficient of variation, within each dataset, and pooled these genes to form the variable gene set of the analysis. Using only the variable genes, we mean-centered each gene and scaled it to unit variance across cells. Note that this scaling was done on the aggregate matrix, with all cells, rather than within each dataset separately. On these scaled values, we performed truncated SVD, keeping the top 30 eigenvectors. Finally, we multiplied the cell embeddings by the eigenvalues so that the eigenvectors were not given equal variance.
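The steps in this paragraph can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' code or the Seurat implementation: the function names (`filter_and_lognormalize`, `top_variable_genes`, `pooled_embedding`) and the `mito_mask` argument are hypothetical, the log transform is assumed to be log(1 + x), and the coefficient of variation is assumed to be computed on the log-normalized values, since the text does not specify either detail. All datasets are assumed to share the same gene order.

```python
import numpy as np


def filter_and_lognormalize(counts, mito_mask, min_genes=500,
                            max_mito_frac=0.20, scale_factor=10_000):
    """Filter cells, then library-normalize and log-transform one dataset.

    counts: cells x genes array of raw counts.
    mito_mask: boolean vector marking mitochondrial genes (hypothetical input).
    """
    counts = np.asarray(counts, dtype=float)
    genes_detected = (counts > 0).sum(axis=1)
    totals = counts.sum(axis=1)
    mito_frac = counts[:, mito_mask].sum(axis=1) / np.maximum(totals, 1.0)
    keep = (genes_detected >= min_genes) & (mito_frac <= max_mito_frac)
    kept = counts[keep]
    # Multiplicative scaling so every cell sums to scale_factor.
    lib_size = np.maximum(kept.sum(axis=1, keepdims=True), 1.0)
    scaled = kept / lib_size * scale_factor
    # Assumption: "log scaled" means log(1 + x).
    return np.log1p(scaled)


def top_variable_genes(lognorm, n_top=1000):
    """Indices of the n_top genes ranked by coefficient of variation."""
    mean = lognorm.mean(axis=0)
    std = lognorm.std(axis=0)
    # Genes with zero mean get CV = 0 so they rank last.
    cv = np.divide(std, mean, out=np.zeros_like(std), where=mean > 0)
    n_top = min(n_top, cv.size)
    return np.argsort(cv)[::-1][:n_top]


def pooled_embedding(datasets, mito_mask, n_top=1000, n_components=30,
                     **filter_kwargs):
    """Pool per-dataset variable genes, scale across all cells, run SVD."""
    lognorm = [filter_and_lognormalize(d, mito_mask, **filter_kwargs)
               for d in datasets]
    # Union of the variable genes found within each dataset.
    pooled = np.unique(
        np.concatenate([top_variable_genes(x, n_top) for x in lognorm]))
    # Scaling is done on the aggregate matrix, not per dataset.
    X = np.vstack(lognorm)[:, pooled]
    X = X - X.mean(axis=0)
    sd = X.std(axis=0)
    X = X / np.where(sd > 0, sd, 1.0)
    # Full SVD truncated to the leading components (fine for a small sketch).
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    k = min(n_components, S.size)
    # Weight each component by its singular value.
    return U[:, :k] * S[:k], pooled
```

Dataset-specific thresholds, such as those used for the pancreas datasets, could be applied by calling `filter_and_lognormalize` separately per dataset with a different `min_genes`. For large matrices, a dedicated truncated solver would be a more practical choice than the full decomposition used here.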


Korsunsky I., Millard N., Fan J., Slowikowski K., Zhang F., Wei K., Baglaenko Y., Brenner M., Loh P.R., & Raychaudhuri S. (2019). Fast, sensitive, and accurate integration of single cell data with Harmony. Nature Methods, 16(12), 1289-1296.

Publication 2019

Gene expression, Genes, Library, Mitochondrial, Pancreas, Single cell analysis


Other organizations : Brigham and Women's Hospital, Harvard University


Protocol cited in 721 other protocols

Variable analysis

independent variables

Quantification method used (Cell Ranger v2.1.0 with GRCh38 reference)

dependent variables

Gene expression counts

control variables

Thresholds used for filtering cells (fewer than 500 genes or more than 20% mitochondrial reads, dataset-specific thresholds for pancreas datasets)
Library normalization to 10,000 reads per cell
Log scaling of normalized data
Identification of top 1000 variable genes
Mean centering and variance 1 scaling of variable genes across all cells
Truncated SVD with 30 eigenvectors

