Raw sequencing data (FASTQ files) were mapped to the human genome (build GRCh38) using Cell Ranger software (10x Genomics, version 3.0.2). The raw gene expression matrices generated for each sample were analyzed with the Seurat package in R (39). To obtain clean cell clusters, we divided cell filtering into two major steps: primary clustering and fine adjustment.
Primary clustering: the 5 samples (CAPS_NT, CAPS_TNFi, TRAPS_NT, TRAPS_TNFi, HD) were merged, and cells were retained if nFeature_RNA (number of genes detected) > 200 and percent.mt (percentage of counts from mitochondrial genes) < 12.5%. Highly variable genes were selected with FindVariableFeatures and scaled with ScaleData using default parameters, and principal component analysis (PCA) was performed on all datasets with RunPCA using default parameters. Batch effects between samples were corrected in PCA space using the Harmony algorithm (21), followed by unsupervised clustering with the FindNeighbors and FindClusters functions in Seurat (dims = 30, resolution = 0.5). In total, 20 clusters were found. Because plasma cells were not correctly identified by unsupervised clustering, they were annotated manually.
Fine adjustment: the scDblFinder package was used to predict potential doublets in the datasets (40). Because neutrophils and platelets naturally have far fewer transcripts than other cell types, and DCs are often misclassified as doublets, we divided all cell types into 3 groups and used different filter criteria for each group.
Group 1 (Naïve CD4, Naïve CD8, Memory CD4, Memory CD8, Effector CD8, CD16+ NK, CD16- NK, CD14 Mono, Naïve B, and Memory B): cells were retained if nFeature_RNA > 1200 and scDblFinder classified them as singlets.
Group 2 (Neutrophil and Platelet): cells were retained if nFeature_RNA < 800 and scDblFinder classified them as singlets.
Group 3 (FCGR3A Mono, DC, and pDC): cells were retained if nFeature_RNA > 1200, regardless of the scDblFinder prediction.
In addition, 1 RBC cluster, 3 doublet clusters, and 1 mitotic cluster were removed because they were not informative. All plasma cells were kept manually. After filtering, we reran the primary clustering pipeline described above with slightly changed parameters (dims = 20, resolution = 0.4). The final clustering results are shown in Figure 1B.
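The filtering rules described above can be summarized as a small decision function. The following is a minimal sketch in plain Python, not the code used in the study (which ran in R with Seurat and scDblFinder). The thresholds and cell type names are taken from the text; the `Cell` record, the function names, the "Plasma" label, and the choice to drop any cell type not listed in a group are assumptions made for illustration.

```python
from dataclasses import dataclass

# Cell type groups as listed in the text.
GROUP1 = {"Naïve CD4", "Naïve CD8", "Memory CD4", "Memory CD8", "Effector CD8",
          "CD16+ NK", "CD16- NK", "CD14 Mono", "Naïve B", "Memory B"}
GROUP2 = {"Neutrophil", "Platelet"}
GROUP3 = {"FCGR3A Mono", "DC", "pDC"}


@dataclass
class Cell:
    """Hypothetical per-cell record; field names are illustrative."""
    cell_type: str       # annotated cluster label
    n_feature_rna: int   # number of genes detected (nFeature_RNA)
    percent_mt: float    # percent.mt
    is_singlet: bool     # True if scDblFinder called the cell a singlet


def passes_primary_qc(cell: Cell) -> bool:
    """Primary clustering filter: nFeature_RNA > 200 and percent.mt < 12.5."""
    return cell.n_feature_rna > 200 and cell.percent_mt < 12.5


def passes_fine_filter(cell: Cell) -> bool:
    """Fine adjustment filter with group-specific criteria."""
    if cell.cell_type == "Plasma":   # assumed label; plasma cells were all kept
        return True
    if cell.cell_type in GROUP1:
        return cell.n_feature_rna > 1200 and cell.is_singlet
    if cell.cell_type in GROUP2:
        return cell.n_feature_rna < 800 and cell.is_singlet
    if cell.cell_type in GROUP3:     # doublet prediction is ignored here
        return cell.n_feature_rna > 1200
    # Assumption: anything else (e.g. RBC, doublet, mitotic clusters) is removed.
    return False


if __name__ == "__main__":
    example = Cell("DC", n_feature_rna=1500, percent_mt=4.0, is_singlet=False)
    print(passes_primary_qc(example), passes_fine_filter(example))
```

In this sketch a DC flagged as a doublet is still kept as long as it has more than 1200 detected genes, which mirrors the Group 3 rule, whereas the same cell labeled as a Group 1 type would be discarded.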