RNA from sorted cells was purified (QIAGEN) and hybridized to Affymetrix mouse 430 2.0 arrays (Memorial Sloan-Kettering Cancer Center, Genomics Core Facility). Raw data (CEL files) were normalized by robust multi-array average (RMA) using the affy R package. Principal component analysis was performed using the arrayQualityMetrics R package. Differential expression analysis was performed between each pair of subsets using the limma R package (adjusted P-value < 0.05 and fold-change > 1.5). Gene Set Enrichment Analysis (GSEA) was run for each cell subset in pre-ranked list mode with 1,000 permutations (nominal P-value < 0.01). As gene sets for the GSEA analyses, we used Reactome pathways (http://www.reactome.org/); the MSigDB gene sets related to haematopoietic stem cells34; and gene signatures associated with TFH or TH1 CD4+ T cells33, memory precursor/terminal effector CD8+ T cells35 and thymic innate TFH-like CD4+ T cells36. To define these signatures, we downloaded the microarray data from the GEO database (GSE16697, GSE8678 and GSE64779); collapsed probe sets that matched the same gene symbol by keeping the one with the highest expression across all samples; removed the genes in the lowest 30% of mean expression; and performed differential expression analysis between the two classes using limma (adjusted P-value < 0.01 and fold-change > 2). Enrichment scores were visualized using the corrplot package in R. Enrichment scores of Reactome pathways and the genes shared by two pathways were represented as nodes and links, respectively, using Cytoscape software. The microarray data are available in the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo) under the accession number GSE84105.
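
The two preprocessing steps used to define the published gene signatures (collapsing probe sets to gene symbols and removing lowly expressed genes) can be illustrated with a minimal Python sketch. This is not the code used in the study, which relied on R packages. The function names, probe identifiers, gene symbols and expression values are hypothetical, and "highest expression across all samples" is assumed here to mean the highest mean expression. The limma differential expression step is not reproduced.

```python
from statistics import mean


def collapse_probe_sets(expression, probe_to_gene):
    """For each gene symbol, keep the probe set with the highest mean expression.

    Assumption: "highest expression across all samples" is read as highest mean.
    """
    best = {}  # gene symbol -> (mean expression, per-sample values)
    for probe, values in expression.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # probe sets without a gene symbol are dropped
        avg = mean(values)
        if gene not in best or avg > best[gene][0]:
            best[gene] = (avg, values)
    return {gene: values for gene, (_, values) in best.items()}


def drop_lowest_expressed(gene_expression, fraction=0.30):
    """Remove the given fraction of genes with the lowest mean expression."""
    ranked = sorted(gene_expression, key=lambda g: mean(gene_expression[g]))
    n_drop = int(len(ranked) * fraction)  # rounded down
    kept = set(ranked[n_drop:])
    return {g: v for g, v in gene_expression.items() if g in kept}


# Hypothetical toy data: log2 expression for six probe sets in four samples.
expression = {
    "probe_1": [8.0, 8.2, 7.9, 8.1],
    "probe_2": [5.0, 5.1, 4.9, 5.0],  # same gene as probe_1, lower signal
    "probe_3": [3.0, 3.1, 2.9, 3.0],
    "probe_4": [9.5, 9.4, 9.6, 9.5],
    "probe_5": [6.0, 6.2, 6.1, 5.9],
    "probe_6": [4.0, 4.1, 3.9, 4.0],
}
probe_to_gene = {
    "probe_1": "GeneA",
    "probe_2": "GeneA",
    "probe_3": "GeneB",
    "probe_4": "GeneC",
    "probe_5": "GeneD",
    "probe_6": "GeneE",
}

collapsed = collapse_probe_sets(expression, probe_to_gene)
filtered = drop_lowest_expressed(collapsed, fraction=0.30)
print(sorted(filtered))
```

In this toy example the five collapsed genes are reduced to four, because 30% of five rounds down to one gene removed. How the cutoff is rounded for real data is a design choice that the text does not specify.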