The generation of the RNA-Seq data in the Accelerating Medicines Partnership for Alzheimer’s Disease (AMP-AD) Consortium, together with the associated demographic information, has been described in detail previously [33]. Briefly, RNA-Seq data from three AMP-AD cohorts were downloaded through the Synapse database (https://www.synapse.org/): the Mayo Clinic Brain Bank (Mayo Clinic) [34], the Mount Sinai Medical Center Brain Bank (MSBB) [35], and the Religious Orders Study and Memory and Aging Project (ROS/MAP) [36].
In the Mayo Clinic cohort, RNA-Seq data were generated from the temporal cortex and cerebellum. In the MSBB cohort, RNA-Seq data were generated from the parahippocampal gyrus, inferior frontal gyrus, superior temporal gyrus, and frontal pole. In the ROS/MAP cohort, RNA-Seq data were generated from the dorsolateral prefrontal cortex. The procedures for sample collection, post-mortem sample descriptions, tissue and RNA preparation, library preparation and sequencing, and sample quality control have been described in detail previously [34–39]. We converted each mapped BAM file into a FASTQ file using samtools v.1.9 and then re-mapped the converted FASTQ files onto the hg19 human reference genome using STAR aligner v.2.5, as previously described in detail [40]. Using the processed RNA-Seq data, we identified TREM2 splice transcripts and calculated their expression levels. We used the software tool RSEM to estimate the expression of the TREM2 transcripts from the RNA-Seq data [41]. RSEM generates reference sequences for the three TREM2 transcripts, and the RNA-Seq reads are mapped to them. After read alignment, RSEM uses a statistical model to calculate transcript abundances, obtaining maximum likelihood (ML) estimates with the expectation-maximization (EM) algorithm. In addition, by using paired-end reads to distinguish the isoforms, RSEM improves the estimation of relative isoform levels within a single gene. Together, the statistical model and the paired-end information allow RSEM to estimate transcript abundances from reads mapped to both the distinct and the shared regions of the three isoforms. Differential expression analysis of the TREM2 splice transcripts between cognitively normal controls and AD patients was performed using a generalized linear regression model [33]. The regression was fitted with the “glm” function of the stats package in R (version 3.6.1), with age and sex as covariates.
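The role of the EM algorithm in the RSEM step described earlier in this section can be illustrated with a deliberately simplified sketch. It assumes that each read is summarized only by the set of isoforms it is compatible with, and it ignores transcript length, fragment-length distributions, and base qualities, all of which RSEM's actual model takes into account. The function name `em_isoform_proportions` and the read counts are hypothetical and are not taken from the study.

```python
def em_isoform_proportions(read_compat, n_isoforms, n_iter=500, tol=1e-9):
    """Toy EM estimate of the fraction of reads originating from each isoform.

    read_compat: list of sets; each set holds the indices of the isoforms
                 a read is compatible with (unique or shared region).
    This is an illustrative simplification, not RSEM's full model.
    """
    theta = [1.0 / n_isoforms] * n_isoforms  # start from equal proportions
    for _ in range(n_iter):
        expected = [0.0] * n_isoforms
        # E-step: split each read across its compatible isoforms,
        # in proportion to the current abundance estimates.
        for isoforms in read_compat:
            total = sum(theta[i] for i in isoforms)
            for i in isoforms:
                expected[i] += theta[i] / total
        # M-step: new proportions are the normalized expected read counts.
        new_theta = [c / len(read_compat) for c in expected]
        converged = max(abs(a - b) for a, b in zip(theta, new_theta)) < tol
        theta = new_theta
        if converged:
            break
    return theta


# Hypothetical example with three isoforms: reads that hit isoform-specific
# regions, plus reads from a region shared by all three isoforms.
reads = [{0}] * 60 + [{1}] * 20 + [{2}] * 20 + [{0, 1, 2}] * 100
proportions = em_isoform_proportions(reads, n_isoforms=3)
print([round(p, 3) for p in proportions])
```

In this toy setting the shared reads are allocated according to the evidence from the isoform-specific reads, which is the intuition behind estimating abundances from both distinct and shared regions.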
We used the false discovery rate (FDR) to correct for multiple testing.
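The text does not state which FDR procedure was used. As a minimal sketch, assuming the Benjamini-Hochberg procedure (the method that R's `p.adjust` applies for `method = "fdr"`), adjusted p-values can be computed as follows. The function name `benjamini_hochberg` and the example p-values are hypothetical and are not results from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the FDR correction is Benjamini-Hochberg; the source text
    only says that FDR was used.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, e.g. one per transcript tested.
raw_p = [0.01, 0.04, 0.03]
adjusted_p = benjamini_hochberg(raw_p)
print([round(p, 4) for p in adjusted_p])
```

A transcript would then be called differentially expressed when its adjusted p-value falls below the chosen FDR threshold.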