A bioinformatics pipeline was run on the Puhti supercomputer cluster of CSC (Espoo, Finland) to quantify gene expression in the samples from RNA sequencing data. Paired-end RNA-seq reads from 378 brain samples were downloaded from the Sequence Read Archive in FASTQ format with the SRA Toolkit (v2.10.8). Low-quality ends (Phred score < 20) and Illumina Universal Adapters were trimmed with TrimGalore (v0.6.4; https://github.com/FelixKrueger/TrimGalore; accessed 10.5.2021). Further quality filtering was performed with PRINSEQ (lite v0.20.4) [11] using the following criteria: read length ≥ 50 nucleotides, mean read quality score ≥ 25, proportion of ambiguous bases ≤ 1%, removal of all types of duplicates, and DUST low-complexity score ≤ 7. The outcome of quality filtering was confirmed with FastQC (v0.11.8; https://www.bioinformatics.babraham.ac.uk/projects/fastqc; accessed 10.5.2021). Reads that passed quality filtering were aligned with STAR (v2.7.1a) [12] against the human reference genome (GCF_000001405.26_GRCh38_genomic.fna from NCBI).
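To make the per-read filtering criteria concrete, the following is a minimal Python sketch of three of the thresholds listed above (read length, mean quality, and proportion of ambiguous bases). It is an illustration only and not the PRINSEQ implementation: the names `FastqRecord`, `mean_phred`, and `passes_filters` are hypothetical, Phred+33 quality encoding is assumed, ambiguous bases are counted as `N` only, and duplicate removal and the DUST complexity score are omitted.

```python
from typing import NamedTuple


class FastqRecord(NamedTuple):
    """Hypothetical container for one FASTQ read (not a PRINSEQ type)."""
    name: str
    sequence: str
    quality: str  # quality string, assumed to be Phred+33 encoded


def mean_phred(quality: str, offset: int = 33) -> float:
    """Return the mean Phred score of a non-empty quality string."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def passes_filters(
    rec: FastqRecord,
    min_length: int = 50,             # read length >= 50 nucleotides
    min_mean_quality: float = 25.0,   # mean quality score >= 25
    max_ambiguous_pct: float = 1.0,   # ambiguous bases <= 1%
) -> bool:
    """Apply the three per-read thresholds described in the text."""
    # The length check comes first, so empty reads never reach a division.
    if len(rec.sequence) < min_length:
        return False
    if mean_phred(rec.quality) < min_mean_quality:
        return False
    # Assumption: only 'N' is treated as an ambiguous base here.
    n_pct = 100.0 * rec.sequence.upper().count("N") / len(rec.sequence)
    return n_pct <= max_ambiguous_pct
```

In a paired-end setting, a pair would typically be kept only if both mates pass; how pairs with a single failing mate were handled is not stated in the text, so that choice is left open here.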