Total RNA was extracted from six biological replicates using the QIAGEN RNeasy Mini Preparation Kit (QIAGEN Ltd, Crawley, UK). Library preparation with the NEBNext® Ultra™ RNA Library Prep Kit and 150 bp paired-end sequencing on the Illumina NovaSeq 6000 platform were carried out at the Beijing Genomics Institute. Within a bcbio 1.1.1 RNA-seq pipeline, reads were aligned with HISAT2 2.1.0 [67], counted with DEXSeq [68] and Salmon 0.11.3 [69], and normalised with the R package DESeq2 [70]. DESeq2 was also used to identify genes differentially expressed between WT and mutant cell lines, based on shrunken log2 fold changes. Heatmaps were generated with the gplots R package. Gene set enrichment analysis (GSEA) was performed against the Molecular Signatures Database ‘Hallmarks’ gene set collection [71]. Venn diagrams were created with jvenn [72]. Grouping of knock-in clones was visualised by t-Distributed Stochastic Neighbour Embedding (t-SNE) dimensionality reduction, with regularised log2-transformed (rlog) read counts as input. The t-SNE model was run with the ‘Rtsne’ R package, using a perplexity of 30 and a theta of 0.5, with all other parameters left at their default settings. RNA-seq data have been deposited in the NCBI Gene Expression Omnibus (GEO) (http://ncbi.nlm.nih.gov/geo/) under accession number GSE147745.
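For readers unfamiliar with the normalisation step, the following is a minimal sketch of the median-of-ratios size-factor idea that DESeq2 uses by default. It is written in plain Python for illustration only and is not the code run in this study, which used the DESeq2 R package. The function names `size_factors` and `normalise` and the toy count matrix are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix.

    `counts` is a list of rows, one row per gene, one column per sample.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        # A gene with a zero count in any sample has a zero geometric mean,
        # so it cannot serve as a reference and is skipped.
        if any(c <= 0 for c in gene):
            continue
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has non-zero counts in every sample")
    # Size factor per sample: exp of the median log ratio to the reference.
    return [math.exp(median(r)) for r in log_ratios]


def normalise(counts):
    """Divide each sample's counts by that sample's size factor."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(gene, factors)] for gene in counts]


# Hypothetical toy data: 4 genes x 3 samples. Sample 2 has twice the
# sequencing depth of samples 1 and 3; the last gene contains a zero.
toy_counts = [
    [10, 20, 10],
    [100, 200, 100],
    [50, 100, 50],
    [0, 5, 3],
]
factors = size_factors(toy_counts)
normalised = normalise(toy_counts)
```

In this toy case the size factor of the second sample is twice that of the others, so dividing by the factors puts the three samples on a common scale. Dispersion estimation and fold-change shrinkage, which DESeq2 performs afterwards, are not covered by this sketch.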