Reads were mapped to the genome with the MATS pipeline (Shen et al. 2012), which uses TopHat (Trapnell et al. 2009) and an input annotation to map the reads. Reads mapping to de novo splice junctions were allowed, and reads mapping to more than one genomic position were filtered out. For benchmarking, each comparison used the same annotation (RefSeq, Ensembl, or de novo Cufflinks) for read mapping to the genome as for transcript quantification. The mapping pipeline was run on the simulated and the real RNA-seq reads. The mapped reads from each data set were used with MATS to obtain ΨMATS values for the different alternative splicing events (Supplemental Table 5). Similarly, mapped reads in SAM format were converted to BAM format, sorted with Samtools (Li et al. 2009), and analyzed with MISO (Katz et al. 2010) to calculate the ΨMISO values for each data set (Supplemental Table 6).
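The removal of multi-mapping reads can be illustrated with a minimal Python sketch. It assumes that the aligner writes the optional `NH:i:` tag (number of reported alignments) on each SAM record, as TopHat does; the text does not state how the filter was actually implemented, and the function names `is_uniquely_mapped` and `filter_unique` are hypothetical.

```python
def is_uniquely_mapped(sam_line):
    """Return True if a SAM alignment record reports exactly one hit (NH:i:1)."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:  # bit 0x4 of the SAM FLAG marks an unmapped read
        return False
    for tag in fields[11:]:  # optional tags start at the 12th column
        if tag.startswith("NH:i:"):
            return int(tag[5:]) == 1
    # Assumption: a record without an NH tag is conservatively discarded.
    return False


def filter_unique(sam_lines):
    """Yield header lines unchanged and only uniquely mapped alignment records."""
    for line in sam_lines:
        if line.startswith("@") or is_uniquely_mapped(line):
            yield line
```

The generator keeps header lines so that the output remains a valid SAM stream for the later conversion to BAM.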
Sailfish (Patro et al. 2014) and RSEM (Li and Dewey 2011) were used to quantify all transcripts in the Ensembl and RefSeq annotations using the simulated and the real RNA-seq data sets. The FASTA sequences of the transcripts in the same annotation as the GTF described earlier were downloaded and used to generate the Sailfish index, with a k-mer size of 31 chosen to minimize the number of reads assigned to multiple transcripts. Sailfish was then run on the FASTQ files of each read set, and both uncorrected TPMs and TPMs corrected for sequence composition bias and transcript length were calculated (Patro et al. 2014). RSEM was run as described previously (Li and Dewey 2011). The psiPerEvent operation of SUPPA was used to calculate the ΨSailfish and ΨRSEM values from the transcript quantifications obtained with Sailfish and RSEM, respectively, for the alternative splicing events generated before, using the simulated and real data sets. The number of events for which SUPPA estimated a ΨSailfish or ΨRSEM value is given in Supplemental Tables 7 and 8. For benchmarking, the PSI values obtained from SUPPA (ΨSailfish and ΨRSEM), MATS (ΨMATS), and MISO (ΨMISO) for the events identified by all methods in each experiment were compared with the simulated or the experimental values. The commands used to run the different analyses are provided in Supplemental Tables 9–12. The alternative splicing events used in each of the comparisons can be found in Supplemental Data file 1 (synthetic data with RefSeq), file 2 (experimental data with RefSeq), file 3 (experimental data with Ensembl), file 4 (experimental data with Ensembl CDS), and file 5 (experimental data with Cufflinks), which are available at https://bitbucket.org/regulatorygenomicsupf/suppa/downloads/Supplementary_Data.zip
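The calculation of Ψ from transcript quantifications, and the restriction of the benchmark to events identified by all methods, can be sketched in Python as follows. This is an illustration and not the SUPPA source code: it assumes that Ψ for an event is the summed TPM of the transcripts carrying the inclusion form divided by the summed TPM of all transcripts that define the event, and the names `psi_per_event` and `common_events` are hypothetical.

```python
def psi_per_event(inclusion_ids, all_ids, tpm):
    """Return PSI for one event from transcript TPMs, or None if undefined.

    inclusion_ids: transcripts that contain the inclusion form of the event
    all_ids:       all transcripts that define the event (both forms)
    tpm:           dict mapping transcript ID to its TPM value
    """
    included = sum(tpm.get(t, 0.0) for t in inclusion_ids)
    total = sum(tpm.get(t, 0.0) for t in all_ids)
    if total == 0:
        return None  # no expression from either form: PSI cannot be estimated
    return included / total


def common_events(*psi_tables):
    """Return the event IDs that have a PSI value in every table.

    Each table is a dict mapping event ID to PSI (or None if not estimated).
    """
    shared = set(psi_tables[0])
    for table in psi_tables:
        shared &= {event for event, psi in table.items() if psi is not None}
    return shared


# Toy values, not data from the study.
tpm = {"tx1": 30.0, "tx2": 10.0, "tx3": 5.0}
print(psi_per_event({"tx1"}, {"tx1", "tx2"}, tpm))  # 0.75
```

Returning `None` when neither form is expressed keeps such events out of the set returned by `common_events`, so the comparison only covers events that every method could estimate.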