We tested the website with two example datasets. The first contains data from the Pasilla Bioconductor package (Brooks et al., 2010), taking into account only the gene-level counts. This dataset contains RNA-Seq count data for treated and untreated cells from the S2-DRSC cell line. The second example file, which can be used to test batch-effect awareness, was taken from the NBPSeq CRAN package (Di et al., 2014). This dataset contains the Arabidopsis thaliana RNA-Seq data (Cumbie et al., 2011), comparing ΔhrcC-challenged and mock-inoculated samples. In this case, the samples were collected in three batches.
We also obtained publicly available, previously reported RNA-Seq data (Olvera et al., 2017) that were generated to determine the effect of exogenous 3,5-di-iodothyronine (T2) and 3,5,3′-tri-iodothyronine (T3) treatment on the liver transcriptome of tilapia (Oreochromis niloticus). Two biological replicates were generated for the control and for each hormone treatment. The raw FASTQ data are available under the following SRA identifiers: SRX2630485, SRX2630486, SRX2630487, SRX2630488, SRX2630489, and SRX2630490.
Briefly, quality control (QC) and filtering of the raw data were performed using the FastQC software (Babraham Bioinformatics: FastQC, a quality control tool for high throughput sequence data), and contamination and adapter removal were carried out using in-house Perl scripts. Reads that passed QC were mapped with the Bowtie 1.1.234 aligner (Langmead et al., 2009) to the annotated Oreochromis_niloticus CDS dataset (Orenil1.0.cds.all, 21,437 coding genes), which was downloaded from the Ensembl database (Aken et al., 2016) using the BioMart utility. Quantification and repetitiveness normalization were carried out using eXpress software 1.535 (Roberts et al., 2011). The total effective counts for each sample were merged into a single matrix using the “abundance_estimates_to_matrix.pl” Perl script included in the Trinity pipeline (Grabherr et al., 2011; Roberts et al., 2011). The resulting matrix was used as input for the differential expression analysis in the IDEAMEX web server. The selected parameters were: p-adj/FDR = 0.05; logFC = 2; CPM = 1.
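The merging and thresholding steps described above can be illustrated with a minimal Python sketch. It is not the Trinity script or the IDEAMEX implementation: the sample names, gene names, and counts are invented toy values, the `min_samples` argument is a hypothetical addition (the text only states CPM = 1), and treating the cutoffs as inclusive and applied to the absolute logFC is an assumption.

```python
from typing import Dict, List

Matrix = Dict[str, List[float]]  # gene -> one value per sample, in a fixed sample order


def merge_counts(samples: Dict[str, Dict[str, float]]) -> Matrix:
    """Merge per-sample {gene: effective count} tables into one gene-by-sample matrix.

    Genes missing from a sample are filled with 0.
    """
    names = list(samples)
    genes = sorted({gene for table in samples.values() for gene in table})
    return {g: [float(samples[s].get(g, 0.0)) for s in names] for g in genes}


def counts_per_million(matrix: Matrix) -> Matrix:
    """Scale each sample (column) by its library size to obtain CPM values."""
    n_samples = len(next(iter(matrix.values())))
    lib_sizes = [sum(row[i] for row in matrix.values()) for i in range(n_samples)]
    return {
        g: [1e6 * c / lib if lib > 0 else 0.0 for c, lib in zip(row, lib_sizes)]
        for g, row in matrix.items()
    }


def keep_expressed(matrix: Matrix, min_cpm: float = 1.0, min_samples: int = 2) -> Matrix:
    """Keep genes whose CPM reaches min_cpm in at least min_samples samples.

    min_samples is an assumption made for this sketch, not a value from the text.
    """
    cpm = counts_per_million(matrix)
    return {
        g: row
        for g, row in matrix.items()
        if sum(value >= min_cpm for value in cpm[g]) >= min_samples
    }


def is_significant(log_fc: float, padj: float,
                   lfc_cut: float = 2.0, padj_cut: float = 0.05) -> bool:
    """Apply the reported cutoffs (logFC = 2, p-adj/FDR = 0.05).

    Assumes inclusive comparisons on the absolute logFC.
    """
    return abs(log_fc) >= lfc_cut and padj <= padj_cut


# Toy input: two hypothetical samples, each with a library size of 2,000,000.
toy_samples = {
    "control_1": {"geneA": 100, "geneB": 1, "geneC": 1_999_899},
    "T3_1": {"geneA": 400, "geneC": 1_999_600},  # geneB absent, so it is filled with 0
}

merged = merge_counts(toy_samples)
filtered = keep_expressed(merged, min_cpm=1.0, min_samples=2)
print(sorted(filtered))
```

The differential expression statistics themselves (logFC and adjusted p-values) would come from the methods run by the web server; `is_significant` only shows how the two cutoffs combine once those values are available.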