Prior to transcriptome assembly, Illumina's BaseSpace pipeline was used to de-multiplex the sequencing reads and retain those of high quality. Reads were then quality-filtered in Trimmomatic v0.39 [64], which removed adapter sequences, leading and trailing low-quality bases (Phred quality < 3), read regions whose mean quality within a 4-base sliding window fell below 25, and reads shorter than 20 bases. The quality of the FASTQ sample files before and after trimming was checked using FastQC v0.11.9 [65]. Quality-filtered reads were assembled de novo into contigs using Trinity v2.11.0 [66] with the following parameters: k-mer size = 25, minimum k-mer coverage = 1, minimum contig length = 200, pair distance = 500 and maximum number of reads per graph = 200,000. The quality of the assembled transcriptome was evaluated by aligning the reads back onto the transcriptome using Bowtie2 v2.4.2 [67]. Coding regions encoding a minimum of 30 amino acids were then predicted in the transcripts using TransDecoder v5.5.0 [68]. These coding regions were annotated by BLAST searches [69] against the NCBI-nr database (November 2020; restricted to Serpentes, taxid: 8570).
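The quality-trimming thresholds described above can be illustrated with a minimal Python sketch. It is a simplified stand-in for the logic of the trimming steps, not Trimmomatic's own implementation, and it omits adapter clipping. The function name `trim_read`, its argument names and the order in which the steps are applied (leading, trailing, sliding window, minimum length) are assumptions made for illustration; only the threshold values (3, 4 bases, 25 and 20) come from the text.

```python
from typing import List, Optional, Tuple


def trim_read(
    seq: str,
    quals: List[int],
    leading: int = 3,             # drop 5' bases with quality below this
    trailing: int = 3,            # drop 3' bases with quality below this
    window: int = 4,              # sliding-window width in bases
    window_min_avg: float = 25.0, # minimum mean quality per window
    min_len: int = 20,            # discard reads shorter than this
) -> Optional[Tuple[str, List[int]]]:
    """Return the trimmed (sequence, qualities), or None if the read is dropped.

    Hypothetical helper: a simplified illustration of the thresholds,
    applied in an assumed order.
    """
    # Remove low-quality bases from the start of the read.
    start = 0
    while start < len(quals) and quals[start] < leading:
        start += 1

    # Remove low-quality bases from the end of the read.
    end = len(quals)
    while end > start and quals[end - 1] < trailing:
        end -= 1

    seq, quals = seq[start:end], quals[start:end]

    # Scan from the 5' end and cut at the first window whose mean
    # quality falls below the threshold.
    cut = len(quals)
    for i in range(max(len(quals) - window + 1, 0)):
        if sum(quals[i:i + window]) / window < window_min_avg:
            cut = i
            break

    seq, quals = seq[:cut], quals[:cut]

    # Discard reads that are too short after trimming.
    if len(seq) < min_len:
        return None
    return seq, quals


# Example read: two poor leading bases, a good core, a degraded 3' tail.
example_quals = [2, 2] + [35] * 22 + [10] * 6
trimmed = trim_read("A" * 30, example_quals)
```

In this example the two leading bases are removed by the leading-quality step, the degraded tail is cut by the sliding window, and the remaining 20 bases just meet the minimum length, so the read is kept.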