We used FastQC v0.11.8 (FastQC, RRID:SCR_014583) [12] to assess overall sequencing quality for the MGI and Illumina sequencing platforms. PCR duplicates were detected with PRINSEQ v0.20.4 (PRINSEQ, RRID:SCR_005454) [26]; two paired-end (PE) reads were considered duplicates when both their forward reads and their reverse reads were identical. The random sequencing error rate was calculated by measuring the occurrence of “N” bases at each read position in the raw reads. Reads with sequencing adapter contamination were identified using the manufacturers' adapter sequences (Illumina sequencing adapter left = “GATCGGAAGAGCACACGTCTGAACTCCAGTCAC”, Illumina sequencing adapter right = “GATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT”, MGI sequencing adapter left = “AAGTCGGAGGCCAAGCGGTCTTAGGAAGACAA”, and MGI sequencing adapter right = “AAGTCGGATCGTAGCCATGTCGTTCTGTGAGCCAAGGAGTTG”). We filtered raw reads by base quality using the NGS QC Toolkit v2.3.3 (cut-off read length for high quality, 70; cut-off quality score, 20) (NGS QC Toolkit, RRID:SCR_005461) [27]. After removing low-quality reads and adapter-containing reads, we used the remaining clean reads for the mapping step.
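The three read-level checks described above (per-position “N” rate, exact PE duplicates, and adapter contamination) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study, which relied on PRINSEQ and the NGS QC Toolkit. The function names are hypothetical, reads are assumed to be plain sequence strings already parsed from FASTQ, and adapter contamination is assumed to mean an exact match to the full adapter sequence, a criterion the text does not specify.

```python
from collections import Counter

# Adapter sequences as quoted in the text above.
ADAPTERS = {
    "illumina_left": "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC",
    "illumina_right": "GATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT",
    "mgi_left": "AAGTCGGAGGCCAAGCGGTCTTAGGAAGACAA",
    "mgi_right": "AAGTCGGATCGTAGCCATGTCGTTCTGTGAGCCAAGGAGTTG",
}


def n_rate_per_position(reads):
    """Return, for each 0-based read position, the fraction of reads with an 'N' there.

    Reads may differ in length; each position is normalised by the number
    of reads long enough to cover it.
    """
    max_len = max((len(r) for r in reads), default=0)
    n_counts = [0] * max_len
    coverage = [0] * max_len
    for read in reads:
        for i, base in enumerate(read.upper()):
            coverage[i] += 1
            if base == "N":
                n_counts[i] += 1
    return [n / c for n, c in zip(n_counts, coverage)]


def count_duplicate_pairs(pairs):
    """Count read pairs that are exact copies of another pair.

    A pair is a (forward, reverse) tuple; both mates must be identical
    for two pairs to count as duplicates. The first copy is not counted.
    """
    counts = Counter(pairs)
    return sum(c - 1 for c in counts.values())


def has_adapter(read, adapters):
    """Return True if the read contains any full adapter sequence.

    Assumption: exact substring matching only; partial adapters at the
    read end and mismatches are not detected.
    """
    read = read.upper()
    return any(adapter in read for adapter in adapters)


if __name__ == "__main__":
    reads = ["ACGN", "ANGT", "ACG"]
    print(n_rate_per_position(reads))

    pairs = [("ACGT", "TTGA"), ("ACGT", "TTGA"), ("ACGT", "CCCC")]
    print(count_duplicate_pairs(pairs))

    contaminated = "TTT" + ADAPTERS["mgi_left"] + "AAA"
    print(has_adapter(contaminated, ADAPTERS.values()))
```

Exact substring matching keeps the sketch short; dedicated trimming tools typically also handle partial adapters at read ends and allow mismatches.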