Genomic DNA library preparation was performed using a modified version of the Illumina TruSeq DNA Sample Preparation protocol. A MiSeq sequencing run was then performed with a paired-end read length of 301 bp. The raw FASTQ files were processed with cutadapt (version 2.5) using the following arguments: --overlap 10 -m 30 -q 20,20 --quality-base 33. Reads were assembled using SPAdes (version 3.13.0, executed with default parameters except -k 21,33,55,77,99,127 --meta -t 44) (Bankevich et al., 2012). The SPAdes contig FASTA file was processed using the R package RKXM1, and the chromosomal genome was manually binned in the GC-coverage plane. Genome quality statistics were again obtained using CheckM (version 1.0.11) (Parks et al., 2015). The concordance statistic between contigs in the short-read assembly and the long-read assembled chromosome was computed using the R package srac2lrac (Arumugam et al., 2019, 2021). Coverage profiles of the short-read data against the long-read assembly were obtained using the same methods described in the immediately preceding section, with minimap2 settings -ax sr -a -t 20.
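As a reading aid, the command-line settings reported above can be collected into a short Python sketch that only builds the argument lists and does not run any tool. All file and directory names are hypothetical placeholders, and the paired-end input/output options (-o/-p for cutadapt, -1/-2/-o for SPAdes) are assumptions about how the reads were supplied; the source gives only the tuning arguments and does not list adapter sequences, so none are included here.

```python
import shlex

# NOTE: all paths passed to these helpers are hypothetical placeholders.
# Only the tuning flags are taken from the methods text; the input/output
# options are assumptions about a standard paired-end invocation.


def cutadapt_cmd(r1, r2, out1, out2):
    """Build a cutadapt call with the reported trimming arguments."""
    return [
        "cutadapt",
        "--overlap", "10",       # reported in the text
        "-m", "30",              # discard reads shorter than 30 nt
        "-q", "20,20",           # quality-trim 5' and 3' ends at Q20
        "--quality-base", "33",  # Phred+33 encoding
        "-o", out1, "-p", out2,  # assumed paired-end outputs
        r1, r2,
    ]


def spades_cmd(r1, r2, outdir):
    """Build a SPAdes call with the reported non-default arguments."""
    return [
        "spades.py",
        "-k", "21,33,55,77,99,127",
        "--meta",
        "-t", "44",
        "-1", r1, "-2", r2,      # assumed paired-end inputs
        "-o", outdir,
    ]


def minimap2_cmd(reference, r1, r2):
    """Build a minimap2 short-read mapping call with the reported settings."""
    return ["minimap2", "-ax", "sr", "-a", "-t", "20", reference, r1, r2]


if __name__ == "__main__":
    commands = (
        cutadapt_cmd("raw_R1.fastq", "raw_R2.fastq",
                     "trim_R1.fastq", "trim_R2.fastq"),
        spades_cmd("trim_R1.fastq", "trim_R2.fastq", "spades_out"),
        minimap2_cmd("long_read_assembly.fasta",
                     "trim_R1.fastq", "trim_R2.fastq"),
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch independent of any local installation; the downstream steps (RKXM1 processing, manual binning, CheckM, srac2lrac) are not sketched because the source does not give their settings.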