Whole-genome sequencing (WGS) of SP strain C5347 was performed on PacBio Sequel and Illumina MiSeq (2 × 300 bp) platforms following phenol-chloroform DNA extraction, as previously described [44]. Raw PacBio reads were assembled with Canu [45] using default parameters and an estimated genome size of 3 Mb. The assembled contigs were then polished in two steps. First, raw Illumina reads were quality-trimmed with Trimmomatic [46] and aligned against the assembled PacBio contigs with Bowtie2 [47]. Second, the resulting BAM files were used to correct individual base errors, indels and local misassemblies with Pilon [48].
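The assembly and polishing steps could be scripted roughly as in the following sketch, which only builds the command lines and does not run them. Several details are assumptions not stated in the text: the file names and the `C5347` prefix, the Trimmomatic thresholds (`SLIDINGWINDOW:4:20`, `MINLEN:50`), the samtools sort and index steps used to obtain a BAM file, the availability of `trimmomatic` and `pilon` wrapper executables, and the `-pacbio-raw` option, which applies to Canu 1.x (newer releases use `-pacbio`).

```python
from pathlib import Path


def polishing_commands(pacbio_reads, illumina_r1, illumina_r2,
                       workdir="assembly", genome_size="3m"):
    """Return ordered command lines (argument lists) for assembly and polishing.

    All paths, the prefix and the trimming thresholds are illustrative
    assumptions; the source only states default Canu parameters and a
    3 Mb estimated genome size.
    """
    wd = Path(workdir)
    prefix = "C5347"                                   # hypothetical output prefix
    contigs = str(wd / "canu" / f"{prefix}.contigs.fasta")
    index = str(wd / "bowtie2" / prefix)
    p1, u1 = str(wd / "trim_R1.paired.fq"), str(wd / "trim_R1.unpaired.fq")
    p2, u2 = str(wd / "trim_R2.paired.fq"), str(wd / "trim_R2.unpaired.fq")
    sam = str(wd / f"{prefix}.sam")
    bam = str(wd / f"{prefix}.sorted.bam")

    return [
        # Long-read assembly with an estimated genome size of 3 Mb.
        ["canu", "-p", prefix, "-d", str(wd / "canu"),
         f"genomeSize={genome_size}", "-pacbio-raw", pacbio_reads],
        # Quality trimming of paired-end Illumina reads (thresholds assumed).
        ["trimmomatic", "PE", illumina_r1, illumina_r2, p1, u1, p2, u2,
         "SLIDINGWINDOW:4:20", "MINLEN:50"],
        # Alignment of trimmed reads against the PacBio contigs.
        ["bowtie2-build", contigs, index],
        ["bowtie2", "-x", index, "-1", p1, "-2", p2, "-S", sam],
        # Pilon expects a sorted, indexed BAM file (samtools use is assumed).
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # Polishing; Pilon's default fix list is left unchanged here.
        ["pilon", "--genome", contigs, "--frags", bam,
         "--output", f"{prefix}.polished", "--outdir", str(wd / "pilon")],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in order, so that a failing step stops the pipeline before the next tool is invoked.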
Protein-coding genes on the assembled contigs were predicted using Prodigal [49]. tRNA and rRNA genes were predicted using tRNAscan-SE [50], ssu-align [51] and meta-rna [52]. For taxonomic and functional annotation, predicted protein sequences were compared against the NCBI nr database using DIAMOND [53], and against the COG [54] and TIGRFAM [55] databases using HMMscan [56].
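A minimal sketch of the gene prediction and annotation steps is given below; like any such outline, it only assembles the command lines. The database paths, output file names, the bacterial search mode for tRNAscan-SE (`-B`) and the tabular DIAMOND output format are assumptions, and the COG and TIGRFAM arguments are taken to be HMM profile databases already prepared with `hmmpress`. The ssu-align and meta-rna steps are omitted because their invocation is not described in the text.

```python
def annotation_commands(contigs, nr_db, cog_hmm, tigrfam_hmm, prefix="C5347"):
    """Return command lines (argument lists) for gene prediction and annotation.

    All file names and database locations are hypothetical placeholders.
    """
    proteins = f"{prefix}.proteins.faa"
    return [
        # Protein-coding gene prediction; -a writes the translated proteins.
        ["prodigal", "-i", contigs, "-a", proteins,
         "-o", f"{prefix}.genes.gff", "-f", "gff"],
        # tRNA prediction (bacterial mode is an assumption).
        ["tRNAscan-SE", "-B", "-o", f"{prefix}.trna.txt", contigs],
        # Protein search against NCBI nr for taxonomic annotation.
        ["diamond", "blastp", "--db", nr_db, "--query", proteins,
         "--out", f"{prefix}.nr.tsv", "--outfmt", "6"],
        # Profile searches against COG and TIGRFAM for functional annotation.
        ["hmmscan", "--tblout", f"{prefix}.cog.tbl", cog_hmm, proteins],
        ["hmmscan", "--tblout", f"{prefix}.tigrfam.tbl", tigrfam_hmm, proteins],
    ]
```

Returning argument lists rather than shell strings avoids quoting problems when paths contain spaces, and the same protein file is reused as the query for all three searches.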