Data quality control and assembly were performed using an established pipeline [28]. Briefly, raw reads were trimmed using Trimmomatic (version 0.39) with default parameter settings [29]. Assemblies were generated with Unicycler (v0.3.0b) [30] and checked against standard quality parameters using QUAST (version 5.0.2) [31]. Genomes that met all of the following quality criteria were included in the downstream analyses: estimated mean read coverage of ≥30×, assembly length of 1.6 Mb ± 160 kb, and contig count of ≤200. The mean assembly length was 1 642 135 bp, with a mean read coverage of 98.2-fold (range 30.2- to 269.4-fold). The mean G+C content was 30.85 mol% (range 30.06 to 31.13 mol%). A long-read assembly of strain N18-1277 was generated using Flye v2.8.1 [32], long-read polished with Medaka v1.5.0 (https://github.com/nanoporetech/medaka), and short-read polished with Polypolish v0.5.0 [33] and POLCA as implemented in MaSuRCA v4.0.4 [34].
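The three inclusion criteria can be expressed as a simple filter. The following Python sketch is illustrative only and is not the pipeline's own code: the `AssemblyStats` record, its field names, and the `passes_qc` function are hypothetical, and treating the ±160 kb length window as inclusive at its bounds is an assumption. Only the threshold values are taken from the text.

```python
from dataclasses import dataclass

# Thresholds as stated in the text.
MIN_COVERAGE = 30.0            # estimated mean read coverage, fold
TARGET_LENGTH_BP = 1_600_000   # expected assembly length, 1.6 Mb
LENGTH_TOLERANCE_BP = 160_000  # allowed deviation, 160 kb
MAX_CONTIGS = 200              # maximum contig count


@dataclass
class AssemblyStats:
    """Hypothetical per-genome summary (e.g. collected from assembly reports)."""
    name: str
    mean_coverage: float
    total_length_bp: int
    contig_count: int


def passes_qc(stats: AssemblyStats) -> bool:
    """Return True only if the assembly meets all three criteria."""
    coverage_ok = stats.mean_coverage >= MIN_COVERAGE
    # Assumption: the length window is inclusive at both bounds.
    length_ok = abs(stats.total_length_bp - TARGET_LENGTH_BP) <= LENGTH_TOLERANCE_BP
    contigs_ok = stats.contig_count <= MAX_CONTIGS
    return coverage_ok and length_ok and contigs_ok


if __name__ == "__main__":
    # Made-up example records, not data from the study.
    candidates = [
        AssemblyStats("example_a", 98.2, 1_642_135, 50),
        AssemblyStats("example_b", 25.0, 1_640_000, 40),
        AssemblyStats("example_c", 80.0, 1_800_000, 60),
    ]
    kept = [s.name for s in candidates if passes_qc(s)]
    print(kept)
```

A genome is retained only when all three checks hold, so a single failing criterion (low coverage, out-of-range length, or a fragmented assembly) is enough to exclude it.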