The genomic DNA of A02 was sequenced by combining next-generation sequencing (Illumina paired-end, 2 × 90 bp, 500-bp insert size) with SMRT sequencing (Pacific Biosciences RS) at the Wuhan Institute of Biotechnology (Wuhan, China). The long PacBio RS reads were filtered into subreads in the SMRT Portal, and only clean reads were included in the subsequent analyses. For assembly of the SMRT sequencing reads, the longest reads were first used as seeds to recruit all of the shorter reads, from which highly accurate preassembled reads were built through a consensus procedure with HGAP3 [36]. In this step, all of the reads were aligned to each of the seed reads using BLASR [37]. The resulting preassembled reads typically had read accuracies above 99%. Celera Assembler [38] was then used to assemble all of the clean reads to the preassembly, and Pilon [39] was applied to generate the best consensus sequence as the final genome sequence. Pilon was also the method used to correct the PacBio RSII assembly with the data from the Illumina MiSeq 2000.
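The seed-selection idea behind the preassembly step can be illustrated with a minimal Python sketch. This is not HGAP3's implementation: the function name `select_seed_reads`, the `seed_coverage` parameter, and the toy read lengths are all hypothetical. The sketch assumes that seeds are taken from the longest reads downward until they jointly reach a target coverage of the expected genome size.

```python
from typing import Dict, List, Tuple


def select_seed_reads(
    read_lengths: Dict[str, int],
    genome_size: int,
    seed_coverage: float = 30.0,  # hypothetical target depth for the seed set
) -> Tuple[List[str], int]:
    """Pick the longest reads as seeds until they reach the target coverage.

    Returns the seed read IDs and the length of the shortest seed, which
    acts as the seed length cutoff. Illustrative only; not HGAP3 itself.
    """
    target_bases = genome_size * seed_coverage
    seeds: List[str] = []
    total_bases = 0
    cutoff = 0
    # Walk the reads from longest to shortest.
    by_length = sorted(read_lengths.items(), key=lambda kv: kv[1], reverse=True)
    for read_id, length in by_length:
        if total_bases >= target_bases:
            break  # enough seed bases collected
        seeds.append(read_id)
        total_bases += length
        cutoff = length
    return seeds, cutoff


# Toy data (made up): five reads and a 10-kb "genome" at 2x seed coverage.
toy_reads = {"r1": 12000, "r2": 9000, "r3": 3000, "r4": 1500, "r5": 800}
seeds, cutoff = select_seed_reads(toy_reads, genome_size=10000, seed_coverage=2.0)
print(seeds, cutoff)
```

In this sketch, reads shorter than the cutoff are not seeds; in the workflow described above they would be the reads aligned to the seeds to build the consensus.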