Benchmarking LoMA Long-Read Assembly Tool

To estimate the accuracy of LoMA, we compared consensus sequences (CSs) assembled from the ONT data of NA18943 with GRCh38. We randomly selected 108 positions from the human genome, excluding centromeres and gaps (Additional file 1: Table S1). For each position, we collected all NA18943 reads mapped within 20 kbp and constructed CSs using LoMA. We aligned the generated CSs to GRCh38 using minimap2 [16] and calculated the error rates from the edit distance. We also aligned all raw reads to GRCh38 and calculated their error rates in the same way. For comparison, we assembled the same regions using lamassemble [15] with the following options:
-P 8 -a -v -p 2e-3 -m 2*(number of reads) -z 1000 promethion.mat
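Because the `-m` value depends on the number of reads collected for each region, the option list has to be built per region. A minimal sketch of that step, assuming the reads for a region are available as FASTA text; the helper names and the `reads.fa` file name are hypothetical, and only the flags quoted above come from the text:

```python
def count_fasta_reads(fasta_text):
    """Count records in FASTA-formatted text (one '>' header line per read)."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def lamassemble_args(n_reads, matrix="promethion.mat", reads_path="reads.fa"):
    """Build a lamassemble argument list with -m set to 2 * n_reads.

    reads_path is a hypothetical input file name; the flags mirror the
    options quoted in the text.
    """
    return [
        "lamassemble",
        "-P", "8",
        "-a",
        "-v",
        "-p", "2e-3",
        "-m", str(2 * n_reads),
        "-z", "1000",
        matrix,
        reads_path,
    ]
```

The returned list could be passed to `subprocess.run` once the reads for a region have been written to disk.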
The error rate of lamassemble was calculated as above.
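The text does not state how the edit distance was normalized. One common convention, assumed here, divides the edit distance (reported by minimap2 in the `NM` tag) by the number of alignment columns derived from the CIGAR string. A minimal sketch under that assumption, with illustrative function names:

```python
import re

# One CIGAR operation: a length followed by an operation code.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def alignment_columns(cigar):
    """Count alignment columns: matches/mismatches, insertions, deletions.

    Clipping (S, H), skips (N), and padding (P) are not counted.
    """
    return sum(int(n) for n, op in _CIGAR_OP.findall(cigar) if op in "M=XID")


def error_rate(cigar, edit_distance):
    """Edit distance per alignment column (an assumed normalization)."""
    columns = alignment_columns(cigar)
    if columns == 0:
        raise ValueError("CIGAR string contains no aligned columns")
    return edit_distance / columns
```

For example, a record with CIGAR `100M` and `NM:i:5` would give an error rate of 5 per 100 columns under this definition.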
We also evaluated LoMA using simulated data. We randomly selected 100 regions from GRCh38 (Additional file 1: Table S2). Simulated reads were generated using NanoSim with the error profile of NA12878 (total error rate, 10.8%) provided by the developers [23]. Two series of data sets were generated for each region: coverages of 10×, 20×, 30×, 40×, and 50× (with a fixed size of 20 kbp), and targeted sizes of 20 kbp, 40 kbp, 60 kbp, 80 kbp, and 100 kbp (with a fixed mean coverage of 30×). The error rate, CPU time, and peak memory (RSS) were measured. Performance was measured on a computer with an Apple M1 chip. The error rate (edit distance) was calculated as described above.
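The simulation design amounts to two one-dimensional sweeps per region. A minimal sketch that enumerates the settings and the approximate number of bases each one requires; the names are illustrative, and the base count is simply coverage multiplied by region size, not a value reported in the text:

```python
COVERAGES = [10, 20, 30, 40, 50]                      # fold coverage
SIZES_BP = [20_000, 40_000, 60_000, 80_000, 100_000]  # targeted size in bp
FIXED_SIZE_BP = 20_000   # size held constant in the coverage sweep
FIXED_COVERAGE = 30      # coverage held constant in the size sweep


def simulation_grid():
    """Return (coverage, size_bp) pairs for both sweeps, in order."""
    coverage_sweep = [(c, FIXED_SIZE_BP) for c in COVERAGES]
    size_sweep = [(FIXED_COVERAGE, s) for s in SIZES_BP]
    return coverage_sweep + size_sweep


def target_bases(coverage, size_bp):
    """Approximate total simulated bases needed for one setting."""
    return coverage * size_bp
```

The setting with 30× coverage and 20 kbp appears in both sweeps, so the grid holds ten entries but nine distinct settings; the text does not say whether that data set was generated once or twice.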

Ikemoto K, Fujimoto H, & Fujimoto A. (2023). Localized assembly for long reads enables genome-wide analysis of repetitive regions at single-base resolution in human genomes. Human Genomics, 17, 21.

Publication 2023

Other organizations : The University of Tokyo

Variable analysis

independent variables

Coverage (10×, 20×, 30×, 40×, 50×)
Targeted size (20 kbp, 40 kbp, 60 kbp, 80 kbp, 100 kbp)

dependent variables

Error rate (edit distance)
CPU time
Peak memory (RSS)

control variables

Fixed size of 20 kbp for the coverage experiment
Fixed mean coverage of 30× for the targeted size experiment
Error profile of NA12878 (total error rate, 10.8%) used for read simulation
