Parallel Preprocessing of Paired-End Sequencing Data with fastp

fastp is designed for multithreaded parallel processing. Reads loaded from FASTQ files are grouped into packs of N reads (N = 1000). Each pack is consumed by one thread in the pool, which processes every read in the pack. Each thread keeps its own context for the statistical values of the reads it processes, such as per-cycle quality profiles, per-cycle base contents, adapter trimming results and k-mer counts. These values are merged after all reads have been processed, and a reporter then generates reports in HTML and JSON formats. fastp reports statistical values for both pre-filtering and post-filtering data, so that data quality before and after filtering can be compared directly.
fastp supports both single-end (SE) and paired-end (PE) data. Most steps of SE and PE data processing are similar, but PE data processing requires some additional steps, such as overlap analysis. For the sake of simplicity, we only demonstrate the main workflow of paired-end data preprocessing, shown in Figure 1.
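The pack-and-merge scheme described above can be illustrated with a minimal Python sketch. This is not fastp's implementation (fastp is written in C++); the names `make_packs`, `process_pack`, `merge_stats` and `run` are hypothetical, only two of the listed statistics are tracked, Phred+33 quality encoding is assumed, and for brevity the sketch keeps one context per pack rather than one per thread.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

PACK_SIZE = 1000  # N in the text


def make_packs(reads, size=PACK_SIZE):
    """Yield successive lists of at most `size` reads."""
    it = iter(reads)
    while True:
        pack = list(islice(it, size))
        if not pack:
            return
        yield pack


def process_pack(pack):
    """Collect statistics for one pack in its own, unshared context."""
    stats = {"reads": 0, "base_content": Counter(), "qual_sum": Counter()}
    for seq, qual in pack:
        stats["reads"] += 1
        for cycle, (base, q) in enumerate(zip(seq, qual)):
            stats["base_content"][(cycle, base)] += 1  # per-cycle base content
            stats["qual_sum"][cycle] += ord(q) - 33    # assumes Phred+33
    return stats


def merge_stats(contexts):
    """Merge the individual contexts once all packs are processed."""
    merged = {"reads": 0, "base_content": Counter(), "qual_sum": Counter()}
    for ctx in contexts:
        merged["reads"] += ctx["reads"]
        merged["base_content"].update(ctx["base_content"])
        merged["qual_sum"].update(ctx["qual_sum"])
    return merged


def run(reads, threads=4):
    """Process (sequence, quality) tuples pack by pack on a thread pool."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return merge_stats(pool.map(process_pack, make_packs(reads)))
```

Because no context is shared while reads are being processed, no locking is needed until the final merge. The sketch shows the structure only; it makes no claim about performance.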

Algorithm 1. Adapter sequence detection

for seed in sorted_adapter_seeds:
    seqs_after_seed = get_seqs_after(seed)
    forward_tree = build_nucleotide_tree(seqs_after_seed)
    found = True
    node = forward_tree.root
    after_seed = ""
    while node.is_not_leaf():
        if node.has_dominant_child():
            node = node.dominant_child()
            after_seed = after_seed + node.base
        else:
            found = False
            break
    if not found:
        continue
    else:
        seqs_before_seed = get_seqs_before(seed)
        backward_tree = build_nucleotide_tree(seqs_before_seed)
        node = backward_tree.root
        before_seed = ""
        while node.is_not_leaf():
            if node.has_dominant_child():
                node = node.dominant_child()
                before_seed = node.base + before_seed
            else:
                break
        adapter = before_seed + seed + after_seed
        break
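
Algorithm 1 leaves its helpers undefined, so the following is one possible runnable reading of it in Python, not fastp's actual code. Several details are assumptions made for illustration: seeds are taken to be k-mers ranked by how often they occur, `SEED_LEN` and the `DOMINANCE` threshold are hypothetical values, a child is treated as dominant when it holds at least that share of its siblings' counts, and the sequences before the seed are reversed so that the backward tree grows away from the seed.

```python
from collections import Counter

SEED_LEN = 6      # hypothetical seed (k-mer) length
DOMINANCE = 0.9   # hypothetical share of sibling counts needed to dominate


class Node:
    def __init__(self, base=""):
        self.base = base
        self.count = 0
        self.children = {}

    def is_not_leaf(self):
        return bool(self.children)

    def dominant_child(self):
        """Return the dominant child, or None if no child dominates."""
        total = sum(c.count for c in self.children.values())
        best = max(self.children.values(), key=lambda c: c.count)
        return best if best.count >= DOMINANCE * total else None


def build_nucleotide_tree(seqs):
    """Build a prefix tree whose nodes count the sequences passing through."""
    root = Node()
    for seq in seqs:
        node = root
        for base in seq:
            node = node.children.setdefault(base, Node(base))
            node.count += 1
    return root


def follow_dominant_path(root):
    """Walk dominant children; return (bases, reached_leaf)."""
    bases, node = [], root
    while node.is_not_leaf():
        child = node.dominant_child()
        if child is None:
            return bases, False
        node = child
        bases.append(node.base)
    return bases, True


def detect_adapter(reads):
    """Return the inferred adapter sequence, or None if no seed succeeds."""
    kmers = Counter(r[i:i + SEED_LEN] for r in reads
                    for i in range(len(r) - SEED_LEN + 1))
    # Most frequent seeds first; ties are broken alphabetically.
    for seed in sorted(kmers, key=lambda k: (-kmers[k], k)):
        hits = [(r, r.find(seed)) for r in reads if seed in r]
        after = [r[p + SEED_LEN:] for r, p in hits]
        bases, found = follow_dominant_path(build_nucleotide_tree(after))
        if not found:
            continue
        after_seed = "".join(bases)
        # Reverse the upstream text so the tree grows away from the seed.
        before = [r[:p][::-1] for r, p in hits]
        bases, _ = follow_dominant_path(build_nucleotide_tree(before))
        before_seed = "".join(reversed(bases))
        return before_seed + seed + after_seed
    return None
```

Calling `detect_adapter(reads)` on a list of read sequences returns the extended adapter string. As in Algorithm 1, a seed is rejected when the forward walk stops before reaching a leaf, whereas the backward walk simply stops at the first node without a dominant child, which is where the adapter gives way to insert sequence.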

Chen S., Zhou Y., Chen Y., & Gu J. (2018). fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics, 34(17), i884-i890.

Publication 2018

Corresponding Organization: Shenzhen Institutes of Advanced Technology

Protocol cited in 510 other protocols

Variable analysis

independent variables

Reads loaded from FASTQ files

dependent variables

Per-cycle quality profiles
Per-cycle base contents
Adapter trimming results
K-mer counts

control variables

None explicitly mentioned

positive controls

None explicitly mentioned

negative controls

None explicitly mentioned
