
FASTX-Toolkit

Developed by the Hannon Lab (Cold Spring Harbor Laboratory)

The FASTX-Toolkit is a collection of command-line tools for preprocessing and manipulation of FASTA/FASTQ files. It provides utilities for sequence quality control, trimming, and format conversion.

Automatically generated - may contain errors

12 protocols using fastx toolkit

1

Pacbio-Illumina Hybrid Genome Sequencing

The genome was sequenced using the PacBio RS, which can generate continuous long reads (CLRs) of up to 10 kb in length that can be used to upgrade draft genomes containing gaps using PBJelly (Ver. 12.9.14) [22]. However, CLRs show only 82.1% to 84.4% base accuracy [55]. Error correction was therefore performed using the command pacBioToCA [56] with the parameters -length 500, -partitions 200, -shortReads, -l NC, -t 20, and -s pacbio.spec. Illumina reads (50× read coverage of the genome) were used for correction after trimming with FASTX-Toolkit [56] using the parameters -t 20, -l 50, and -Q 33. The pacbio.spec file specified the parameters for overlapping Illumina and PacBio data during correction: utgErrorRate = 0.25, utgErrorLimit (value missing), cnsErrorRate = 0.25, cgwErrorRate = 0.25, ovlErrorRate = 0.25, and merSize = 10. After correction, the PacBio-corrected reads were analyzed using FastQC [57]. A total of 2,640,379 CLRs (7.6× read coverage of the genome) were used for error correction, which generated 2,415,333 error-corrected reads (2.3× read coverage of the genome) (Additional file 1: Table S1). The average CLR length decreased from 1,819 to 969 bp. The resulting error-corrected CLRs were used for gap filling.
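The trimming parameters quoted above (-t 20, -l 50, -Q 33) can be illustrated with a minimal Python sketch. It assumes they carry the meaning they have in the toolkit's fastq_quality_trimmer (quality threshold, minimum length, Phred+33 encoding); the protocol does not name the specific tool, and the function name trim_3prime is hypothetical.

```python
def trim_3prime(seq, qual, threshold=20, min_len=50, offset=33):
    """Trim low-quality bases from the 3' end of one read.

    Assumption: -t is the quality threshold, -l the minimum length kept,
    and -Q 33 the ASCII offset of the quality string.
    Returns (seq, qual) after trimming, or None if the read is too short.
    """
    end = len(seq)
    # Walk back from the 3' end while the base quality is below the threshold.
    while end > 0 and ord(qual[end - 1]) - offset < threshold:
        end -= 1
    if end < min_len:
        return None  # read discarded
    return seq[:end], qual[:end]


# 'I' encodes Phred 40 and '#' encodes Phred 2 under the +33 offset.
kept = trim_3prime("A" * 60, "I" * 55 + "#" * 5)
dropped = trim_3prime("A" * 60, "I" * 40 + "#" * 20)
```

Here the first read is shortened to 55 bases and kept, while the second falls below the 50-base minimum and is discarded.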
2

RNA-seq Transcriptome Assembly of Insects

Insects used for RNA-seq were collected from the same laboratory population described above and reared on coffee parchment at ~70% humidity. Total RNA was isolated separately from pooled whole-body female and male adults (30 and 50 individuals, respectively) using the RNeasy Mini Kit (Qiagen), including a DNase I step to remove genomic DNA contamination. Illumina RNA-seq single-end library construction using the TruSeq RNA Library Prep Kit v2 and sequencing on a HiSeq 2500 platform were performed by BGI (Hong Kong). Adaptors were removed from the raw Illumina reads, which were then trimmed and quality filtered using the default parameters of the FASTX-Toolkit v.0.0.14. Transcript assembly was performed using rnaSPAdes v.3.14.0 [25] with default parameters. Transcript redundancy was reduced by clustering sequences with CD-HIT v.4.8.1 [26] at default options. Sequence contamination was removed using a BLASTn search as described above.
3

Whole-Genome Sequencing for Listeria monocytogenes

Whole-genome sequencing data for the 180 L. monocytogenes isolates were processed using Haplo-ST (S1 Fig, [26]) for allelic profiling of 2554 genes per isolate. Haplo-ST first cleaned the raw Illumina whole-genome sequencing reads, obtained as previously described (S1 File), using the FASTX-Toolkit [27]. Reads were trimmed to remove all bases with a Phred quality score of < 20 from both ends and filtered so that at least 90% of bases in the clean reads had a quality of at least 20. After trimming and filtering, all remaining reads shorter than 50 bp were discarded. Haplo-ST then used YASRA [28] to assemble the cleaned reads into allele sequences and assigned wgMLST profiles to the assembled allele sequences with BIGSdb-Lm (available at http://bigsdb.pasteur.fr/listeria).
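The three cleaning rules described above (trim both ends at Phred 20, require 90% of bases at Phred 20 or better, drop reads shorter than 50 bp) can be sketched in Python as follows. This is an illustration of the stated thresholds only, not Haplo-ST's actual code; the function names and the Phred+33 encoding are assumptions.

```python
def phred_scores(qual, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(c) - offset for c in qual]


def clean_read(seq, qual, min_q=20, min_fraction=0.90, min_len=50):
    """Apply the trimming and filtering rules to one read.

    Returns the cleaned (seq, qual) pair, or None if the read is rejected.
    """
    scores = phred_scores(qual)
    start, end = 0, len(scores)
    # Trim bases below min_q from the 5' end, then from the 3' end.
    while start < end and scores[start] < min_q:
        start += 1
    while end > start and scores[end - 1] < min_q:
        end -= 1
    kept = scores[start:end]
    if not kept:
        return None
    # Require that at least min_fraction of remaining bases reach min_q.
    if sum(q >= min_q for q in kept) / len(kept) < min_fraction:
        return None
    # Discard reads that are too short after trimming.
    if len(kept) < min_len:
        return None
    return seq[start:end], qual[start:end]


read_seq = "ACGTA" * 13                     # 65 bases
read_qual = "#" * 3 + "I" * 60 + "#" * 2   # low-quality bases at both ends
cleaned = clean_read(read_seq, read_qual)
```

In this example the three leading and two trailing low-quality bases are removed and the remaining 60-base read passes both filters.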
4

Full-length genomic read extraction

After quality filtering and Illumina sequencing adaptor trimming with FASTX-Toolkit (v0.0.13), the raw paired-end reads were merged into single-end reads using FLASh (v1.2.11). The correlated 5′-end and 3′-end sequences were extracted with a custom script (fasta_to_paired.sh) that uses the SeqKit (v2.4.0) and Cutadapt (v4.1) packages. The inferred full-length reads were generated with Bedtools (v2.31.0) and Samtools (v1.17) after mapping to the reference genome (NC_000913.3 for Eco, NC_008596.1 for Msm and NC_018143.2 for Mtb) with Bowtie 2 (v2.5.1). Full-length reads with an insert length greater than 10,000 nt were discarded. The mapping results were visualized using the IGV genome viewer (v2.4.10). Data analysis and visualization scripts used Python packages including Matplotlib (v3.7.1), NumPy (v1.24.3), SciPy (v1.10.1), bioinfokit (v0.3), and pyCircos (v0.3.0).
5

Genome Assembly of Fungal Pathogens

Illumina paired-end reads were quality filtered using FASTX-Toolkit (version 0.0.13.2). Adapter sequences were clipped using Cutadapt version 1.2.1 [29]. Paired reads in which at least 80% of bases had a quality score greater than Q30 were then chosen for further analysis (the Q score is the quality score specified by Illumina, which indicates the probability of a base-calling error; Q30 means the probability of an incorrect base call is 1 in 1000). We attempted both de novo and reference-based assembly of the genomes using Velvet 1.2.09; however, reference-based assembly was used for further analysis because it yielded a better assembly [30]. M. oryzae 70-15 was used as the reference strain for reference-based assembly. The whole-genome assembly is available at NCBI/DDBJ/EMBL with the accessions AXDJ01000000 for B157 and AYPX01000000 for MG01.
Contig ordering, gap filling and re-scaffolding were performed using various integrated tools to improve assembly quality. We used the ABACAS tool for contig ordering against the reference [31]. The Iterative Mapping and Assembly for Gap Elimination (IMAGE) [32] method was used to fill the gaps in the assembly. After successful completion of the iterative assembly, the pre-assembled contigs were merged back into scaffolds using SSPACE (SSAKE-based Scaffolding of Pre-Assembled Contigs after Extension) [33].
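The Q30 statement follows from the Phred definition, p = 10^(-Q/10). The short Python sketch below shows that conversion together with the 80% rule. Applying the rule to both mates of a pair, the Phred+33 encoding, and the function names are assumptions made for illustration; the protocol does not spell them out.

```python
def error_probability(q):
    """Phred definition: probability that a base call with score q is wrong."""
    return 10 ** (-q / 10)


def fraction_above(qual, q_min=30, offset=33):
    """Fraction of bases whose score is strictly greater than q_min."""
    scores = [ord(c) - offset for c in qual]
    return sum(s > q_min for s in scores) / len(scores)


def keep_pair(qual1, qual2, q_min=30, min_fraction=0.80):
    """Keep a read pair only if both mates meet the 80% > Q30 rule (assumed)."""
    return (fraction_above(qual1, q_min) >= min_fraction
            and fraction_above(qual2, q_min) >= min_fraction)


p_q30 = error_probability(30)                        # about 0.001, i.e. 1 in 1000
pair_ok = keep_pair("I" * 10, "I" * 8 + "#" * 2)     # both mates at 80% or more
pair_bad = keep_pair("I" * 10, "I" * 7 + "#" * 3)    # second mate only 70%
```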
6

High-throughput sequencing of Hth-Exd complexes

Libraries for HthFL-Exd and HthFL-ExdR2A,R5A (Lib-16) were sequenced using a v2 75-cycle high-output kit on an Illumina NextSeq Series desktop sequencer at the Genome Center at Columbia University. Libraries Lib-Hth-F and Lib-Hth-R with either the Hth or Exd shape-readout mutant in complex with the respective other wild-type protein and Dfd, as well as the Lib-30 HthFL-Exd-Dfd experiment, were all sequenced at the New York Genome Center using separate lanes on an Illumina HiSeq 2000 sequencing machine. Libraries Lib-Hth-F and Lib-Hth-R with wild-type proteins were also sequenced on a HiSeq instrument at a different facility. Reads were trimmed to remove Illumina and library-internal adapter sequences using the FASTX-Toolkit (Hannon lab) and loaded into the R environment using the R package SELEX (http://bioconductor.org/packages/SELEX) (Riley, 2014).
7

Illumina Sequence Reads Quality Control

Illumina sequence reads were assessed for quality and adjusted using the FASTX-Toolkit. The FASTX Artifacts Filter was used to eliminate reads containing artifacts such as poly-A regions, although most reads containing artifacts had already been eliminated by Illumina's own filtering. The FASTQ Quality Filter, set to a minimum quality score threshold of 20 and a minimum read length of 47, was used to eliminate low-quality reads. The FASTX Trimmer was used to remove individual base positions that showed very low quality across all reads.
8

Breast Cancer miRNA Sequencing Protocol

Small RNA sequencing was performed on a single lane of an Illumina HiSeq 1000 with eight multiplexed libraries from the four breast cancer cell lines. The reads obtained from deep sequencing of small RNAs were subjected to Illumina adaptor trimming using the FASTX-Toolkit and were size filtered with an in-house Perl script to select candidate miRNAs (14 to 24 bases) from the pool of small RNA sequences. The size-selected reads were then mapped onto human miRNA sequences obtained from miRBase (version 21) using Bowtie2 (version 2.1.0) [16] with 0 mismatches in the first 8 bases. MicroRNAs were quantified and then normalised to reads per million using an in-house script. Deregulated miRNAs with ≥ 3-fold change were retained for further analysis. To search for microRNAs targeting the PR 3′UTR, microRNAs differentially expressed in response to progesterone were compared to microRNAs predicted to target PR by 6 algorithms (TargetScan, miRanda, miRWalk, miRMap, RNA22 and RNAhybrid).
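The in-house scripts are not published with this excerpt, so the following Python sketch only illustrates the three steps named above: size selection (14 to 24 bases), reads-per-million normalisation, and a 3-fold change cutoff. The function names, the miRNA identifiers, the use of total counted reads as the denominator, and the pseudocount are all assumptions.

```python
def size_filter(reads, min_len=14, max_len=24):
    """Keep reads whose length falls in the candidate miRNA range."""
    return [r for r in reads if min_len <= len(r) <= max_len]


def reads_per_million(counts):
    """Scale raw counts so that they sum to one million."""
    total = sum(counts.values())
    return {name: n * 1_000_000 / total for name, n in counts.items()}


def deregulated(rpm_a, rpm_b, min_fold=3.0, pseudocount=1.0):
    """Return miRNAs changed at least min_fold in either direction.

    The pseudocount (an assumption) avoids division by zero for
    miRNAs absent from one sample.
    """
    hits = {}
    for name in set(rpm_a) | set(rpm_b):
        fold = (rpm_b.get(name, 0.0) + pseudocount) / (rpm_a.get(name, 0.0) + pseudocount)
        if fold >= min_fold or fold <= 1 / min_fold:
            hits[name] = fold
    return hits


selected = size_filter(["A" * 13, "A" * 14, "A" * 24, "A" * 25])
rpm = reads_per_million({"miR-x": 1, "miR-y": 3})  # hypothetical names
changed = deregulated({"miR-x": 99.0, "miR-y": 10.0},
                      {"miR-x": 399.0, "miR-y": 12.0})
```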
9

High-throughput Sequencing with Purified Amplicons

For targeted high-throughput sequencing, PAGE-purified primers containing the sequencing adapter and target sequence were used to produce amplicons. After amplification, libraries were PAGE separated and target fragments gel purified. Libraries were sequenced using the Illumina NextSeq500 platform. Reads were preprocessed using the FASTX Toolkit (Hannon laboratory) and reads less abundant than 0.01% of the most abundant read were excluded. Insertions and deletions were quantified using custom scripts and manually verified. For whole-genome sequencing, libraries were prepared using the TruSeq Stranded mRNA Library Prep Kit (Illumina), and reads were processed using the FASTX Toolkit and mapped to the TuMV genome with BWA.
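The abundance cutoff for amplicon reads (excluding reads less abundant than 0.01% of the most abundant read) can be sketched in Python as below. The protocol's own scripts are not shown, so the function name and the choice to collapse identical sequences before counting are assumptions. Integer arithmetic is used so that the 1-in-10,000 boundary is exact.

```python
from collections import Counter


def filter_rare(reads, one_in=10_000):
    """Drop sequences rarer than 1/one_in of the most abundant sequence.

    one_in=10_000 corresponds to the 0.01% threshold.
    Returns a dict mapping each retained sequence to its count.
    """
    counts = Counter(reads)
    if not counts:
        return {}
    top = max(counts.values())
    # n / top >= 1 / one_in, rewritten without floating-point division.
    return {seq: n for seq, n in counts.items() if n * one_in >= top}


reads = ["AAAA"] * 20000 + ["CCCC"] * 2 + ["GGGG"]
retained = filter_rare(reads)
```

With 20,000 copies of the most abundant read, the cutoff is 2 copies, so the sequence seen twice is kept and the singleton is excluded.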
10

Ribosome Profiling Analysis Pipeline

To process RPF sequencing reads, Illumina adapters were removed using fastx_clipper from the FASTX-Toolkit. Ribosomal RNA and tRNA were removed using Bowtie version 1.0.0 [5]. Remaining reads were aligned to the genome (hg19 / GRCh37) and transcriptome using STAR version 2.5.3a [6] (--alignIntronMin 20 --alignIntronMax 100000 --outFilterMismatchNmax 1 --outFilterType BySJout --outFilterMismatchNoverLmax 0.04 --twopassMode Basic). For the transcriptome annotation, the GENCODE v26lift37 annotation was combined with transcripts carrying the tstatus “unannotated” from the MiTranscriptome annotation [7]. To determine RPF library quality, trinucleotide codon periodicity was plotted using the RibORF readDist script [8] against annotated protein-coding ORFs (GENCODE v26lift37). Only samples and read lengths that showed clear trinucleotide periodicity were used for subsequent ORF predictions.
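As a rough illustration of what "clear trinucleotide periodicity" means, the Python sketch below computes the fraction of reads falling in each codon frame of an annotated ORF. It is not a reimplementation of the RibORF readDist script; the function name, the input format (inferred P-site positions on the transcript), and the example coordinates are hypothetical.

```python
from collections import Counter


def frame_fractions(psite_positions, cds_start):
    """Fraction of reads in codon frames 0, 1 and 2 of one ORF.

    psite_positions: transcript coordinates of inferred P-sites (assumed input).
    cds_start: coordinate of the first base of the start codon.
    """
    frames = Counter((pos - cds_start) % 3
                     for pos in psite_positions if pos >= cds_start)
    total = sum(frames.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [frames[f] / total for f in range(3)]


# A library with good periodicity concentrates reads in a single frame.
fractions = frame_fractions([100, 103, 106, 109, 101, 110], cds_start=100)
```

A strongly uneven distribution across the three frames, as in this toy input, is the pattern the periodicity check looks for; a flat distribution would suggest a poor-quality library or read length.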
