67 protocols using FastQC

1

Evaluating Quality and Contamination in Sequencing Data

The technical quality and potential sample contamination in Illumina PE reads were evaluated using FastQC v0.11.8 (FastQC, RRID:SCR_014583) and FastQ Screen v0.11.1 (FastQ Screen, RRID:SCR_000141), respectively. The technical quality of PacBio raw data was checked using the “QC module” in the PacBio SMRT Analysis Software SMRT Link version 8.0 (SMRT-Analysis, RRID:SCR_002942). Iso-Seq reads were clustered into high-quality (accuracy 99.9%, HQ) transcripts using the “Iso-Seq Analysis” Application in PacBio SMRT Analysis Software (SMRT Link v10.1.0.119588). The technical quality of Hi-C data was checked using HiCUP v0.8.0 (HiCUP, RRID:SCR_005569) [53].
2

Time-Resolved Transcriptomic Profiling of Human Cells

Raw reads were trimmed to remove low-quality base calls and Illumina universal adapters using Trim Galore! (version 0.6.5) with default parameters and then assessed using FastQC (version 0.11.4) and MultiQC (version 1.10.1). Reads were then aligned to the human genome (GRCh38) using STAR with default parameters. Alignment quality control was performed using RSeQC and Qualimap. Quantification was performed using RSEM. Quantification quality control was performed using EDASeq (version 2.3) and NOISeq (version 2.4). Time-course differential expression analysis was performed using maSigPro (version 1.68). Differential time-course genes were clustered by first identifying the optimal number of clusters using mclust (version 5.4.1) and then clustering with the k-means method. Gene ontology analysis was performed using clusterProfiler (version 4.4.4), where cell-type enrichments utilized MSigDB (version 7.5.1). The code used for the analysis is provided in Additional file 2.
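
The trimming and read-level QC steps above could be scripted roughly as follows. This is a minimal sketch, not the authors' code (which is in their Additional file 2): the function names, file names and directory layout are hypothetical, and paired-end input is an assumption, since the protocol does not state the read layout. The functions only build argument lists; nothing is executed until they are passed to `subprocess.run`.

```python
from pathlib import Path


def trim_command(read1, out_dir, read2=None):
    """Build a Trim Galore! call that uses default parameters.

    Only the output directory is set explicitly. Pass read2 for
    paired-end data (an assumption; the protocol does not say).
    """
    cmd = ["trim_galore", "--output_dir", str(out_dir)]
    if read2 is not None:
        cmd += ["--paired", str(read1), str(read2)]
    else:
        cmd.append(str(read1))
    return cmd


def fastqc_command(fastq_files, out_dir):
    """Build a FastQC call that writes its reports to out_dir."""
    return ["fastqc", "--outdir", str(out_dir), *[str(f) for f in fastq_files]]


def multiqc_command(report_dir, out_dir):
    """Build a MultiQC call that aggregates the reports found in report_dir."""
    return ["multiqc", str(report_dir), "--outdir", str(out_dir)]


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the call order.
    work = Path("qc_work")
    for cmd in (
        trim_command("sample_R1.fastq.gz", work / "trimmed", "sample_R2.fastq.gz"),
        fastqc_command([work / "trimmed" / "sample_R1_trimmed.fq.gz"], work / "fastqc"),
        multiqc_command(work / "fastqc", work / "multiqc"),
    ):
        print(" ".join(cmd))
```

To run a step for real, pass the list to `subprocess.run(cmd, check=True)` after creating the output directories.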
3

Genome Sequencing of Evolved Nitrite Strains

We sequenced the genomes of evolved isolates using methodology described elsewhere [20]. Briefly, we streaked each evolved nitrite (NO2) cross-feeding co-culture onto LB agar plates containing 10 μg ml⁻¹ of gentamicin and 0.1 mM of IPTG and picked a single colony each of the nitrite-producing and nitrite-reducing strains (each colony expressed a different fluorescent protein) from each co-culture for genome sequencing. We grew the single clones in LB medium overnight and extracted the DNA with a Wizard Genomic DNA purification kit (Promega, Madison, WI). We then sent the extracted DNA to the ETH Quantitative Genomics Facility (Basel, Switzerland) for sequencing. The genomes were sequenced with an Illumina HiSeq 200 sequencer (Illumina, San Diego, CA) with 100 cycles of paired-end sequencing. Primary data analysis, de-multiplexing and quality control analysis of the sequencing data were performed using FastQC (Illumina, San Diego, CA). We reported the complete set of parameters used for quality control elsewhere [17].
4

Quality Assessment and Filtration of NGS Sequencing Data

We used FastQC v0.11.8 (FastQC, RRID:SCR_014583) [12] to assess overall sequencing quality for MGI and Illumina sequencing platforms. PCR duplicates (a read pair was considered a duplicate when both its forward and reverse reads were identical to those of another pair) were detected by PRINSEQ v0.20.4 (PRINSEQ, RRID:SCR_005454) [26]. The random sequencing error rate was calculated by measuring the occurrence of “N” bases at each read position in raw reads. Reads with sequencing adapter contamination were examined according to the manufacturer's adapter sequences (Illumina sequencing adapter left = “GATCGGAAGAGCACACGTCTGAACTCCAGTCAC,” Illumina sequencing adapter right = “GATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT,” MGI sequencing adapter left = “AAGTCGGAGGCCAAGCGGTCTTAGGAAGACAA,” and MGI sequencing adapter right = “AAGTCGGATCGTAGCCATGTCGTTCTGTGAGCCAAGGAGTTG”). We conducted base quality filtration of raw reads using the NGS QC Toolkit v2.3.3 (cut-off read length for high quality, 70; cut-off quality score, 20) (NGS QC Toolkit, RRID:SCR_005461) [27]. After removing low-quality reads and adapter-containing reads, we used the remaining clean reads for the mapping step.
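
The three read-level checks described above (duplicate pairs, per-position “N” rate, adapter presence) can be illustrated with a short sketch. It is not the PRINSEQ or NGS QC Toolkit implementation: the function names are hypothetical, reads are assumed to be plain sequence strings already parsed from FASTQ, and adapter detection is simplified to exact substring matching.

```python
from collections import Counter

# Adapter sequence copied from the protocol text above.
ILLUMINA_ADAPTER_LEFT = "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"


def duplicate_fraction(pairs):
    """Fraction of read pairs whose forward AND reverse reads both
    match an earlier pair exactly."""
    seen = set()
    duplicates = 0
    total = 0
    for forward, reverse in pairs:
        total += 1
        key = (forward, reverse)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates / total if total else 0.0


def n_rate_per_position(reads):
    """Fraction of reads carrying an 'N' at each read position.

    Reads may differ in length; each position is normalised by the
    number of reads that are long enough to cover it.
    """
    n_counts = Counter()
    coverage = Counter()
    for seq in reads:
        for i, base in enumerate(seq):
            coverage[i] += 1
            if base in "Nn":
                n_counts[i] += 1
    return [n_counts[i] / coverage[i] for i in range(len(coverage))]


def contains_adapter(seq, adapters):
    """True if any adapter occurs verbatim in the read (a simplification:
    real tools usually also allow mismatches and partial matches)."""
    return any(adapter in seq for adapter in adapters)
```

For example, `n_rate_per_position(["ACGN", "NCGT"])` gives a rate of 0.5 at the first and last positions and 0 in between.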
5

NextSeq Sequencing and Data Processing

The libraries were sequenced using a 75 bp Illumina NextSeq 400M high output kit (parameters of the sequencing run can be found in Table S2). In addition, 5% PhiX was used as a spike-in control. Illumina’s bcl2fastq script was used to generate the fastq files, which were subsequently quality controlled using FastQC (Andrews, 2010). The data were further filtered, quantified (run with the option --min-reads 1000 to discard sequencing background from the downstream analysis), and sorted using the inDrop analysis pipeline (parameters of the yaml can be found in Table S2).
6

RNA-seq Libraries Preparation with TruSeq

The RNA-seq libraries were prepared with the TruSeq® Stranded mRNA sample prep kit (Illumina). Samples depleted of rRNA were fragmented and reverse-transcribed with random hexamers, Superscript II (Life Technologies) and actinomycin D. During the generation of the second strand, dTTP was replaced with dUTP. Double-stranded cDNAs were adenylated at their 3′ ends before ligation with Illumina indexed adapters. Ligated cDNAs were amplified by 15 cycles of PCR and purified with AMPure XP Beads (Beckman Coulter Genomics). Libraries were validated with a DNA1000 chip (Agilent) and quantified with the KAPA Library quantification kit (Clinisciences). Twelve libraries were pooled in equimolar amounts in a single lane and were sequenced on a HiSeq2000 machine, with the single-read protocol (50 nt). Image analysis and base-calling were performed with Illumina HiSeq Control Software and the Real-Time Analysis component. Demultiplexing was performed with Illumina sequencing analysis software (CASAVA 1.8.2). Data quality was assessed with FastQC from the Babraham Institute and the sequencing analysis viewer (SAV) from Illumina software.
7

Small RNA Sequencing Data Analysis

The raw data were cleaned using ACGT101-miR v3.5 (LC Sciences, Houston, TX). In brief, the quality of raw data was measured by Illumina FastQC to obtain Q30 data. Clean full-length reads were collected after removing all low-quality reads, adapter contaminants, reads shorter than 18 nt, and junk sequences (≥2 N, ≥7 A, ≥8 C, ≥6 G, ≥7 T, ≥10 Dimer, ≥6 Trimer or ≥5 Tetramer). In addition, the clean data were filtered against various RNA databases, such as mRNA, RFam (release 9.1) and Repbase (version 15.07), and rRNA, scRNA, snoRNA, snRNA, tRNA, etc. were identified and removed as far as possible. The remaining unique sequences were mapped to the precursors in miRBase 21.0 by the fast gapped-read alignment software Bowtie 2 [88]. The unique sequences mapping to species-specific mature miRNAs in hairpin arms were identified as known miRNAs. The unannotated sRNAs were extended to about 250 nt and their structures were predicted using Mfold software (http://unafold.rna.albany.edu/?q=mfold). Novel miRNAs were obtained according to the Meyers and Li prediction criteria [39, 89].
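
The length and junk-sequence filters above might look like the following sketch. It is not the ACGT101-miR implementation, and it rests on assumptions the protocol does not confirm: “≥2 N” is read as two or more N bases anywhere in the read, and “≥7 A”, “≥8 C”, “≥6 G” and “≥7 T” are read as uninterrupted homopolymer runs of at least that length. The dimer, trimer and tetramer rules are left out because their exact definition is not given. All names are hypothetical.

```python
import re

MIN_LENGTH = 18           # reads shorter than 18 nt are removed
N_LIMIT = 2               # reads with >= 2 N bases are removed
# Assumed interpretation: minimum length of a homopolymer run that
# marks a read as junk.
HOMOPOLYMER_LIMITS = {"A": 7, "C": 8, "G": 6, "T": 7}

# One pattern per base, e.g. "A{7,}" matches seven or more A in a row.
_RUN_PATTERNS = [
    re.compile(f"{base}{{{limit},}}") for base, limit in HOMOPOLYMER_LIMITS.items()
]


def is_clean(seq):
    """Return True if the read passes the length, N and homopolymer filters."""
    seq = seq.upper()
    if len(seq) < MIN_LENGTH:
        return False
    if seq.count("N") >= N_LIMIT:
        return False
    return not any(pattern.search(seq) for pattern in _RUN_PATTERNS)


def filter_reads(reads):
    """Keep only the reads that pass is_clean, preserving input order."""
    return [seq for seq in reads if is_clean(seq)]
```

Under a different reading of the thresholds (for example total base counts instead of runs), only `_RUN_PATTERNS` and the last line of `is_clean` would need to change.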
8

Complete Bacterial Genome Assembly

Total DNA was purified from overnight culture using Qiagen Genomic-tip 100/G columns (Qiagen, Germantown, MD, United States) per the manufacturer’s instructions. Whole-genome sequencing (WGS) was performed on the Illumina HiSeq 2500 platform and the Oxford Nanopore MinION platform. FastQC (version 0.11.9) and NanoQC (version 0.9.4) were used to assess the quality of short reads generated by Illumina and long reads generated by MinION, respectively. High-quality long reads were assembled de novo using Canu (version 2.1.1) (Koren et al., 2017). Contigs were circularized by Circlator using the following parameters: merge_min_id, 85; merge_breaklen, 1,000; bwa_opts, -x ont2d; assembler, canu (Hunt et al., 2015). High-quality short reads were used to correct circularized contigs with two iterations of Pilon (version 1.24) (Walker et al., 2014) correction and one round of Racon (version 1.4.3) (Vaser et al., 2017) polishing. All programs were run with default parameters unless otherwise specified.
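
The Circlator parameters listed above correspond to command-line options of `circlator all`. A minimal sketch of how that call could be assembled is shown below. The function name and file names are hypothetical, and the protocol does not say whether the authors ran `circlator all` or the individual Circlator stages, so the single-command form is an assumption.

```python
def circlator_command(assembly_fasta, long_reads, out_dir):
    """Build a `circlator all` call using the parameters stated in the protocol.

    The command is returned as an argument list and is not executed here.
    """
    return [
        "circlator", "all",
        "--assembler", "canu",
        "--merge_min_id", "85",
        "--merge_breaklen", "1000",
        # Kept as ONE argument so that bwa receives "-x ont2d" intact.
        "--bwa_opts", "-x ont2d",
        str(assembly_fasta),
        str(long_reads),
        str(out_dir),
    ]


if __name__ == "__main__":
    # Hypothetical file names for illustration only.
    print(circlator_command("canu.contigs.fasta", "nanopore_reads.fastq", "circlator_out"))
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting issues with the embedded space in the bwa options.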
9

Illumina Paired-End Read Processing

The initial quality of paired-end raw reads obtained from the Illumina sequencer was confirmed using the FASTQC tool (https://www.bioinformatics.babraham.ac.uk/projects/FASTQC/) (Illumina). Unwanted regions in the reads (adapters, low-quality reads, and ambiguous bases ‘N’) were trimmed, and high-quality trimmed reads were obtained for further analysis. The reads from each sample were normalized and assembled de novo separately using Trinity [65] (k-mer 25 [GitHub, San Francisco, CA, USA]). Trinity-generated assemblies were clustered based on sequence similarity. Transcripts were clustered using CD-HIT (cluster database at high identity with tolerance [GitHub]) at 95% identity and query coverage to reduce redundancy without excluding sequence diversity. Clustered transcripts were used for further annotation.
10

RNA-seq Variant Calling Pipeline

Sequence reads passing the quality filter from Illumina RTA were first checked using FastQC [64] and then mapped to the GENCODE (https://www.gencodegenes.org/) annotation database (V25) and the human reference genome (GRCh38.p7) using Tophat2 [65] with a lenient alignment strategy allowing at most two mismatches per read to accommodate potential editing events. The mapped bam files were further quality-checked using RSeQC [66]. Then, all samples were run through the GATK best practices pipeline of SNV calling (https://gatkforums.broadinstitute.org/gatk/discussion/3892/the-gatk-best-practices-for-variant-calling-on-rnaseq-in-full-detail) using RNA-seq data to obtain a list of candidate variant sites. All known SNPs from dbSNP (V144) [67] were removed from further analyses.
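
The final step, removing known dbSNP sites from the candidate list, amounts to a set difference. The sketch below shows the idea under stated assumptions: variants are represented as hypothetical `(chrom, pos, ref, alt)` tuples already parsed from VCF, and a candidate counts as known only when all four fields match. The protocol does not state the matching rule, so a position-only match would be an equally plausible reading.

```python
def remove_known_snps(candidates, known_snps):
    """Return the candidate variants that are absent from the known-SNP set.

    Both arguments are iterables of (chrom, pos, ref, alt) tuples.
    Input order of the candidates is preserved.
    """
    known = set(known_snps)
    return [variant for variant in candidates if variant not in known]


if __name__ == "__main__":
    # Toy records for illustration; these are not real dbSNP entries.
    candidates = [("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T")]
    known_snps = [("chr1", 2000, "C", "T")]
    print(remove_known_snps(candidates, known_snps))
```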