Custom ChIP-seq Peak Calling for Bacteria

Sequences were aligned to the MG1655 genome (NC_000913.2) using the CLC Genomics Workbench. Mapped reads were piled up and written to a .gff file using a custom Python script and viewed in SignalMap (Nimblegen). All ChIP-seq images presented in this study were captured from SignalMap and edited in the image editing software GIMP to highlight baselines (zero reads) and to fill gaps in the data resulting from image artifacts.
Almost all ChIP-seq analysis programs have been designed and optimized for eukaryotic ChIP-seq data and, in our experience, do not perform well with bacterial ChIP-seq data. We therefore wrote custom Python scripts to identify peaks in bacterial ChIP-seq data.
First, all datasets were normalized to 100 million reads. Replicate datasets were analyzed in pairs, and the plus and minus strands were considered separately. For a given strand, a threshold T₁ was selected for the first replicate and a threshold T₂ for the second replicate. Values between 1 and 1000 were considered for T₁ and T₂. For each combination of T₁ and T₂, the number of genome positions with values ≥T₁ in the first replicate and ≥T₂ in the second replicate was determined. The false discovery rate was estimated under the null hypothesis that no regions are enriched. The combination of thresholds yielding the highest number of true positive positions, with an estimated false discovery rate of less than 0.01, was selected.
Once T₁ and T₂ were chosen, peak calling was performed as previously described (Supplementary Material of [54]). Briefly, a region was identified as a peak if both replicates showed enrichment above the corresponding thresholds on each strand. In addition, a peak was called only if a peak on the plus strand lay within a threshold distance of a peak on the minus strand, as previously described (Supplementary Material of [54]).
To identify regions of artifactual enrichment, peaks identified in tagged strains were compared to those called in a control ChIP-seq experiment using an untagged strain (DMF35). For each factor, the calculated T values were adjusted to reflect the total number of reads in the control experiment replicates and then applied for peak calling in the controls.
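The normalization and threshold search described above can be illustrated with a minimal Python sketch. It is not the authors' script. The function names, the representation of each replicate as a list of per-position read depths for one strand, and in particular the way the false discovery rate is estimated are assumptions made for illustration: the text states only that the null hypothesis is "no regions are enriched", and the sketch interprets this as the two replicates being independent, so that the expected number of positions passing both thresholds by chance is n × p₁ × p₂.

```python
from bisect import bisect_left


def normalize_coverage(coverage, total_reads, target_reads=100_000_000):
    """Scale per-position read depths to a library of `target_reads` reads."""
    scale = target_reads / total_reads
    return [depth * scale for depth in coverage]


def count_at_least(sorted_values, threshold):
    """Number of entries >= threshold in an ascending sorted list."""
    return len(sorted_values) - bisect_left(sorted_values, threshold)


def select_thresholds(rep1, rep2, candidates=range(1, 1001), max_fdr=0.01):
    """Search (T1, T2) pairs for one strand of a replicate pair.

    ASSUMPTION (not stated in the source): under the null hypothesis the
    replicates are independent, so the expected number of positions passing
    both thresholds is n * (n1 / n) * (n2 / n) = n1 * n2 / n.
    `candidates` must be in ascending order.
    """
    n = len(rep1)
    sorted1 = sorted(rep1)
    sorted2 = sorted(rep2)
    best = None
    for t1 in candidates:
        n1 = count_at_least(sorted1, t1)
        if n1 == 0:
            break  # no position can pass a higher T1 either
        # Replicate-2 depths at the positions that pass T1 in replicate 1.
        passing2 = sorted(v2 for v1, v2 in zip(rep1, rep2) if v1 >= t1)
        for t2 in candidates:
            observed = count_at_least(passing2, t2)
            if observed == 0:
                break  # no position can pass a higher T2 either
            expected = n1 * count_at_least(sorted2, t2) / n
            fdr = expected / observed
            true_positives = observed - expected
            if fdr < max_fdr and (
                best is None or true_positives > best["true_positives"]
            ):
                best = {
                    "t1": t1,
                    "t2": t2,
                    "observed": observed,
                    "fdr": fdr,
                    "true_positives": true_positives,
                }
    return best  # None if no pair meets the FDR cutoff
```

In this sketch the search would be run once for the plus strand and once for the minus strand of each replicate pair. When several threshold pairs tie on the estimated number of true positives, the first (lowest) pair encountered is kept; the source does not say how ties were resolved.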
Any region in which a peak called in the tagged-strain ChIP-seq experiment lay within 50 bp of a peak called in the untagged control experiment was considered a potential artifact and excluded from further analysis.
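The control comparison can be sketched in a few lines of Python. This is an illustration only, under the following assumptions that are not stated in the source: each peak is represented by a single integer genome coordinate, "within 50 bp" is inclusive of exactly 50 bp, and the function name `remove_control_artifacts` is hypothetical.

```python
from bisect import bisect_left


def remove_control_artifacts(chip_peaks, control_peaks, max_distance=50):
    """Drop ChIP peaks lying within `max_distance` bp of any control peak.

    ASSUMPTION: peaks are single integer coordinates and the distance
    cutoff is inclusive.
    """
    control = sorted(control_peaks)
    kept = []
    for position in chip_peaks:
        # Index of the first control peak at or beyond position - max_distance.
        i = bisect_left(control, position - max_distance)
        if i < len(control) and control[i] <= position + max_distance:
            continue  # a control peak falls inside the window: likely artifact
        kept.append(position)
    return kept
```

For example, with ChIP peaks at 100, 500 and 1000 and control peaks at 140 and 1051, the peak at 100 is discarded (40 bp from a control peak), while 500 and 1000 are retained (the nearest control peak to 1000 is 51 bp away).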

Fitzgerald D.M., Bonocora R.P., & Wade J.T. (2014). Comprehensive Mapping of the Escherichia coli Flagellar Regulatory Network. PLoS Genetics, 10(10), e1004649.

Publication 2014

Corresponding Organization : New York State Department of Health

Protocol cited in 9 other protocols

Variable analysis

independent variables

Presence or absence of tagged protein (in the ChIP-seq experiment)

dependent variables

Enrichment of genomic regions above specified thresholds (T₁ and T₂) in the ChIP-seq experiment

control variables

Genome sequence and annotation (MG1655 genome, NC_000913.2)
Mapping of sequencing reads to the reference genome using CLC Genomics Workbench
Normalization of all ChIP-seq datasets to 100 million reads
Considering plus and minus strands separately for peak calling
Adjusting thresholds (T₁ and T₂) to maintain a false discovery rate (FDR) of less than 0.01
Comparing peaks identified in tagged strains to those in an untagged control strain (DMF35) to exclude potential artifacts

positive controls

Not explicitly mentioned

negative controls

ChIP-seq experiment using an untagged strain (DMF35)
