Table 1 illustrates the wide range of operations that BEDTools supports. Many of the tools have extensive parameters that allow user-defined overlap criteria and fine control over how results are reported. Importantly, we have also defined a concise format (BEDPE) to facilitate comparisons of discontinuous features (e.g. paired-end sequence reads) to each other (pairToPair), and to genomic features in traditional BED format (pairToBed). This functionality is crucial for interpreting genomic rearrangements detected by paired-end mapping, and for identifying fusion genes or alternative splicing patterns by RNA-seq. To facilitate comparisons with data produced by current DNA sequencing technologies, intersectBed and pairToBed compute overlaps between sequence alignments in BAM format (Li et al., 2009), and a general-purpose tool (bamToBed) is provided to convert BAM alignments to BED format, thus facilitating the use of BAM alignments with all other BEDTools (Table 1). The following examples illustrate the use of intersectBed to isolate single nucleotide polymorphisms (SNPs) that overlap with genes; pairToBed to create a BAM file containing only those alignments that overlap with exons; and intersectBed coupled with samtools to create a SAM file of alignments that do not intersect (-v) with repeats.
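As a rough illustration of the overlap logic described above, the following Python sketch mimics, on in-memory tuples, what an intersectBed-style comparison and a pairToBed-style comparison compute. It is not BEDTools code: the function names (`overlap_len`, `intersect`, `pair_to_bed`), the `min_frac` and `invert` parameters and the toy coordinates are all hypothetical. Intervals are assumed to be 0-based and half-open as in BED, and a BEDPE pair is assumed to match when either of its ends overlaps a feature.

```python
from typing import List, Tuple

# Hypothetical in-memory stand-ins for BED and BEDPE records.
Interval = Tuple[str, int, int]            # (chrom, start, end), 0-based half-open
Pair = Tuple[str, int, int, str, int, int]  # two ends of one discontinuous feature


def overlap_len(a: Interval, b: Interval) -> int:
    """Number of bases shared by a and b (0 if on different chromosomes)."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def intersect(a_feats: List[Interval], b_feats: List[Interval],
              min_frac: float = 0.0, invert: bool = False) -> List[Interval]:
    """Keep A features overlapped by at least one B feature.

    min_frac: fraction of the A feature that a single B feature must cover
              (a simple user-defined overlap criterion).
    invert:   keep the A features WITHOUT a qualifying overlap instead,
              analogous to the -v option mentioned in the text.
    """
    kept = []
    for a in a_feats:
        needed = min_frac * (a[2] - a[1])
        hit = False
        for b in b_feats:
            shared = overlap_len(a, b)
            if shared > 0 and shared >= needed:
                hit = True
                break
        if hit != invert:
            kept.append(a)
    return kept


def pair_to_bed(pairs: List[Pair], feats: List[Interval]) -> List[Pair]:
    """Keep pairs for which either end overlaps a feature (assumed semantics)."""
    kept = []
    for p in pairs:
        end1, end2 = (p[0], p[1], p[2]), (p[3], p[4], p[5])
        if any(overlap_len(end1, f) > 0 or overlap_len(end2, f) > 0 for f in feats):
            kept.append(p)
    return kept


# Toy data: three SNPs, two genes, two read pairs, one exon.
snps = [("chr1", 100, 101), ("chr1", 500, 501), ("chr2", 100, 101)]
genes = [("chr1", 50, 200), ("chr2", 300, 400)]
pairs = [("chr1", 10, 60, "chr1", 400, 450), ("chr1", 600, 650, "chr3", 5, 55)]
exons = [("chr1", 420, 480)]

print(intersect(snps, genes))               # SNPs that fall inside a gene
print(intersect(snps, genes, invert=True))  # SNPs that fall outside every gene
print(pair_to_bed(pairs, exons))            # pairs with an end touching an exon
```

Here `invert=True` plays the role of the -v option, and `min_frac` stands in for a user-defined overlap criterion. The nested loops are quadratic, so real files would call for a sorted or indexed search rather than this scan.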

Table 1. Summary of supported operations available in the BEDTools suite

Utility             Description
intersectBed*       Returns overlaps between two BED files.
pairToBed           Returns overlaps between a BEDPE file and a BED file.
bamToBed            Converts BAM alignments to BED or BEDPE format.
pairToPair          Returns overlaps between two BEDPE files.
windowBed           Returns overlaps between two BED files within a user-defined window.
closestBed          Returns the closest feature to each entry in a BED file.
subtractBed*        Removes the portion of an interval that is overlapped by another feature.
mergeBed*           Merges overlapping features into a single feature.
coverageBed*        Summarizes the depth and breadth of coverage of features in one BED file relative to another.
genomeCoverageBed   Histogram or a ‘per base’ report of genome coverage.
fastaFromBed        Creates FASTA sequences from BED intervals.
maskFastaFromBed    Masks a FASTA file based upon BED coordinates.
shuffleBed          Permutes the locations of features within a genome.
slopBed             Adjusts features by a requested number of base pairs.
sortBed             Sorts BED files in useful ways.
linksBed            Creates HTML links from a BED file.
complementBed*      Returns intervals not spanned by features in a BED file.

intersectBed, pairToBed and bamToBed support sequence alignments in BAM format. Utilities marked with an asterisk were compared with Galaxy and found to yield identical results.

Other notable tools include coverageBed, which calculates the depth and breadth of genomic coverage of one feature set (e.g. mapped sequence reads) relative to another; shuffleBed, which permutes the genomic positions of BED features to allow calculations of statistical enrichment; mergeBed, which combines overlapping features; and utilities that search for nearby yet non-overlapping features (closestBed and windowBed). BEDTools also includes utilities for extracting and masking FASTA sequences (Pearson and Lipman, 1988) based upon BED intervals. Tools with similar functionality to those provided by Galaxy were directly compared for correctness using the ‘knownGene’ and ‘RepeatMasker’ tracks from the hg19 build of the human genome. The results from all analogous tools were found to be identical (Table 1).
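The merging and coverage operations mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the BEDTools implementation: `merge` and `coverage` are hypothetical helper names, intervals are 0-based half-open tuples, touching (book-ended) intervals are merged along with overlapping ones as a choice of this sketch, "depth" is taken to be the number of reads overlapping the target, and "breadth" the fraction of target bases covered by at least one read.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), 0-based half-open


def merge(feats: List[Interval]) -> List[Interval]:
    """Combine overlapping (or touching) features on the same chromosome."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(feats):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous interval instead of starting a new one.
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def coverage(reads: List[Interval], target: Interval) -> Tuple[int, int, float]:
    """Return (reads overlapping target, covered bases, covered fraction)."""
    chrom, t_start, t_end = target
    hits = [r for r in reads if r[0] == chrom and r[1] < t_end and r[2] > t_start]
    # Clip each read to the target, then merge so shared bases count once.
    clipped = [(chrom, max(s, t_start), min(e, t_end)) for _, s, e in hits]
    covered = sum(e - s for _, s, e in merge(clipped))
    length = t_end - t_start
    fraction = covered / length if length else 0.0
    return len(hits), covered, fraction


# Toy data with hypothetical coordinates.
features = [("chr1", 10, 20), ("chr1", 15, 30), ("chr1", 40, 50), ("chr2", 5, 8)]
reads = [("chr1", 90, 120), ("chr1", 110, 150), ("chr1", 180, 260), ("chr2", 100, 200)]
target = ("chr1", 100, 200)

print(merge(features))
print(coverage(reads, target))
```

Clipping reads to the target before merging keeps bases outside the target, and bases covered by several reads, from inflating the breadth figure.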