A set of command-line utilities, summarized in Table 1, is provided to create and analyze multiple genome alignments in HAL. Importers are provided for UCSC’s MAF, a standard format with its own rich set of filters and converters (e.g. to FASTA) (Blanchette et al., 2004), and for Cactus (Paten et al., 2011), which has been designed specifically to output HAL. MAF files can be quickly produced from HAL graphs for given subgraphs with respect to arbitrary references, for compatibility with existing browsers and tools. The memory usage of each tool is configurable via its command-line options.
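As an illustration of exporting a subgraph to MAF with respect to a chosen reference, the following minimal sketch assembles a hal2maf invocation without running it. The file names and genome names are hypothetical, and the `--refGenome` and `--targetGenomes` options are assumed from the HAL documentation; they should be checked against `hal2maf --help` for the installed version.

```python
import shlex


def hal2maf_command(hal_path, maf_path, ref_genome=None, target_genomes=None):
    """Build an argument list for hal2maf; nothing is executed here."""
    argv = ["hal2maf", hal_path, maf_path]
    if ref_genome:
        # Assumed option: genome whose coordinates the MAF is written against.
        argv += ["--refGenome", ref_genome]
    if target_genomes:
        # Assumed option: comma-separated genomes restricting the subgraph.
        argv += ["--targetGenomes", ",".join(target_genomes)]
    return argv


# Hypothetical file and genome names, for illustration only.
cmd = hal2maf_command("alignment.hal", "human_ref.maf", "human", ["mouse", "rat"])
print(shlex.join(cmd))
```

Building the argument list separately from execution makes it easy to inspect or log the command before passing it to `subprocess.run`.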
Mutations can be identified along branches and output to tab-delimited annotation files using the halBranchMutations tool. A cycle decomposition of the breakpoint graph structure allows rearrangements, such as duplications, inversions and transpositions, to be reported in addition to substitutions, insertions and deletions. Small indels (below a user-provided size threshold) can be nested within larger rearrangements to avoid overcounting in these cases. Patterns of conservation within a target sequence can be aggregated using the halLiftover tool, which maps coordinates in a BED file to an arbitrary target in the alignment. This utility provides a general strategy to efficiently lift over and project any comparative genomics information into the coordinate system of any reference genome. Excellent software packages are available for sorting, combining and querying BED files (Quinlan and Hall, 2010; Neph et al., 2012) and can be combined with the aforementioned tools to create powerful analysis pipelines for multiple genome alignments.
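The idea of lifting BED coordinates through an alignment can be sketched with a toy model. The example below assumes the alignment between two genomes is given as ungapped, forward-strand blocks; the `Block` type and `lift_interval` function are hypothetical names introduced for illustration and do not reflect HAL's internal data structures or the halLiftover implementation.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    """Toy ungapped, forward-strand alignment block (hypothetical model)."""
    src_start: int  # 0-based start in the source genome
    tgt_start: int  # 0-based start in the target genome
    length: int     # number of aligned bases


def lift_interval(start: int, end: int, blocks: List[Block]) -> List[Tuple[int, int]]:
    """Map the half-open source interval [start, end) onto the target.

    Source bases not covered by any block are dropped, so one input
    interval may yield several target pieces (or none).
    """
    lifted = []
    for b in blocks:
        lo = max(start, b.src_start)
        hi = min(end, b.src_start + b.length)
        if lo < hi:  # the interval overlaps this block
            offset = b.tgt_start - b.src_start
            lifted.append((lo + offset, hi + offset))
    return lifted


# Source bases 0-50 align to target 100-150, and 60-100 to 200-240.
blocks = [Block(0, 100, 50), Block(60, 200, 40)]
pieces = lift_interval(40, 70, blocks)
print(pieces)  # [(140, 150), (200, 210)]
```

In this example, source positions 50 to 60 have no aligned counterpart, so the lifted annotation is split into two target intervals. A real liftover must additionally handle reverse-strand blocks, duplications (one source base mapping to several target positions) and paths through ancestral genomes.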
Table 1. HAL tools summary
Tool | Description |
---|---|
halStats | Print summary statistics of HAL file |
halSummarizeMutations | Print mutation summary for given subgraph |
halBranchMutations | Generate BED file(s) of mutations for a branch |
halLiftover | Map BED coordinates between genomes |
hal2maf/maf2hal | Convert to and from MAF |
cactus2hal | Convert from Cactus |