The Dashboard software defines a mapping from each subsystem (plot) to one or more pathways and/or GO terms. When the Dashboard displays each plot, it dynamically retrieves gene or metabolite lists for each plot from the PGDB for the current organism. For example, it issues PGDB queries to determine what genes (if any) exist in the current organism for the pathway(s) or GO term(s) associated with each plot. More specifically, Dashboard-panel gene groups are obtained from pathways in the PGDB via a Pathway Tools built-in query that returns all genes coding for enzymes catalyzing reactions within a specified metabolic pathway. Similarly, Pathway Tools provides a built-in query for obtaining all genes annotated to a given GO term. When displaying the window of regulators, the Dashboard issues a built-in Pathway Tools query for obtaining a list of all transcriptional regulators of a given gene.
PGDBs within the BioCyc collection are highly variable in terms of the completeness of their GO term annotations and regulatory interactions, but the Dashboard is best suited for use with PGDBs with significant numbers of GO terms and regulatory interactions. Table
Given a set of genes (the user can specify all genes, or a set of genes whose changes are computed to be statistically significant), the Dashboard computes an enrichment p-value for every subsystem using a Lisp implementation of Grossmann’s parent–child-union analysis, a variation of the Fisher exact test in which the enrichment of a given subsystem is determined relative to its parent subsystem rather than to the entire population (10). An optional multiple-hypothesis correction may be applied; the options are the Bonferroni, Benjamini-Hochberg and Benjamini-Yekutieli corrections, with no correction being the default. The enrichment p-value is then converted to an enrichment score, −log(P-value).
The Dashboard is intended for experimental designs such as time-course experiments, dose-response experiments, and experiments that vary growth conditions. It performs well with up to 20 columns of data, although the display becomes cramped; that effect is lessened on larger monitors.
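The enrichment calculation described above can be illustrated with a minimal Python sketch. It is not the Dashboard's Lisp implementation. It assumes a one-sided hypergeometric (Fisher exact) tail computed against the union of genes in the parent subsystems, and a base-10 logarithm for the score (the text does not state the base). All function and variable names are hypothetical, and only the Bonferroni and Benjamini-Hochberg corrections are shown.

```python
from math import comb, log10


def parent_child_p(study_genes, subsystem_genes, parent_genes):
    """One-sided hypergeometric p-value for a subsystem relative to its parent.

    parent_genes is the union of genes in the parent subsystem(s); it replaces
    the whole-genome population used by an ordinary Fisher exact test.
    """
    population = set(parent_genes)
    subsystem = set(subsystem_genes) & population
    study = set(study_genes) & population
    N, m, n = len(population), len(subsystem), len(study)
    x = len(study & subsystem)
    # P(X >= x) when drawing n genes from N, of which m belong to the subsystem.
    tail = sum(comb(m, k) * comb(N - m, n - k) for k in range(x, min(m, n) + 1))
    return tail / comb(N, n)


def bonferroni(pvals):
    """Bonferroni correction: multiply by the number of tests, cap at 1."""
    return [min(1.0, p * len(pvals)) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def enrichment_score(p_value):
    """Enrichment score as -log10(p); base 10 is an assumption of this sketch."""
    return -log10(p_value)


parent = [f"g{i}" for i in range(10)]
subsystem = ["g0", "g1", "g2", "g3"]
study = ["g0", "g1", "g2", "g9"]
p = parent_child_p(study, subsystem, parent)
print(p, enrichment_score(p))
```

Restricting the population to the parent's genes means a subsystem scores as enriched only if it stands out within its parent, which is the property the text attributes to the parent–child-union analysis.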
This paper uses the analysis of two datasets to illustrate the application of the Dashboard toolset: a genome-wide transcriptome analysis of Thalassiosira pseudonana, and an E. coli gene-expression analysis of a 10 min time course following a shift from anaerobic to aerobic growth conditions.
Mock et al. performed a genome-wide transcriptome analysis on T. pseudonana strain CCMP 1335 under five different environmental conditions: low nitrate (low N), low silicic acid (low Si), low iron (low Fe), low temperature (4°C) and high pH (9.4), with nutrient-replete cultures serving as reference conditions (11). Cultures were maintained in natural seawater that had been autoclaved and supplemented with 2 × f/2 nutrients minus one of the limiting nutrients (Si, Fe or N) at 20°C and 100 μmol of photons m−2 s−1. The f/2 medium provides the major nutrients, including N, Si and P, as well as trace metals and vitamins (12). The alkaline pH condition was obtained by increasing the pH of 2 × f/2 seawater to 9.4 by adding 1 M NaOH. Temperature limitation was achieved by transferring a culture maintained in nutrient-replete 2 × f/2 seawater at 20°C to 4°C for 24 h (11). All limitation experiments were conducted in parallel with nutrient-replete cultures. Cells were harvested for RNA when the growth rate began to decrease significantly relative to the control cultures. Differentially expressed genes are those that have a Bayesian t-test P-value ≤ 0.05 and a ≥2-fold difference in mRNA levels with respect to the control samples. Data are available under GEO accession GSE9697.
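The two differential-expression criteria stated above (P-value ≤ 0.05 and a ≥2-fold difference relative to the control) can be written as a simple filter. The following is a minimal Python sketch, not the code used by Mock et al. It assumes that the P-values have already been computed, that expression values are on a linear (not log) scale, and that the fold difference counts in either direction. The function name and the example values are hypothetical.

```python
def is_differentially_expressed(p_value, treated, control,
                                p_max=0.05, min_fold=2.0):
    """Apply the P-value and fold-difference thresholds to one gene."""
    if treated <= 0 or control <= 0:
        return False  # assumption: the ratio is undefined, so the gene is skipped
    ratio = treated / control
    fold = max(ratio, 1.0 / ratio)  # up- or down-regulation both count
    return p_value <= p_max and fold >= min_fold


# Hypothetical values: gene -> (P-value, treated level, control level)
genes = {
    "geneA": (0.01, 200.0, 100.0),  # 2-fold up, significant
    "geneB": (0.01, 50.0, 100.0),   # 2-fold down, significant
    "geneC": (0.20, 400.0, 100.0),  # large change, not significant
    "geneD": (0.01, 150.0, 100.0),  # significant, change too small
}
selected = [g for g, v in genes.items() if is_differentially_expressed(*v)]
print(selected)
```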
Methods from von Wulffen et al. (13): Escherichia coli K–12 strain W3110 was used in this study. Cells were grown anaerobically in defined medium at pH 7 and 37°C in a stirred 3-l bioreactor until the culture reached an OD (600 nm) of 3. At that point, the first three replicate samples were drawn, and aeration was then started at 1 l/min. At 0.5, 1, 2, 5 and 10 min after the onset of aeration, additional samples were drawn from the three replicates.
Analysis of von Wulffen et al. data performed for this publication: raw gene counts were obtained from the GEO database (accession GSE71562). Replicates were averaged and then normalized using the TPM (transcripts per million) approach (14). Genes that had zero counts in more than 15% of the samples were removed from further analysis; in addition, two genes (ssrA and rnpB) with high expression values that compressed the scales of two panels were removed; see