The eFP Browser is implemented in Python. It makes use of other Python modules, as described on the eFP Browser development homepage, and of the Python Imaging Library (PIL) Build 1.1.5 (www.python.org), which we modified to provide an optimized flood pixel replacement function called replaceFill. The inputs for the eFP Browser are illustrated in Figure 1. A pictographic representation of the sample collection as a Targa-based image is required, as is an XML control file, shown in detail in Figure 1B. Two other inputs are a database of gene identifiers with their corresponding microarray element lookups and annotations, and a database of gene expression values for the given samples. In the case of the Arabidopsis, Cell and Mouse eFP Browsers, we have mirrored publicly available microarray data from several sources (described in the Data Sources and subsequent two sections) in our Bio-Array Resource [10]. These inputs are used by the eFP Browser algorithm to generate an output image for a user's gene identifier.
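The modified PIL function replaceFill is not reproduced in this text. As a rough illustration of what flood pixel replacement means in general, the following minimal sketch recolours the connected region around a seed pixel in a plain 2D list of RGB tuples. The function name flood_replace, the list-based image representation and the seed-based, 4-connected behaviour are all assumptions made for illustration; the actual replaceFill may work differently (for example, by replacing a colour throughout the whole image).

```python
from collections import deque


def flood_replace(pixels, seed, new_colour):
    """Recolour, in place, the 4-connected region of identical colour
    that contains `seed` (a (row, column) pair).

    Hypothetical helper for illustration only; not the eFP Browser's
    replaceFill. Returns the number of pixels that were changed.
    """
    rows, cols = len(pixels), len(pixels[0])
    r0, c0 = seed
    old_colour = pixels[r0][c0]
    if old_colour == new_colour:
        return 0  # nothing to do, and avoids an endless loop

    pixels[r0][c0] = new_colour
    changed = 1
    queue = deque([(r0, c0)])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours of the current pixel.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and pixels[nr][nc] == old_colour:
                pixels[nr][nc] = new_colour
                changed += 1
                queue.append((nr, nc))
    return changed
```

Marking a pixel as recoloured at the moment it is queued, rather than when it is dequeued, keeps each pixel from being queued more than once.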
The eFP Browser algorithm itself is programmed in an object-oriented manner. The main program, efpWeb.cgi, is responsible for the creation of the HTML code for the user interface and presentation of the output image. It calls on four modules to complete the task. These modules are 1) efp.py, which performs most of the functions for the generation of the output image, including the parsing of the XML control file, average and standard deviation calculations, calculations of fold-change relative to the control value, and generation of the image map HTML code; 2) efpDb.py, which connects to the gene expression, microarray element and annotation databases, and returns the appropriate values upon being called; 3) efpImg.py, which formulates the actual colour replace calls on the Targa input image; and 4) efpXML.py, which identifies the XML control files that are present in the eFP Browser's data directory. These control files are listed for the user in the Data Source drop-down, which removes the need to hard-code them in the main efpWeb.cgi program.
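The calculations attributed to efp.py above (average, standard deviation and fold-change relative to a control value) can be sketched as follows. This is a minimal illustration, not the code in efp.py: the function names are hypothetical, and the use of the sample standard deviation and of a log2-transformed ratio are assumptions that the text does not specify.

```python
import math
import statistics


def summarise(replicates):
    """Return (mean, standard deviation) for a list of replicate values.

    Assumption: sample standard deviation; a single replicate gets 0.0.
    """
    mean = statistics.mean(replicates)
    sd = statistics.stdev(replicates) if len(replicates) > 1 else 0.0
    return mean, sd


def log2_fold_change(sample_mean, control_mean):
    """Return the log2 ratio of a sample mean to its control mean.

    Assumption: both means are positive expression values.
    """
    if sample_mean <= 0 or control_mean <= 0:
        raise ValueError("expression means must be positive")
    return math.log2(sample_mean / control_mean)
```

For example, replicate values of 100, 200 and 300 give a mean of 200, and a sample mean of 200 against a control mean of 50 corresponds to a four-fold increase, that is, a log2 fold-change of 2.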
In the case of the Cell eFP Browser, data in the SUBA database indicate the presence of a given protein in a particular subcellular location, either based on computational methods or as molecularly documented by mass spectrometric analysis of subcellular fractions, GFP fusions etc. [11]. We have used a simple heuristic to turn these data into a confidence score for a given gene product's presence in a given subcellular compartment:
\[ \text{confidence score} = \sum_{m=1}^{5} s \, D_m + \sum_{p=1}^{10} s' \, D_p \]
where
m = molecular method index of 5 possible methods
p = prediction algorithm index of 10 possible algorithms
s = weighting for molecular method = 1
s′ = weighting for prediction algorithm = 0.2
D = presence in the subcellular compartment for a given method or algorithm (1 or 0).
The maximum confidence score for a given compartment is 7 (5 × 1 + 10 × 0.2), which is reached when all molecular methods and prediction algorithms call a given gene product present in that compartment. While we have arbitrarily weighted a prediction algorithm call for a particular subcellular compartment at one fifth of a molecular method call, it would also be possible to incorporate the quality scores for each prediction algorithm instead.
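As a worked illustration of this heuristic, the following minimal sketch computes the score from two lists of presence calls (1 or 0), using the weightings given above. The function and constant names are hypothetical and the input format is an assumption; how the calls are actually retrieved from SUBA is not shown.

```python
MOLECULAR_WEIGHT = 1.0   # s, weighting for a molecular method
PREDICTION_WEIGHT = 0.2  # s', weighting for a prediction algorithm


def confidence_score(molecular_calls, prediction_calls):
    """Return the weighted sum of presence calls for one compartment.

    molecular_calls:  presence calls (1 or 0), one per molecular method
    prediction_calls: presence calls (1 or 0), one per prediction algorithm
    """
    for call in list(molecular_calls) + list(prediction_calls):
        if call not in (0, 1):
            raise ValueError("presence calls must be 0 or 1")
    return (MOLECULAR_WEIGHT * sum(molecular_calls)
            + PREDICTION_WEIGHT * sum(prediction_calls))
```

With 5 molecular methods and 10 prediction algorithms all calling the gene product present, the score reaches its maximum of 7; a single molecular call plus two prediction calls gives 1 + 0.4 = 1.4.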