A major challenge in microarray experiments, and in genome-scale experiments in general, is to provide a functional explanation that links the results found at the molecular level to the macroscopic observation or to the hypothesis that motivated the experiment. This is commonly achieved by means of ‘functional enrichment’ analysis, which first selects genes of interest based on the experimental values (e.g. genes differentially expressed between patients and healthy controls) and then tests whether particular functional terms, such as Gene Ontology (GO) annotations, are enriched among them (Al-Shahrour et al., 2004; Khatri and Draghici, 2005). Conceptually newer approaches avoid the initial gene-selection step, in which much information is lost because the functional interactions between genes are ignored (Dopazo, 2006), and focus directly on functionally related blocks of genes. Thus, functional profiling methods such as GSEA (Mootha et al., 2003) or FatiScan (Al-Shahrour et al., 2005) report blocks of genes that belong to a given functional category (GO, KEGG pathways, etc.) and show a significant, coordinated over- or under-expression when two classes of microarray experiments are compared. Genes can be grouped in many biologically or functionally meaningful ways, depending on the repository or information source used. To this end, information from GO, KEGG pathways, Swissprot keywords, chromosomal position, Interpro functional motifs, transcription factor binding sites, etc. has been used for the functional profiling of microarray experiments (Al-Shahrour et al., 2006).
Text-mining methods (Krallinger and Valencia, 2005) make it possible to extract functional aspects of genes beyond those covered by the ‘traditional’ repositories (GO, KEGG, etc.), and these can in turn be used for functional profiling. We present two tools that use functional terms (essentially chemical and clinical terms) obtained by text mining within a statistical framework that covers both types of test mentioned above: tests of functional enrichment in pre-selected sets of genes (the Marmite tool) and tests for blocks of functionally related genes (the MarmiteScan tool).
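To make the first kind of test concrete, the following is a minimal sketch of a functional enrichment test on a pre-selected gene set. It uses a one-sided hypergeometric (over-representation) test, which is a common choice for this kind of analysis; it is an illustrative assumption and not necessarily the statistic that Marmite applies. The function name `enrichment_p_value`, the gene identifiers and the term annotation are all hypothetical.

```python
from math import comb


def enrichment_p_value(selected, annotated, universe):
    """One-sided hypergeometric test for over-representation of a term.

    selected  -- genes picked from the experiment (e.g. differentially expressed)
    annotated -- genes associated with the functional term being tested
    universe  -- all genes that could have been selected
    Returns (overlap, p_value), where p_value = P(X >= overlap).
    """
    universe = set(universe)
    # Ignore any gene that is not part of the background universe.
    selected = set(selected) & universe
    annotated = set(annotated) & universe

    n_universe = len(universe)
    n_annotated = len(annotated)
    n_selected = len(selected)
    overlap = len(selected & annotated)

    # Sum the hypergeometric probabilities of seeing `overlap` or more
    # annotated genes in a random draw of `n_selected` genes.
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_annotated, i) * comb(n_universe - n_annotated, n_selected - i)
        for i in range(overlap, min(n_annotated, n_selected) + 1)
    )
    return overlap, tail / total


# Hypothetical toy data: 20 genes, 5 of them linked to one text-mined term.
universe = [f"g{i}" for i in range(20)]
annotated = ["g0", "g1", "g2", "g3", "g4"]
selected = ["g0", "g1", "g2", "g10"]

overlap, p_value = enrichment_p_value(selected, annotated, universe)
print(overlap, round(p_value, 4))
```

In practice one such test is run per functional term, so the resulting p-values would need a multiple-testing correction before any term is reported as enriched.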