
Reconstructing metabolic pathways
- Bioinformatics in microbial biotechnology – a mini review

Identification of gene function has opened a new level of bioinformatics research: automated reconstruction and comparison of the pathways of newly sequenced organisms [12,14,18,38,49,50,58,59,65]. Many efforts and approaches have addressed pathway reconstruction. The three major approaches can be classified as: (i) a global network of reactions catalyzed by enzymes, (ii) a network of gene-groups connected through the reactions catalyzed by the enzymes embedded in the gene-groups, and (iii) global modeling of chemical reactions in microbial cells.

The first approach [49] uses the knowledge of known biochemical pathways and enzymes [9,33], identifies the enzyme function of genes in a newly sequenced genome using a BLAST-based search or pair-wise genome comparison of evolutionarily close genomes [65], and matches the products and substrates of the chemical reactions catalyzed by the enzymes to build the network of reactions [18]. This approach is quite powerful, but it has several drawbacks: (i) it cannot disambiguate the exact position in pathways of homologous genes, (ii) it does not take into account genes occurring in the same pathway due to gene-grouping and co-transcription, and (iii) it does not take into account the reaction rate.
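The product/substrate matching step can be illustrated with a minimal Python sketch. The enzyme labels, metabolite names, and the function `build_reaction_network` are hypothetical and chosen only for illustration; a real reconstruction would take its reactions from enzyme and pathway databases such as those cited above.

```python
from collections import defaultdict

# Hypothetical toy data: enzyme label -> (substrates, products).
REACTIONS = {
    "E1": ({"A"}, {"B"}),
    "E2": ({"B"}, {"C", "D"}),
    "E3": ({"D"}, {"E"}),
    "E4": ({"X"}, {"Y"}),
}

def build_reaction_network(reactions):
    """Add a directed edge e1 -> e2 when a product of e1 is a substrate of e2."""
    # Index every metabolite by the enzymes that consume it.
    consumers = defaultdict(set)
    for enzyme, (substrates, _) in reactions.items():
        for metabolite in substrates:
            consumers[metabolite].add(enzyme)

    # For each enzyme, follow its products to the enzymes that use them.
    network = {enzyme: set() for enzyme in reactions}
    for enzyme, (_, products) in reactions.items():
        for metabolite in products:
            network[enzyme] |= consumers[metabolite] - {enzyme}
    return network

network = build_reaction_network(REACTIONS)
for enzyme in sorted(network):
    print(enzyme, "->", sorted(network[enzyme]))
```

In this toy data E1, E2 and E3 form a chain while E4 stays isolated. The sketch also shows the first drawback listed above: two homologous enzymes with the same substrates and products would receive identical edges, so matching alone cannot place them.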

The knowledge of gene-groups [11,26] has been used to develop an integrated approach for reconstructing metabolic pathways [12,14,65]. This approach has four steps: (i) identifying the enzymes and their functions in a newly sequenced genome using ortholog analysis, (ii) identifying the co-transcribed gene-groups (groups of genes sharing a common promoter) by analyzing the promoter regions of the genes, (iii) deriving the gene-groups by pair-wise comparison of the newly sequenced genome with multiple genomes, and (iv) using biochemical knowledge of existing pathways and enzymes [9,33] to connect the network of gene-groups. The intergenic distance (the distance between the stop codon of the preceding gene and the start codon of the following gene) in co-transcribed gene-groups (possible operons) is generally less than 75 nucleotides, except for the leading gene. Most of these possible co-transcribed gene-groups are identified by computationally comparing intergenic distances. However, the knowledge of co-transcribed gene-groups is in itself insufficient to identify pathways, since (i) co-transcribed gene-groups may have missing genes due to a conservative estimate of the cutoff threshold, (ii) multiple adjacent co-transcribed gene-groups in the same pathway may be separated by gene insertion/deletion caused by genome restructuring, and (iii) some of the genes that regulate pathways and lie in close proximity are not picked up. These three problems are reduced by taking the union of genes in the same gene-group derived from multiple pair-wise genome comparisons with the newly sequenced genome. The overall gene-groups are identified by merging the information derived from promoter-based analysis and pair-wise genome comparison analysis [14].
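The distance-based grouping and the union step can be sketched in Python as follows. This is a simplified illustration, not the cited method: the gene coordinates are invented, the function names are hypothetical, genes are assumed to be given as (name, start, end, strand) with 1-based inclusive coordinates, and the exception for the leading gene is not modeled. Only the 75-nucleotide cutoff comes from the text.

```python
def group_by_intergenic_distance(genes, max_gap=75):
    """Chain neighboring same-strand genes whose intergenic gap is below max_gap."""
    ordered = sorted(genes, key=lambda g: g[1])  # sort by start coordinate
    groups = []
    for gene in ordered:
        _, start, _, strand = gene
        if groups:
            _, _, prev_end, prev_strand = groups[-1][-1]
            gap = start - prev_end - 1  # nucleotides between the two genes
            if strand == prev_strand and gap < max_gap:
                groups[-1].append(gene)
                continue
        groups.append([gene])
    return [[g[0] for g in group] for group in groups]

def merge_gene_groups(group_lists):
    """Take the union of gene-groups that share at least one gene."""
    merged = []
    for group in (set(g) for groups in group_lists for g in groups):
        for other in [m for m in merged if m & group]:
            group |= other
            merged.remove(other)
        merged.append(group)
    return merged

# Hypothetical genes: (name, start, end, strand).
GENES = [
    ("geneA", 100, 400, "+"),
    ("geneB", 420, 900, "+"),
    ("geneC", 950, 1500, "+"),
    ("geneD", 1800, 2400, "+"),
    ("geneE", 2410, 2900, "-"),
    ("geneF", 2950, 3300, "-"),
]

distance_groups = group_by_intergenic_distance(GENES)
# Hypothetical groups from a pair-wise genome comparison.
comparison_groups = [["geneC", "geneD"], ["geneE", "geneF"]]
merged = merge_gene_groups([distance_groups, comparison_groups])
print(distance_groups)
print(sorted(sorted(group) for group in merged))
```

Here the distance cutoff alone separates geneD from geneA, geneB and geneC, and the union with the comparison-derived group rejoins them, which is the kind of repair the merging step is meant to provide.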
Since the gene-groups in a pathway are scattered across the genome, the gene-groups are networked to each other by matching the biochemical products and substrates of the reactions catalyzed by the enzymes embedded in the gene-groups, using enzyme databases [9,33]. This scheme improves computational efficiency, reduces the ambiguity of homologous genes, and includes many regulatory genes involved with a pathway. However, it does not model cell-level behavior, as the notion of reaction rate is missing.
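A minimal sketch of linking at the gene-group level, assuming each gene-group has already been annotated with the enzymes it encodes. The group labels, enzyme labels, metabolites, and the function `link_gene_groups` are all hypothetical.

```python
# Hypothetical gene-groups: group label -> enzymes encoded by genes in the group.
GROUP_ENZYMES = {
    "G1": {"E1", "E2"},
    "G2": {"E3"},
    "G3": {"E4"},
}

# Hypothetical enzyme reactions: enzyme -> (substrates, products).
ENZYME_REACTIONS = {
    "E1": ({"A"}, {"B"}),
    "E2": ({"B"}, {"C"}),
    "E3": ({"C"}, {"D"}),
    "E4": ({"X"}, {"Y"}),
}

def link_gene_groups(group_enzymes, enzyme_reactions):
    """Link group g to group h when a product made in g is a substrate used in h."""
    substrates, products = {}, {}
    for group, enzymes in group_enzymes.items():
        # Pool the substrates and products of every enzyme in the group.
        substrates[group] = set().union(*(enzyme_reactions[e][0] for e in enzymes))
        products[group] = set().union(*(enzyme_reactions[e][1] for e in enzymes))
    return {
        (g, h)
        for g in group_enzymes
        for h in group_enzymes
        if g != h and products[g] & substrates[h]
    }

links = link_gene_groups(GROUP_ENZYMES, ENZYME_REACTIONS)
print(sorted(links))
```

Because matching is done between a few groups rather than between all individual enzymes, fewer pairs need to be compared, which is one way to read the efficiency gain mentioned above.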

The third approach [50,58] is based on modeling the biochemical reactions globally, involving products, byproducts and the effect of cofactors on the reaction rate [59]. The model represents the network of metabolic reactions as a set of reaction vectors called extreme pathways, which correspond to the steady-state flux distributions in a metabolic network needed to synthesize target products. In this technique the whole network of pathways is modeled as a matrix in which the rows are extreme pathways and the columns represent specific reactions. This technique is useful for modeling the overall metabolic behavior within a microbial cell.
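The steady-state condition behind this representation can be illustrated with a small Python sketch. It does not compute extreme pathways; it only checks that hand-picked pathway vectors for an invented five-reaction network leave every internal metabolite balanced. The network, the stoichiometric matrix `STOICH`, and the function names are assumptions made for illustration.

```python
# Hypothetical network with internal metabolites A, B, C and reactions:
#   R1: -> A    R2: A -> B    R3: B ->    R4: A -> C    R5: C ->
# Rows of STOICH are metabolites, columns are reactions R1..R5.
STOICH = [
    [1, -1, 0, -1, 0],  # A
    [0, 1, -1, 0, 0],   # B
    [0, 0, 0, 1, -1],   # C
]

# Pathway matrix laid out as in the text: rows are pathways, columns are reactions.
PATHWAYS = [
    [1, 1, 1, 0, 0],  # uptake of A, conversion to B, export of B
    [1, 0, 0, 1, 1],  # uptake of A, conversion to C, export of C
]

def net_production(stoich, flux):
    """Net production rate of each metabolite for a given flux vector."""
    return [sum(c * v for c, v in zip(row, flux)) for row in stoich]

def is_steady_state(stoich, flux):
    """A flux vector is at steady state when no internal metabolite accumulates."""
    return all(rate == 0 for rate in net_production(stoich, flux))

def combine(pathways, weights):
    """Weighted sum of pathway vectors, giving one flux value per reaction."""
    n_reactions = len(pathways[0])
    return [sum(w * p[j] for w, p in zip(weights, pathways)) for j in range(n_reactions)]

flux = combine(PATHWAYS, [2, 3])
print(flux, is_steady_state(STOICH, flux))
```

Each row of `PATHWAYS` satisfies the balance on its own, and so does any non-negative combination of the rows, which is why a small set of such vectors can describe the steady-state behavior of the whole network.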

Current metabolic pathway techniques are limited by the gene functions available from wet laboratories. Another issue is that identifying metabolic pathways is not sufficient unless the reaction rates, and the effect of stress response on those rates, are known. While there have been recent approaches to model the reaction rates of metabolic pathways [59], the complete picture cannot be verified, largely due to the unavailability of gene functions from wet labs.


updated on: 31 Oct 2006
