



Background
- Comparative genomics using Fugu reveals insights into regulatory subfunctionalization

Gene duplication is thought to be a major driving force in evolutionary innovation by providing material from which novel gene functions and expression patterns may arise. Duplicated genes have been shown to be present in all eukaryotic genomes currently sequenced [1] and are thought to arise by tandem, chromosomal or whole genome duplication events. Unless the duplication event is immediately advantageous (for example, by gene dosage increasing evolutionary fitness), the gene pair will exhibit functional redundancy, allowing one of the pair to accumulate mutations without affecting key functions. Because deleterious mutations are thought to occur much more commonly than neutral or advantageous ones, the classic model for the evolutionary fate of duplicated genes [2,3] predicts the degeneration of one of the copies to a pseudogene as the most likely outcome (a process known as non-functionalization). Less commonly, a mutation will be advantageous, allowing one of the gene duplicates to evolve a new function (a process known as neo-functionalization). Therefore, the classic model predicts that these two competing outcomes will result in the elimination of most duplicated genes. However, several studies suggest that the proportion of duplicated genes retained in vertebrate genomes is much higher than is predicted by this model [4-6]. This has led to the suggestion of an alternative model whereby complementary degenerative mutations in independent subfunctions of each gene copy permit their preservation in the genome, as both copies of the gene are now required to recapitulate the full range of functions present in the single ancestral gene. This was formalized in the Duplication-Degeneration-Complementation (DDC) model [7] in a process referred to as subfunctionalization.

The key novelty of the DDC model is that, rather than attributing different expression patterns of duplicated genes to the acquisition of novel functions, they are attributed to a partial (complementary) loss of function in each duplicate. In combination they retain the complete function of the pleiotropic original gene, but neither of them alone is sufficient to provide full functionality. For this model to be viable, the subfunctions of the gene are required to be independent so that mutations in one subfunction will not affect the other. The modular nature of many eukaryotic protein-coding sequences as well as cis-regulatory modules (CRMs), such as enhancers or silencers [8], means both can act as subfunctions or components of subfunctions of the gene in subfunctionalization. CRMs are cis-acting DNA sequences, up to several hundred bases in length, thought to be composed of clustered combinatorial binding sites for large numbers of transcription factors that together actuate a regulatory response for one or more genes [9]. The larger number of independently mutable units represented by CRMs, the small size and rapid turnover of transcription factor binding sites, and observations that, for many gene duplicates, changes between paralogs are due to changes in expression rather than protein function, have led a number of researchers to emphasize that important evolutionary changes might occur primarily at the level of gene regulation [10,11]. Consequently, subfunctionalization is thought most likely to occur by complementary degenerative mutations within regulatory elements.
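The complementation condition described above can be written down as a small set-based check. The following Python sketch is purely illustrative and is not taken from the DDC paper or from this study: the function name `classify_duplicate_fate`, the outcome labels, and the element names `e1` to `e3` are all hypothetical. It treats each gene copy as the set of independent ancestral subfunctions (for example, CRMs) that it still retains.

```python
def classify_duplicate_fate(ancestral, copy_a, copy_b):
    """Label a duplicate pair by the ancestral subfunctions each copy retains.

    All three arguments are collections of subfunction identifiers
    (hypothetical names). Only losses are modelled here, so gains of new
    functions (neo-functionalization) are outside the scope of this sketch.
    """
    ancestral, copy_a, copy_b = set(ancestral), set(copy_a), set(copy_b)

    # Some ancestral subfunction is missing from both copies.
    if (copy_a | copy_b) != ancestral:
        return "ancestral function lost"
    # One copy retains nothing: it has degenerated completely.
    if not copy_a or not copy_b:
        return "non-functionalization"
    # Both copies still carry every subfunction.
    if copy_a == ancestral and copy_b == ancestral:
        return "fully redundant"
    # One copy alone is still sufficient, so the other is dispensable.
    if copy_a == ancestral or copy_b == ancestral:
        return "one copy sufficient"
    # Neither copy alone is sufficient, but together they are complete.
    return "subfunctionalization"


ancestor = {"e1", "e2", "e3"}
print(classify_duplicate_fate(ancestor, {"e1", "e2"}, {"e2", "e3"}))
```

In this toy case each copy has lost a different element, so neither is sufficient alone and both must be preserved, which is the situation the DDC model calls subfunctionalization.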

Teleost fish provide an excellent system to study the DDC model in vertebrates due to the presence of extra gene duplicates that derive from a whole genome duplication event early in the evolution of ray-finned fishes 300-350 million years ago [12-17]. This provides the opportunity for comparative analyses of gene duplicates in fish against a single ortholog in tetrapod lineages such as mammals. In particular, for analyses involving important developmentally associated genes, these 'single copies' represent, as closely as possible, the ancestral gene from which the fish duplicates descended, since such genes are often highly conserved in sequence and function throughout vertebrates. We therefore refer to fish-specific duplicate genes as 'co-orthologs' (a term previously used in [18]) as each copy is co-orthologous to the single homolog in tetrapods.

A number of studies on fish duplicated genes have identified cases of subfunctionalization at both the regulatory and protein level. For instance, analysis of the synapsin-Timp genes in the pufferfish Fugu rubripes identified a case of protein subfunctionalization where two isoforms of the SYN gene expressed in human are expressed as two separate genes in Fugu [19]. A number of functional studies on the shared and divergent expression patterns of developmental co-orthologs in fish have also been carried out, for example, eng2 [20], sox9 [18] and runx2 [21]. In each case, partitioning of ancestral expression domains for each co-ortholog compared to the single (ancestral representative) gene in mammals was observed via gene expression studies, supporting a process of regulatory subfunctionalization along the lines of the DDC model. Work on identifying the regulatory elements involved has so far been limited to those responsible for divergent expression within the well-studied Hox genes. Santini et al. [22], through comparison to the single tetrapod Hox cluster, identified a number of conserved elements in fish-specific Hox clusters. These appeared to be partitioned between clusters, suggesting they may be responsible for their divergent expression. In addition, the zebrafish hoxb1a and hoxb1b genes, co-orthologs of the HOXB1 gene in mammals and birds, were found to exhibit complementary degeneration of two cis-regulatory elements identified upstream and downstream of the gene, consistent with the DDC model [23]. Similarly, Postlethwait et al. [24] carried out a comparative genomic analysis of the regions surrounding two zebrafish co-orthologs, eng2a and eng2b, against the single human ortholog EN2 and found one conserved non-coding element partitioned in each copy, together with a number of elements conserved in both. Both co-orthologs have overlapping expression in the midbrain-hindbrain border and jaw muscles, but eng2a is expressed in the somites and eng2b is expressed in the anterior hindbrain (both of which are expression domains found in the single mammalian ortholog). Hence, according to the DDC model, they hypothesized that sequences conserved in both co-orthologs represent regulatory elements responsible for overlapping expression domains, whilst conserved sequences specific to each gene are candidates for regulatory elements that drive expression to domains present in the single mammalian ortholog but now partitioned between co-orthologs. Despite these isolated examples, evidence for the DDC model, by way of identifying the regulatory elements responsible, remains limited.
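The eng2a/eng2b case can be laid out as a simple partition of expression domains into shared and copy-specific sets. The Python sketch below is only an illustration of that bookkeeping, not a method used by Postlethwait et al. or in this study. The function name `partition_domains` and the result keys are hypothetical; the domain names are the ones given in the text above.

```python
def partition_domains(domains_a, domains_b):
    """Split the expression domains of two co-orthologs into shared and
    copy-specific sets (hypothetical helper for illustration)."""
    a, b = set(domains_a), set(domains_b)
    return {
        "shared": sorted(a & b),   # expected to map to elements conserved in both copies
        "a_only": sorted(a - b),   # candidates for elements retained only near copy a
        "b_only": sorted(b - a),   # candidates for elements retained only near copy b
    }


# Expression domains as described in the text for the zebrafish eng2 co-orthologs.
eng2a = {"midbrain-hindbrain border", "jaw muscles", "somites"}
eng2b = {"midbrain-hindbrain border", "jaw muscles", "anterior hindbrain"}

result = partition_domains(eng2a, eng2b)
for category, domains in result.items():
    print(category, domains)
```

Under the DDC hypothesis, the same three-way split applied to conserved non-coding elements (present near both copies, only near one, or only near the other) gives the candidate elements for each class of expression domain.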

Comparison of non-coding genomic sequence across extreme evolutionary distances, such as that between fish and mammals, to identify regions that remain conserved has proved powerful in identifying sequences likely to be vertebrate-specific distal CRMs (see [25] for a review). Fugu-mammal conserved non-coding elements (CNEs), identified genome-wide, cluster almost exclusively in the vicinity of genes implicated in transcriptional regulation and early development (termed trans-dev genes), with little or no conservation in non-coding sequence outside of these regions, a finding confirmed by a number of recent studies [25-31]. Furthermore, a majority of those CNEs tested in vivo drive expression of a reporter gene in a temporally and spatially specific manner that often overlaps the endogenous expression pattern of the nearby trans-dev gene, confirming this association and their likely role as critical CRMs for these genes [26,29,32-36]. The tight association of CNEs with trans-dev genes is likely the result of the fundamental nature of developmental gene regulatory networks involved in correct spatial-temporal patterning of the vertebrate body plan [26,37].

Fugu-mammal CNEs, enriched for putative CRMs, therefore provide an excellent class of sequences through which to test the DDC model further. In addition, a study has found that at least 6.6% of the Fugu genome is represented by fish-specific duplicate genes [15], making Fugu an attractive genome in which to identify and analyze regulatory elements involved in subfunctionalization of fish co-orthologs. Transcription factors and genes involved in development and cellular differentiation appear to be overrepresented within duplicated genes in fish genomes [38], improving the chances of identifying suitable candidates. Here, by taking an approach similar to Postlethwait et al. [24], we carried out alignments of genomic sequence around seven pairs of Fugu developmental co-orthologs against a number of single mammalian orthologous regions in order to investigate whether differential presence of conserved elements between co-orthologs is consistent with the DDC model of regulatory subfunctionalization.



rating: 8.00 from 5 votes | updated on: 20 Jul 2007 | views: 405 |
