Since the divergence of the rodent and primate lineages, multiple substitutions may have occurred independently in the two lineages at the same site of a mouse–human ortholog pair. Had there not been a strong directionality of selection process(es) prevailing over random mutational events, such multiple hits would have obscured any true pattern of substitution between the orthologs. However, the present study reveals that nucleotide and amino acid substitution patterns in mouse–human orthologs follow definite trends that are highly asymmetric and polarized in opposite directions in the high- and low-GC groups. This suggests a definite directionality in gene/protein evolution: either towards increasing compositional divergence in human protein-coding regions relative to mouse, or towards decreasing compositional divergence in mouse protein-coding regions relative to human. It is true that GC content is evolutionarily stable between mouse and human, i.e. orthologs have similar GC contents in the two species. Nevertheless, among the high-GC orthologs, the human sequences are slightly higher in GC content than their mouse orthologs, whereas among the low-GC orthologs, the human sequences are slightly higher in AT content than their mouse counterparts.
A question may be raised at this point: why, of all mammalian species, were only mouse and human chosen for the present study? Initially, we intended to analyze the sequence divergence patterns between the orthologous coding regions of human, chimpanzee, and rhesus monkey. However, the numbers of nonsynonymous substitutions between orthologs of any two primate species were often too low to reveal any statistically significant trend. We therefore decided to analyze the substitution patterns between a rodent and a primate species, and chose mouse and human as the representatives of the two lineages.
As already mentioned in Section 2, the trends reported here are robust enough to hold for any subset of the total dataset of orthologous sequences. In general, any trend in amino acid/nucleotide replacement between the ortholog pairs of a particular dataset remains unchanged when a subset of sequences is chosen randomly from that dataset. This indicates that the same trends are usually followed individually by each pair of orthologs in a particular group (high-, medium-, or low-GC group).
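The random-subset check described above can be illustrated with a minimal sketch. It assumes that the replacements have already been tabulated as (mouse residue, human residue) tuples. The function names, the 50% subset size, and the simple forward/reverse count ratio are illustrative assumptions and are not necessarily the statistic used in this study.

```python
import random


def directional_ratio(pairs, a, b):
    """Ratio of (mouse a -> human b) counts to (mouse b -> human a) counts.

    `pairs` is a list of (mouse_residue, human_residue) tuples taken from
    aligned sites. This ratio is an illustrative measure of asymmetry only.
    """
    forward = sum(1 for m, h in pairs if m == a and h == b)
    reverse = sum(1 for m, h in pairs if m == b and h == a)
    if reverse == 0:
        # No reverse replacements observed: ratio is unbounded or undefined.
        return float("inf") if forward else float("nan")
    return forward / reverse


def subsample_ratios(pairs, a, b, fraction=0.5, repeats=100, seed=0):
    """Recompute the ratio on random subsets drawn without replacement."""
    rng = random.Random(seed)
    k = max(1, int(len(pairs) * fraction))
    return [directional_ratio(rng.sample(pairs, k), a, b) for _ in range(repeats)]


# Toy data (not from the study): 60 Asp->Glu and 40 Glu->Asp replacements.
toy_pairs = [("D", "E")] * 60 + [("E", "D")] * 40
full_ratio = directional_ratio(toy_pairs, "D", "E")
subset_ratios = subsample_ratios(toy_pairs, "D", "E", repeats=10)
```

A trend would count as robust in this sense if the subset ratios stay on the same side of 1 as the ratio computed from the full dataset.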
The trends in amino acid and nucleotide replacement patterns also remained the same when the orthologous sequences were classified into high-, medium-, and low-GC groups on the basis of the GC3 content of the mouse genes instead of the human genes. The same directionality was observed for the high- and low-GC groups: in the high-GC group, GC content either increases in human genes relative to mouse or decreases in mouse genes relative to human, whereas in the low-GC group, GC content either decreases in human genes relative to mouse or increases in mouse genes relative to human. This was, however, expected, as the two genome sequences exhibit a one-to-one correspondence in their local GC content.
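As an illustration of the classification step, the following sketch computes GC3 (the fraction of G or C at third codon positions) for an in-frame coding sequence and assigns a compositional group. The cut-off values of 0.45 and 0.65 and the function names are placeholders chosen for this example; the thresholds actually used are not stated in this section.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # ignore a trailing partial codon
    thirds = cds[2:usable:3]           # every third base, starting at position 3
    if not thirds:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


def gc_group(value: float, low: float = 0.45, high: float = 0.65) -> str:
    """Assign a compositional group; `low` and `high` are hypothetical cut-offs."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "medium"


# Toy sequence (not from the study): codons ATG, GCC, AAA.
example_gc3 = gc3("ATGGCCAAA")
example_group = gc_group(example_gc3)
```

Applying `gc_group` once to the GC3 of the human genes and once to that of the mouse genes gives the two alternative classifications compared above.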
The only significant trend common to all three groups of orthologs is (Asp)Mouse → (Glu)Human. Surprisingly, the value of RDE is almost the same in all three groups, and the trend is also exhibited by subsets chosen randomly from the whole dataset of any particular compositional group. This indicates that the trend, in general, does not change with the compositional bias or functional characteristics of the genes. In accordance with this, the average frequency of Glu (7.01% for mouse and 7.11% for human) is significantly higher in human (p < 10^-5) and that of Asp (4.90% for mouse and 4.81% for human) is significantly higher in mouse (p < 10^-5). The structural consequence of this trend is, however, not clear.
No significant differences were observed in the synonymous or nonsynonymous substitution rates among the three groups of orthologs under study. This suggests that although the directionality of evolution in orthologs of the two extreme GC compositions is oppositely polarized, the rate at which they evolve is almost the same in both cases.
In a nutshell, the present study indicates that, in comparison with mouse, the coding regions of the human genome have experienced an expansion, not a shrinkage, in intra-species heterogeneity in local GC content. This observation, however, does not by itself establish a relative expansion of the human GC islands as a whole, since that would depend not only on the evolutionary trends of the coding regions but also on those of the noncoding regions. One should also remember that a relative increase in GC heterogeneity in human orthologs compared with mouse orthologs does not necessarily imply an absolute increase in GC heterogeneity in human coding regions over evolutionary time. In an absolute sense, both human and mouse might have evolved towards decreasing compositional heterogeneity, with the rate of decrease being lower in human than in mouse; alternatively, both species might be evolving towards increasing intra-species heterogeneity, with the rate of increase being higher in human than in mouse.
Council of Scientific and Industrial Research (Project No. CMM 0017 to C.D and S.G); Department of Biotechnology, Government of India (BT/BI/04/055-2001 to S.K.B and S.P).
Supplementary data are available online at http://www.dnaresearch.oxfordjournals.org
Acknowledgements We are grateful to Dr. A. Pan, Indian Association for the Cultivation of Science, Kolkata, India, for critical reading of the manuscript.
Footnotes * To whom correspondence should be addressed. Tel. +91 33-2473-3491. Fax. +91 33-2473-0284. E-mail: email@example.com
Edited by Hiroyuki Toh