Mammalian genomes are highly heterogeneous in base composition. They are composed of long stretches of DNA with distinct GC composition, commonly known as isochore structures1–4 or GC-content domains.5 The local GC composition correlates with a number of important genomic features, such as gene density, gene length, patterns of gene expression, repeat element distribution and recombination rate.6–11 Evolutionary stability of the GC-content distribution has been demonstrated for mouse and human on a genome-wide level.12 Sequences that are GC rich in one genome were shown to be GC rich in the other genome, and vice versa. Finding such a one-to-one correspondence between the local GC distribution patterns in mouse and human was, however, not trivial. Since the divergence of the rodent and primate lineages around 84–121 million years ago,13,14 multiple substitutions might have occurred independently in the two lineages at the same sites of a pair of mouse–human orthologs. Had there not been a strong directionality in the selection process(es) prevailing over random mutation and fixation, such multiple substitutions should have randomized the local GC distribution patterns in the two genomes. The invariance of the overall patterns of GC distribution along the chromosomes of mouse and human therefore suggests that there might be some well-defined trends in the nucleotide and/or amino acid substitution patterns across these two species. The present study was designed to determine such trends, if any.
A number of efforts have been made earlier to determine the evolutionary trends in mammalian genomes, but no definite conclusion could be reached. On the basis of the analysis of orthologous gene sequences from closely related species, it has been proposed that GC-rich regions of primate and cetartiodactyl genomes are becoming GC poorer, i.e. that GC-rich isochores are now vanishing in these lineages.15–18 Alvarez-Valin et al.,19 however, described the ‘vanishing isochores’ effect as an artifact of the inaccurate reconstruction of ancestral GC levels in such studies,15 while offering evidence for an AT substitution bias within the repetitive elements of mammals. In contrast, the maximum parsimony analysis conducted by Gu and Li20 argued for a recent enrichment of the GC content of GC-rich genes in some genomes, e.g. the rabbit. The direction(s) of evolution of mammalian genes therefore remains a matter of conjecture. Did mammalian genes of varying GC bias follow distinct evolutionary trajectories, and if so, to what extent could they influence the evolution of the encoded proteins? In an attempt to address these questions, the present study carried out a genome-scale analysis of the trends in nucleotide and amino acid substitutions between human and mouse orthologous pairs of varying GC content. The analysis showed that definite trends do exist, not only in nucleotide but also in amino acid substitution patterns between mouse and human orthologous pairs. These trends are, in general, highly asymmetric and polarized in opposite directions in the high-GC and low-GC sets of orthologs, in such a way that, in the course of evolution, the compositional heterogeneity of coding regions has been significantly enhanced in human compared with that in mouse.
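As a rough illustration of the kind of comparison described above, the following minimal Python sketch computes the GC content of a sequence and tallies the differences at aligned sites of a mouse–human ortholog pair according to whether the mouse base is A/T and the human base G/C, or the reverse. It is a sketch under stated assumptions and not the procedure used in this study: the function names `gc_content` and `tally_differences`, the toy sequences and the treatment of gaps and ambiguous bases are all hypothetical, and a pairwise tally of this kind does not by itself establish the ancestral state at any site.

```python
def gc_content(seq):
    """Return the fraction of G and C among the unambiguous bases of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def tally_differences(mouse_aln, human_aln):
    """Tally aligned-site differences between two pre-aligned sequences.

    Assumption: both strings come from the same pairwise alignment and
    therefore have equal length. Sites with a gap or an ambiguous base
    in either sequence are skipped.
    """
    if len(mouse_aln) != len(human_aln):
        raise ValueError("aligned sequences must have equal length")

    counts = {"AT_to_GC": 0, "GC_to_AT": 0, "other": 0}
    for m, h in zip(mouse_aln.upper(), human_aln.upper()):
        if m not in "ACGT" or h not in "ACGT" or m == h:
            continue  # skip gaps, ambiguous bases and identical sites
        m_is_gc = m in "GC"
        h_is_gc = h in "GC"
        if not m_is_gc and h_is_gc:
            counts["AT_to_GC"] += 1  # mouse A/T aligned with human G/C
        elif m_is_gc and not h_is_gc:
            counts["GC_to_AT"] += 1  # mouse G/C aligned with human A/T
        else:
            counts["other"] += 1     # A<->T or G<->C; GC content unchanged
    return counts


# Toy aligned pair (hypothetical sequences, for illustration only)
mouse = "ATGCA-"
human = "GTACAT"
print(gc_content(mouse), gc_content(human))
print(tally_differences(mouse, human))
```

Grouping ortholog pairs by GC content and comparing the two directional counts within each group would be one simple way to look for the asymmetry discussed here.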