DNA base substitutions do not occur randomly (Graur and Li 2000). Instead, they may be clustered in hotspots, for example around methylated CG dinucleotides, or subject to more general biases such as the excess of transitions relative to transversions. In addition, local structural context may be important, with neighbouring bases interacting to favour some changes over others (Blake et al. 1992; Morton et al. 1997; Goodman and Fygenson 1998; Zavolan and Kepler 2001). However, many nonrandom patterns of sequence evolution remain unexplained. Here we explore how an abundant class of repetitive sequences, microsatellites, may influence the pattern of mutations in sequences that surround them.
Microsatellites are sequences of repeated 1–6-bp motifs that mutate primarily through the gain and loss of repeat units, in a process thought to depend on DNA replication slippage (Levinson and Gutman 1987; Tautz and Schlötterer 1994). Previous studies indicate that their flanking sequences evolve in unusual ways and often contain mutated versions of microsatellites (Matula and Kypr 1999). Estimates of flanking sequence mutation rates vary greatly. Very slow evolution is suggested by sequence comparisons between distantly related species, where divergence rates may be as low as 0.016% to 0.1% per million years (Schlötterer et al. 1991; Rico et al. 1996; Zardoya et al. 1996). In contrast, pedigree studies suggest much higher rates and even hypermutability (Stallings 1995). There is also disagreement about trends in mutation rate, with some studies indicating an increase towards the microsatellite (Blanquer-Maumont and Crouau-Roy 1995; Zardoya et al. 1996; Grimaldi and Crouau-Roy 1997; Brohede and Ellegren 1999) and others reporting a more even distribution (Karhu et al. 2000).
To our knowledge, no one has yet conducted a systematic study of mutational biases operating around microsatellites. The direct study of naturally occurring mutations in flanking sequences is virtually prohibited by their slow rate of accumulation, and inferences based on comparisons between homologous microsatellite loci rely on small numbers of sequences. However, an indirect approach is possible, based on comparisons among very large numbers of microsatellite flanking sequences from the finished human genome. If microsatellites have little or variable influence on their flanking regions, among-locus similarities will be minimal or absent. Conversely, if microsatellites generate similar local mutation biases, nonhomologous loci should show evidence of convergent evolution. With the publication of large blocks of sequence from the chimpanzee genome, one can extend this approach to ask questions about the rate of divergence between homologous flanking sequences.
Here we use a combination of these indirect approaches to show that microsatellites appear to create regions around them in which both the rate and spectrum of mutations are modified.
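The among-locus comparison described above can be illustrated with a minimal sketch. It assumes that flanking sequences from nonhomologous loci have already been extracted and oriented so that index 0 is the base immediately adjacent to the microsatellite; the function name `positional_base_frequencies` and the toy sequences are hypothetical and are not taken from the analysis itself. If microsatellites impose similar local biases, base frequencies at positions near the repeat would be expected to depart from the background composition.

```python
from collections import Counter


def positional_base_frequencies(flanks, bases="ACGT"):
    """Return per-position base frequencies across many flanking sequences.

    flanks: list of strings, each oriented so that index 0 is the base
    adjacent to the microsatellite (an assumption of this sketch).
    Characters outside `bases` (e.g. N) are ignored when normalising.
    """
    if not flanks:
        return []
    # Only compare positions present in every sequence.
    length = min(len(f) for f in flanks)
    profile = []
    for pos in range(length):
        counts = Counter(f[pos].upper() for f in flanks)
        total = sum(counts[b] for b in bases)
        profile.append(
            {b: (counts[b] / total if total else 0.0) for b in bases}
        )
    return profile


# Toy data for illustration only; these are not real flanking sequences.
toy_flanks = ["ACGT", "ACTT", "AGGT", "ATGA"]
profile = positional_base_frequencies(toy_flanks)
for pos, freqs in enumerate(profile):
    print(pos, freqs)
```

In practice such a profile would be compared against the genome-wide base composition, so that a strong bias at positions close to the repeat can be distinguished from the general background.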