How Widespread Is the Regulation of Translation by miRNA?
With plausible parameters, we have predicted that close to 10% (2,273 out of 23,531) of all mammalian genes have more than one miRNA target site in their 3′ UTRs, with 1,314 being stronger candidates with more than two target sites. This could well be an underestimate of the total number of genes subject to miRNA regulation, as we have used a conservative conservation filter. On the other hand, a predicted miRNA–mRNA pair can have a biological consequence only if both miRNA and mRNA are expressed at the same time in the same cell and at sufficient concentration. The human genome has about 250 miRNA genes, compared to about 35,000 protein genes. Thus, the determination that about 1% of genes (miRNAs) control the expression of more than 10% of genes is a reasonable first-order estimate. It is currently not known if any miRNAs control the expression of miRNA genes, i.e., the progression from miRNA transcript to mature miRNA.
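The proportions in this paragraph follow directly from the quoted counts. The short calculation below is only a check of that arithmetic; the variable names are ours, and the gene totals are the approximate figures given in the text.

```python
# Counts quoted in the text; the two genome-wide totals are approximate.
predicted_target_genes = 2_273
genes_scanned = 23_531
mirna_genes = 250
protein_genes = 35_000

# Fraction of scanned genes predicted to carry miRNA target sites.
target_fraction = predicted_target_genes / genes_scanned

# Size of the regulator set relative to the protein gene set.
regulator_fraction = mirna_genes / protein_genes

print(f"predicted target genes: {target_fraction:.1%}")   # 9.7%
print(f"miRNA genes vs. protein genes: {regulator_fraction:.1%}")  # 0.7%
```

Both values are consistent with the order-of-magnitude statement that roughly 1% of genes may regulate roughly a tenth of all genes.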
How Conserved in Evolution Are miRNA Targets?
As many miRNA sequences are detectably conserved across large evolutionary distances, they must be subject to strong functional constraints. These constraints are unlikely to come from single-site interactions with the target, as experimentally validated animal miRNAs rarely have perfectly matched target sites. Plausibly, the evolution of miRNAs is constrained by functional interactions with multiple targets. As a consequence, any compensatory mutation in the miRNA in response to mutations in a target site would be disruptive to the miRNA's interaction with other target sites. Co-evolution of the miRNA sequence and all of its target sequences is therefore a rare event. With these assumptions, the constraints on the local mRNA sequence of individual target sites are weaker than those on the miRNA sequence. We were therefore surprised to observe a substantial number of cases (28.6% of the 2,273 targets) with 100% conservation of target site sequence and with the target sites being within ten nucleotides of each other on the globally aligned UTRs of orthologous genes between mammals.
Lacking more detailed knowledge of miRNA evolution, we draw two operational conclusions. (1) Conservation of target site sequence and position is a practical information filter for predicted target sites, reducing the rate of false positives. (2) It is very likely that new miRNAs have continuously appeared in evolution (Lai 2003) at some non-negligible rate and that the set of targets for any given miRNA has lost or gained members, even between species as close as human and mouse. It is therefore important to develop prediction tools that do not rely on conservation filters or at least allow us to make them weaker. Work on this is in progress.
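The conservation criterion discussed in this section (identical target site sequence, with site positions within ten nucleotides of each other on the globally aligned UTRs of orthologous genes) can be written as a simple filter. The following is a minimal sketch, not the implementation used for the predictions: the `Site` record, the function name, and the requirement that both sites have equal length are simplifying assumptions, and `min_identity` is exposed only to illustrate how the filter could be weakened.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical record for one predicted target site."""
    sequence: str       # target site sequence in the 3' UTR
    aligned_start: int  # start column in the global UTR alignment


def is_conserved(a: Site, b: Site,
                 min_identity: float = 1.0,
                 max_shift: int = 10) -> bool:
    """Return True if two orthologous sites pass the conservation filter."""
    # Simplifying assumption: only equal-length, non-empty sites are compared.
    if not a.sequence or len(a.sequence) != len(b.sequence):
        return False
    matches = sum(x == y for x, y in zip(a.sequence, b.sequence))
    identity = matches / len(a.sequence)
    shift = abs(a.aligned_start - b.aligned_start)
    return identity >= min_identity and shift <= max_shift


human = Site("CUACCUCA", aligned_start=412)
mouse = Site("CUACCUCA", aligned_start=418)
print(is_conserved(human, mouse))  # True
```

Lowering `min_identity` below 1.0 corresponds to the weaker conservation filters mentioned above, at the cost of admitting more false positives.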
Multiplicity and Cooperativity
Regulation by miRNAs is obviously not as simple as one miRNA–one target gene, as perhaps the early examples (lin-4 and let-7) seemed to indicate. The distribution of predicted targets reflects more complicated combinatorics, both in terms of target multiplicity (more than one target per miRNA) and signal integration (more than one miRNA per target gene).
The distribution of the number of target genes (and target sites) per miRNA is highly nonuniform, ranging from zero for seven miRNAs to 268 for let-7b, with an average of 7.1 targets per miRNA. It is difficult to describe in detail, beyond the examples discussed in this text and beyond the annotation of target genes in Figure 2 and Table S3, which specific processes appear to be regulated by each miRNA or each set of co-expressed miRNAs. Groups of targets may reflect a reaction, a pathway, or a functional class (see Results). Although all miRNA–target pairs are subject to the condition of synchrony of expression, it is likely that typically one miRNA regulates the translation of a number of target messages and that, in some cases, the target genes as a group are involved in a particular cellular process. This was already known for the case of lin-4 (Ambros 2003).
The number of miRNA target sites per gene is also nonuniform, with a mean of 2.4. Although we do list target genes with single miRNA sites, there is increasing evidence that, in general, two or more sites are needed in the context of repression of translation. Although the details of these distributions (see Figure 2 and Table S3) depend on technical details, such as uniform cutoff for all miRNAs and evaluation in terms of a particular, imperfect scoring system, the general features of the distributions (see Figure 3) may be generally valid.
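The two distributions discussed here (target genes per miRNA and target sites per gene) are simple tallies over the list of predicted sites. The sketch below shows how such tallies could be computed; the miRNA names, gene names, and positions are invented for illustration and are not taken from Figure 2 or Table S3.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical predictions: (miRNA, target gene, site start in the 3' UTR).
predicted_sites = [
    ("miR-A", "gene1", 120),
    ("miR-A", "gene1", 310),
    ("miR-A", "gene2", 45),
    ("miR-B", "gene1", 500),
    ("miR-B", "gene3", 77),
    ("miR-B", "gene3", 210),
    ("miR-B", "gene3", 640),
]
all_mirnas = ["miR-A", "miR-B", "miR-C"]  # miR-C has no predicted target

genes_per_mirna = defaultdict(set)   # multiplicity: targets per miRNA
mirnas_per_gene = defaultdict(set)   # signal integration: miRNAs per gene
sites_per_gene = Counter()           # total sites per gene, any miRNA
for mirna, gene, _start in predicted_sites:
    genes_per_mirna[mirna].add(gene)
    mirnas_per_gene[gene].add(mirna)
    sites_per_gene[gene] += 1

# Include miRNAs with zero targets so that the mean is not inflated.
targets_per_mirna = {m: len(genes_per_mirna.get(m, ())) for m in all_mirnas}
mean_targets = mean(targets_per_mirna.values())
mean_sites = mean(sites_per_gene.values())

print(targets_per_mirna)
print(dict(sites_per_gene))
```

Whether miRNAs without any predicted target enter the average is a choice that changes the reported mean, which is one reason the details of such distributions depend on technical settings.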
We conclude that multiplicity of targets and cooperative signal integration on target genes are key features of the control of translation by miRNAs. Neither multiplicity nor cooperativity is a novel feature in the regulation of gene expression. Indeed, regulation by transcription factors appears to be characterized, at least in eukaryotes, by analogous one-to-many and many-to-one relations between regulating factor and regulated genes (Kadonaga 2004). We are, of course, aware that the control cycles and feedback loops involving miRNAs cannot be adequately described without more detailed knowledge of the control of transcription of miRNA genes, about which little is known at present.
Mechanisms of miRNA Action
The role of a few animal miRNAs as posttranscriptional regulators of gene expression and, in particular, as inhibitors of translation is well established. However, the molecular mechanism of action is not well understood. Posttranscriptional control of protein levels can be achieved, for example, by cleaving the mRNA, by preventing RNP transport to ribosomes, by stalling or otherwise inhibiting translation on ribosomes, or by facilitating the formation of protein complexes near ribosomes that degrade nascent polypeptide chains. What do our results imply regarding the mechanism of action?
In analogy to plant miRNAs that have near perfect sequence complementarity and facilitate mRNA degradation, our predicted targets with near perfect complementarity between miRNA and mRNA plausibly are involved in mRNA cleavage (e.g., miR-196 and miR-138; see Results). Most of these would involve single target sites. In the case of Hox-B8, cleavage has been experimentally shown in mammalian cells (Yekta et al. 2004). We estimate that fewer than 5% of miRNA targets are cleaved as a result of miRNA binding.
Multiple target sites of lesser complementarity are consistent with RNP formation leading to translational inhibition, not mRNA degradation. Although we did predict single miRNA target sites for some genes, most target genes have multiple sites, indicating that cooperative binding (Doench and Sharp 2004) may be essential for formation of inhibitory RNP complexes.
An interesting and somewhat paradoxical feature is seen with mRNAs bound by FMRP, some of which are increased and some of which are decreased in polysome fractions in FMRP knock-out mice (Brown et al. 2001). We see no bias as to which of these two sets is more enriched in predicted miRNA targets. This ambiguity not only raises questions about details of FMRP regulation but also raises the possibility that miRNA targets may not always be translationally repressed and may instead be translationally enhanced.
Improvement of Prediction Rules
Current methods for predicting miRNA targets rely on conservation filters to reduce noise. Although the miRNA–mRNA pairings of experimentally validated targets were carefully used to define prediction rules (Enright et al. 2003; Lewis et al. 2003; Stark et al. 2003), the information content in sequence match scores and free energy estimates of RNA duplex formation appears to be low. What is missing? Perhaps the fine details of experimentally proven target site matches are incorrect, although in some experiments mismatches and insertions have been tested. More plausibly, the rules do not yet capture additional functionally relevant interactions of miRNAs, such as in maturation and transport. Such additional interactions remain to be described in molecular detail, such as interactions with the small RNA processing machinery (Drosha and Dicer) and with the components of RNPs (AGO and FMRP). A first step in this direction is the very recent analysis of the crystal structure of a PAZ domain of a human Argonaute protein, eIF2c1, complexed with a 9-mer RNA oligonucleotide in dimer configuration, which may represent three-dimensional interactions for the 3′ end of a miRNA (and siRNA) complexed, e.g., with Dicer or AGO (Ma et al. 2004). In this structure, each PAZ domain makes close binding contact with nine nucleotides of a single-stranded RNA. The two 3′ terminal nucleotides bind in a pocket through RNA backbone and other contacts. The remaining seven nucleotides bind PAZ through a series of backbone contacts such that nucleotides 3 to 9 are in an RNA helical conformation with bases exposed for base pairing to the second single-stranded RNA. If a 20–21-nt single-stranded RNA is bound to a PAZ domain in the same fashion, the 5′ end would be free for other interactions, such as binding to another protein domain in the RISC or base-pairing to mRNA. 
The reduction in conformational entropy that results when the 3′ end binds to PAZ, because the RNA helix is pre-formed, is consistent with weaker base pairing between miRNA and mRNA at the 3′ end of the miRNA, and stronger base pairing at the 5′ end. The dimeric structure of the PAZ domain (Ma et al. 2004) also raises the tantalizing possibility of cooperative binding of a dimer of two miRNA–PAZ combinations to two target sites on one or more mRNAs. In such an arrangement, seven residues at the 3′ ends of the two miRNAs (residues 3–9, but not the terminal two nucleotides) are paired in antiparallel fashion, with near perfect complementary pairing.
As more details of molecular contacts become available, prediction rules will evolve and improve in accuracy. The following elements are worth considering in the next generation of target prediction rules: (1) details of strand bias as deduced from siRNA experiments (Khvorova et al. 2003), (2) contribution of sequences outside of the mRNA target sites, (3) refinement of position-dependent rules, including different gap penalties for the mRNA and the miRNA, (4) energetics of miRNA–protein binding, starting with PAZ domain interaction, and (5) translation of systematic mutational profiling experiments into scoring rules (Doench and Sharp 2004).
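To make the idea of position-dependent rules concrete, the sketch below scores an ungapped miRNA–mRNA duplex and gives extra weight to pairs near the 5′ end of the miRNA. It is an illustration only: the pair scores, the length of the weighted 5′ region, and the weight itself are assumed values chosen for the example, gaps are not modeled, and the function names are ours.

```python
# Assumed, illustrative scores; not the parameters of any published method.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}
PAIR_SCORE, WOBBLE_SCORE, MISMATCH_SCORE = 5.0, 2.0, -3.0


def reverse_complement(rna: str) -> str:
    """Perfectly complementary strand, written 5'->3'."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(rna))


def duplex_score(mirna: str, site: str,
                 five_prime_len: int = 8,
                 five_prime_weight: float = 2.0) -> float:
    """Score an ungapped antiparallel duplex; both strands are given 5'->3'."""
    if len(mirna) != len(site):
        raise ValueError("ungapped scoring needs equal-length sequences")
    total = 0.0
    for i, base in enumerate(mirna):
        partner = site[len(site) - 1 - i]  # antiparallel pairing
        if (base, partner) in WATSON_CRICK:
            score = PAIR_SCORE
        elif (base, partner) in WOBBLE:
            score = WOBBLE_SCORE
        else:
            score = MISMATCH_SCORE
        # Pairs in the 5' region of the miRNA count more than 3' pairs.
        weight = five_prime_weight if i < five_prime_len else 1.0
        total += weight * score
    return total


mirna = "UGAGGUAGUA"                 # hypothetical 10-nt example
perfect_site = reverse_complement(mirna)
print(duplex_score(mirna, perfect_site))  # 90.0
```

With these assumed values, a single mismatch opposite the 5′ end of the miRNA lowers the score more than the same mismatch opposite its 3′ end. Separate gap penalties for the two strands, as proposed in item (3), would require replacing the ungapped comparison with an alignment step.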
Principles of Regulation by miRNAs
Although the predicted targets are subject to error (see estimate of false positives) and the prediction rules in need of improvement, several general principles of gene regulation by miRNAs are emerging. (1) Except in cases where a highly complementary match causes cleavage of the target message, miRNAs appear to act cooperatively, requiring two or more target sites per message, for either one or several different miRNAs. (2) Most miRNAs are involved in the translational regulation of several target genes, which in some cases are grouped into functional categories. (3) miRNAs carried in the context of RNPs appear to be sequence-specific adaptors guiding RNPs to particular target sequences. miRNA regulation of cellular messages may therefore range from a switch-like behavior (e.g., cleavage of mRNA message) to a subtle modulation of protein dosage in a cell through low-level translational repression (Bartel and Chen 2004).
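Principle (1) can be read as a rough decision rule for a single message. The sketch below is our own simplification: each predicted site is reduced to one number, the fraction of miRNA positions that are paired, and both the cleavage threshold and the category labels are hypothetical.

```python
def likely_mode(site_complementarities,
                cleavage_threshold=0.95,
                min_cooperative_sites=2):
    """Classify one message from the complementarity of its predicted sites.

    site_complementarities: one value in [0, 1] per predicted site, giving
    the fraction of miRNA positions paired at that site (assumed measure).
    """
    # A single highly complementary site is enough to suggest cleavage.
    if any(c >= cleavage_threshold for c in site_complementarities):
        return "cleavage candidate"
    # Otherwise two or more sites are required, from one or several miRNAs.
    if len(site_complementarities) >= min_cooperative_sites:
        return "cooperative repression candidate"
    return "weak candidate"


print(likely_mode([1.0]))         # cleavage candidate
print(likely_mode([0.70, 0.75]))  # cooperative repression candidate
print(likely_mode([0.70]))        # weak candidate
```

A rule of this kind ignores expression synchrony and the identity of the miRNAs involved, so it can at best prioritize candidates for experimental testing.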
These aspects of miRNA regulation complicate the design of experiments aiming at testing target predictions, or, more generally, at discovering biologically meaningful targets. Straightforward experiments that test one target site for one miRNA on one UTR will not be able to disentangle the effects of multiplicity or cooperativity. Tests for multiple sites on one UTR for one miRNA capture aspects of cooperativity (Doench and Sharp 2004), but still do not capture signal integration by diverse miRNAs. The most complicated situation is one in which multiple miRNAs affect multiple genes in combinatorial fashion, with fine-tuning depending on the state of the cell. We look forward to the results of ingenious experiments designed to deal with the complexity of miRNA regulation.
The results of this genome-wide prediction for mammals and fish are meant to be a guide to experiments that will in time elucidate the genetic control network of regulators of transcription, translation/maturation, and degradation of gene products, including miRNAs.