The above discussion suggests that RNAP II, via the CTD, can play a significant role in stimulating splicing. But it is not clear precisely how it does so, or even what proteins directly interact with the enzyme to influence splicing. As mentioned above, SR proteins and snRNPs have been found to associate with RNAP II0 or phosphorylated GST-CTD, but it is unclear if these interactions are direct. Thus it is worth considering what molecules might serve to connect RNAP II0 with the splicing machinery. Although nothing definitive is known regarding such hypothetical protein linkers, several proteins, or protein families, merit some discussion (see Fig. 1).
A group of at least four proteins now known as SCAFs (SR-like CTD-associated factors) are reasonable candidates to function in coupling transcription and splicing. These proteins were initially discovered in a yeast two-hybrid screen for CTD-interacting proteins (Yuryev et al. 1996), and features of their primary structure suggest they could be involved in linking RNAP II to splicing. These features include an RS-like domain and an RNA binding domain (RBD), which are similar to domains found in SR proteins. It is intriguing that RS domain-containing proteins were found to interact with the CTD, as Greenleaf (1993) had earlier predicted an RS domain-CTD interaction that would link transcription and RNA processing. This was certainly a prescient suggestion, even though the details seem to be incorrect: SCAFs interact with the CTD via a distinct domain (the CTD interaction domain; CID), not their RS-like region. A subsequent study showed that one SCAF, SCAF8, interacts only with phosphorylated CTD, or RNAP II0, and can be localized in cells to foci that overlap with sites of transcription and processing (Patturajan et al. 1998). Together, these results support a role for SCAFs in linking RNAP II transcription to processing, but direct support for this hypothesis is not yet available.
Yeast contains a protein called Nrd1 that is similar to SCAFs and includes a CID, an RS-like region (which is uncommon in yeast), and an RBD (Steinmetz and Brow 1996, 1998). Nrd1 is an essential protein that appears to function in the nucleus and to influence accumulation of certain pre-mRNAs, perhaps at the level of transcription elongation or 3'-end formation. A recent genetic analysis has provided support for the idea that Nrd1 may serve to link transcription with some aspect of pre-mRNA processing (Conrad et al. 2000). A temperature-sensitive (ts) nrd1 allele was shown to be synthetically lethal with viable CTD truncation mutants, and genetic interactions were also observed with two genes encoding hnRNP-like proteins. However, it remains unclear exactly how Nrd1 functions, and there is no evidence that it is involved in splicing.
An interesting question is whether the link between transcription and splicing in fact extends to yeast. All the data supporting this interaction come from studies in mammalian systems, and several considerations suggest that yeast may not require such a coupling. For example, yeast genes are relatively simple and the vast majority lack intervening sequences, which when present are usually small and located near the 5' end of the transcript. Perhaps reflecting a role limited to transcription (and capping), the yeast CTD is half the size of its mammalian counterpart. However, it is noteworthy that although only ~3% of yeast genes contain intervening sequences, ~30% of the total primary transcript pool contains introns (Ares et al. 1999; Lopez and Seraphin 1999). This is because intervening sequences are overrepresented in highly expressed genes, such as those encoding ribosomal proteins. Although the reasons for this are unclear, an intriguing possibility is that introns facilitate efficient mRNA synthesis, perhaps suggesting some sort of splicing-transcription link after all.
A second intriguing group of potential transcription-splicing coupling proteins is the TLS/FUS-related proteins, which also include EWS and TAFII68. The genes encoding all three proteins are known to be involved in chromosomal translocations that are associated with different sarcomas and leukemias (Attwooll et al. 1999; Sjogren et al. 1999 and references therein). In all cases, chimeric proteins consisting of the amino terminus of the TLS/FUS-related protein fused to the DNA binding domain of a transcriptional activator are created (Delattre et al. 1992; Crozat et al. 1993). While it seems likely that oncogenic transformation results at least in part from the altered transcriptional properties of the chimeric transcription factors, properties of the TLS/FUS proteins themselves are consistent with roles in both transcription and splicing. The primary structure of the proteins indicates that they all contain an RBD in their central region and a so-called RGG domain at their carboxyl terminus, both of which are features of hnRNP proteins (for review, see Burd and Dreyfuss 1994). The TLS/FUS and EWS proteins have indeed been isolated in complexes containing RNA and hnRNP A and C proteins (Zinszner et al. 1994). Furthermore, yeast two-hybrid screens found that the amino terminus of a TLS/FUS-related protein (the region retained in the chimeric oncoproteins) interacts with splicing factor SF1 (Zhang et al. 1998), while the carboxy-terminal RGG domain interacts with SR proteins (Yang et al. 1998). Further supporting a possible role in splicing, transient overexpression of TLS/FUS can affect the relative accumulation of alternatively spliced mRNAs from a cotransfected reporter (Hallier et al. 1998; Yang et al. 1998). On the other hand, TAFII68 was isolated, as the name suggests, as a TATA binding protein (TBP)-associated factor; that is, as a component of transcription factor TFIID (Bertolotti et al. 1996). Although not present in stoichiometric amounts, both TAFII68 and TLS/FUS could be found in distinct TFIID complexes, and at least TAFII68 also copurifies extensively with RNAP II. Together, all these properties are consistent with the idea that TLS/FUS-related proteins function in the coupling of transcription and splicing. However, as with the SCAFs, direct support for this idea is lacking, and indeed biochemical evidence suggesting an alternative role for TLS/FUS in homologous recombination has been presented (Baechtold et al. 1999).
The human papillomavirus (HPV) E2 protein provides an example of a sequence-specific DNA binding protein that may also serve to link transcription and splicing (Lai et al. 1999). The E2 protein is well known to participate in control of viral transcription and replication, and contains well-conserved DNA binding and activation domains separated by a hinge region that is less well conserved between HPV subtypes. In certain types, though, the hinge consists of multiple RS dipeptide repeats, similar to those of SR proteins. One of these proteins, HPV-5 E2, interacts via this RS domain-like region with splicing factors, including SR proteins, and can colocalize with them in transfected cells. Importantly, full activation of reporter genes was found to require the RS-rich hinge region, and this hinge region-dependent activation reflected enhanced splicing of the reporter transcripts (Lai et al. 1999). HPV-5 E2 thus provides an example of a promoter-bound transcription factor that can stimulate both transcription and splicing.
A final possible linking protein we will discuss is known as p52. A variant of p52 arising from alternative splicing, p75, was initially discovered by copurification with the transcriptional coactivator PC4, and both p52 and p75 were themselves shown to be capable of functioning as coactivators in transcription assays in vitro (Ge et al. 1998a). Coactivator function was suggested to reflect the ability of the proteins to interact both with transcriptional activators and with general transcription factors. Remarkably, p52 was also shown to interact with the SR protein ASF/SF2 in vitro and in vivo, and was suggested to influence alternative splicing of a model pre-mRNA in splicing assays in vitro (Ge et al. 1998b). Based on these properties, it is conceivable that p52 not only helps link RNAP II transcription to splicing, but also might contribute to the promoter-specific effects on alternative splicing described above.