Using database sequence-similarity searches coupled with phylogenetic analysis, we were able to unite the circularly permuted GTPases into a family that we have named YRG, for YlqF-Related GTPases [see Additional files 1 and 2]. The YlqF protein family represents the largest subfamily of the YRG expansion in eukarya and is potentially involved in ribosome biogenesis.
Phylogenetic analysis defines ten GTPase subfamilies with a global phyletic distribution compatible with their presence in the last universal common ancestor (LUCA) of extant life forms [22]. An emerging concept suggests that these universal GTPases are necessary either for ribosome function or for transmitting information from the ribosome to downstream targets to generate specific cellular responses. All ten subfamilies are associated with translation: they comprise four translation factors, two OBG-like GTPases, the two signal-recognition-associated GTPases, the MRP subfamily of MinD-like ATPases and the YRG family. Here we have defined the YRG family for the first time as a eukaryotic expansion of the original YawG/YlqF family [22], tightly coupled to the evolution of compartmentalization.
The YRG family was originally defined as a particular class of GTPases showing a circularly permuted structure, with the four GTPase motifs reorganized as G4 followed by G1, G2 and G3 (Figure 1A). This circular permutation is unique in the GTPase superfamily. However, we have shown that this permuted arrangement does not seem to affect GTPase activity or folding, in agreement with other studies [31,39]. Moreover, regarding the potential function of this family, it has been pointed out that most YRG members bind to the ribosome [YjeQ, [31]], are involved in the maturation of ribosomes or mitoribosomes [24,28,29,2], localize to compartments related to rRNA maturation [NGP, [1,39]], and are essential proteins (see Figure 3C and Additional file 2). Taken together, these observations indicate that YRG members have an essential role in ribosomal assembly.
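The motif rearrangement described above can be made concrete with a short sketch. It treats a GTPase domain as an ordered list of motif labels and checks whether a given order is a circular permutation (a rotation) of a reference order. The reference order G1, G2, G3, G4 is taken here as the conventional arrangement, and the function name `rotation_offset` is hypothetical; this illustrates the concept only and is not the procedure used in this study to identify family members.

```python
# Reference motif order, assumed here to be the conventional arrangement.
CANONICAL = ("G1", "G2", "G3", "G4")


def rotation_offset(order, canonical=CANONICAL):
    """Return k if `order` equals `canonical` rotated left by k positions.

    Returns None when `order` is not a circular permutation of `canonical`
    (for example, when two motifs are simply swapped).
    """
    order = tuple(order)
    if len(order) != len(canonical):
        return None
    for k in range(len(canonical)):
        if canonical[k:] + canonical[:k] == order:
            return k
    return None


# Motif order described in the text for the YRG family.
yrg_order = ("G4", "G1", "G2", "G3")
print(rotation_offset(yrg_order))  # 3: a rotation of the reference order
print(rotation_offset(("G1", "G3", "G2", "G4")))  # None: a swap, not a rotation
```

A rotation keeps every motif next to the same neighbours in a cyclic sense, which is what distinguishes a circular permutation from an arbitrary shuffling of the motifs.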
Strikingly, we could find a member of the YRG family for every cellular compartment linked to ribosomes, including the chloroplast, correlating with the expansion of the eukaryotic cell (Figure 1C). According to the phylogenetic tree of the family, the cytosolic block (LSG1) is distinct from the nuclear blocks (NOG2, NGP1, YawG), which later expanded into a nucleolar form (NUG1). In parallel, new members were incorporated upon engulfment of the future mitochondria (MTG1, which clusters within the YlqF branch) and of the future chloroplast (ChYlqF). Other events within the YlqF family included the appearance of a second cytosolic form upon speciation of the coelomates (GNL1), which may have had an equivalent in the plant lineage, since we observed a form of Lsg1 in A. thaliana (Figure 1B). Moreover, we observed the appearance of a second nucleolar form (Nucleostemin) upon speciation of the deuterostomes. Since Nucleostemin is involved in cell-cycle regulation in stem cells, we can hypothesize a direct mechanism of rRNA maturation in those highly specialized animal cells. We propose the following scenario for the evolution of the YRG family. First, a cytosolic founding member was duplicated upon the formation of a proto-nucleus, allowing the rRNA maturation pathway to be maintained (Figure 6). Second, mitochondria and chloroplasts were engulfed, bringing with them specific YRG forms involved in rRNA maturation in these compartments. Finally, the cytosolic and nuclear members evolved further as the eukaryotic cell became more specialized (nucleolus, etc.). This scenario accords with the work of Mans et al. [47], which showed by comparative genomics that a large set of proteins was involved in the formation and structure of the nuclear envelope and the pore complex: the nucleus evolved from a primordial prekaryote compartment and a primitive nuclear pore complex dependent on Ran and on Nug1p/Nug2p, a nucleolar YRG member.
Interestingly, hLsg1 is the only member of this family that shows a dual localization (cytosol/endoplasmic reticulum and Cajal Bodies). The cytosol contains huge numbers of ribosomes, either freely diffusing or bound to the endoplasmic reticulum, and is the main transit pathway for rRNA en route to the mitochondria or the chloroplast. Cajal Bodies are spherical nuclear bodies containing a variety of components including nucleolar proteins, snRNPs and SMN. They are dynamic structures functionally linked to the nucleolus, presumably involved in RNP maturation and related to gene expression [43,44]. Consistent with these data, one could hypothesize that hLsg1 is a regulator of the rRNA pathway that can relocate to Cajal Bodies and interact with specific factors such as nucleolar proteins. The observation that Leptomycin B treatment leads to accumulation of hLsg1 in the nucleus clearly indicates that hLsg1 shuttles via a CRM1-dependent export pathway. We hypothesize that hLsg1 relocalizes from the cytosol to the nucleus in response to internal (e.g. cell cycle) or external (e.g. growth factor) stimuli. In this way, hLsg1 would help control rRNA biosynthesis at its source, the nucleolus. In the future, these hypotheses will be tested for hLsg1 and for the other YRG family members to elucidate their role in rRNA biosynthesis and maturation.