Comparative genomics is a powerful method for identifying the potential functions of previously uncharacterized genes, allowing their distribution among the kingdoms of life to be characterized, and the changes in sequence and regulation underpinning their conserved or divergent functions to be tracked [1]. It has been greatly facilitated by progress in bioinformatics tools and by the large amount of information available from databases on protein localization [2,3], viability [4,5], protein expression [6], genetic interactions [7] and protein-protein interactions [8]. Each of these resources usually focuses on a single organism (S. cerevisiae, C. elegans, D. melanogaster or B. subtilis) and is therefore used mainly by the small part of the scientific community that works with that organism and is able to handle its output and limitations. Attempts have been made to correlate large datasets across species, for example in the case of protein-protein interactions [9]. These cross-correlation analyses are based on the presumption that sequence and structural similarities between gene products can be used to assess functional similarities [10,11], and they could in principle be extended to protein localization, viability or interaction partners.
Comparative genomics should be particularly powerful in the case of GTP binding proteins (or GTPases), which despite extraordinary functional diversity are all believed to have evolved from a single common ancestor [12]. As a result, all known GTPases share a conserved switch mechanism of action, core structure and sequence motifs. These proteins are found in all domains of life and are involved in such essential processes as vesicular trafficking, protein translation, intracellular signal transduction and cell cycle progression [12-14]. GTP binding proteins are often described as molecular switch proteins because of their particular mode of action. Binding and hydrolysis of GTP result in conformational changes in the so-called switch regions of the protein, which define the active GTP-bound and the inactive GDP-bound forms; these are used, for instance, for regulating receptor activation and cargo recruitment to membranes [12].
We have used comparative genomics to identify and characterize the human homologue of the yeast protein Lsg1. Here, we describe a novel family of GTP binding proteins, which we have named YRG (YlqF Related GTPases). Members of this family contain a central GTPase domain showing a unique circular permutation of the known G motifs of the GTP binding proteins. A phylogenetic analysis was used for cross-species comparisons, focusing on sub-cellular localization, cell viability and the known functions of each subfamily member. This analysis showed that YRG family members are essential, have increased in eukaryotes as cell compartmentalization has evolved, and show functional conservation in relation to rRNA maturation.
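The notion of a circular permutation of the G motifs can be made concrete with a short sketch. It assumes the canonical motif order G1-G2-G3-G4-G5 and, purely as an illustration, the order G4-G5-G1-G2-G3 that is commonly reported for circularly permuted GTPases; this section does not itself state the permuted order, and the function names are hypothetical.

```python
# Canonical order of the G motifs in a GTPase domain (assumed here).
CANONICAL = ["G1", "G2", "G3", "G4", "G5"]


def is_circular_permutation(order, reference=CANONICAL):
    """Return True if `order` is a rotation of `reference`."""
    order = list(order)
    if len(order) != len(reference) or set(order) != set(reference):
        return False
    # Every rotation of `reference` appears as a window of the doubled list.
    doubled = reference + reference
    n = len(reference)
    return any(doubled[i:i + n] == order for i in range(n))


def rotation_offset(order, reference=CANONICAL):
    """Return the index in `reference` where `order` starts, or None."""
    if not is_circular_permutation(order, reference):
        return None
    return reference.index(list(order)[0])


# Illustrative motif order only (an assumption, not taken from this section).
permuted = ["G4", "G5", "G1", "G2", "G3"]
print(is_circular_permutation(permuted))
print(rotation_offset(permuted))
```

A motif order that merely swaps two neighbours, such as G1-G3-G2-G4-G5, is not a rotation of the canonical order and is rejected by the same check.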