Magnesium is one of the most versatile metal cofactors in cellular biochemistry, serving both intracellular and extracellular, catalytic and/or structural roles. It stabilizes a variety of protein structures, e.g., the interface of the ribonucleotide reductase subunits. It also stabilizes nucleic acids by alleviating electrostatic repulsion between negatively charged phosphates. Furthermore, Mg2+, together with Ca2+, stabilizes biological membranes by charge neutralization after binding to the carboxylated and phosphorylated headgroups of lipids. It also activates enzymes that regulate the biochemistry of nucleic acids, such as restriction nucleases, ligases, and topoisomerases, and is essential for the fidelity of DNA replication. Divalent Mg2+ is a "hard" ion and prefers "hard" ligands of low polarizability like oxygen. It tends to bind directly to the amino acid residues, primarily to the Asp/Glu carboxylic side chains, followed by the Asn/Gln side chains or peptide backbone carbonyl groups. The rest of the metal coordination sphere, which is usually octahedral, is complemented by water ligand(s).
Unlike Zn2+- and Ca2+-binding sites, only a few, relatively short, sequence motifs have been discovered for Mg2+ proteins with close sequence homology. These include the -NADFDGD- motif, found in different RNA polymerases, DNA Pol I, and HIV reverse transcriptase, and the -YXDD- or -LXDD- motifs in reverse transcriptase and telomerase, where residues in bold are the Mg2+ ligands. Because the known Mg2+ sequence motifs are short, they could easily be found in other non-Mg2+ proteins and would not be expected to be Mg2+-specific. Interestingly, some homology in the 3D structure of the Mg2+-binding sites has been observed for certain classes of enzymes such as restriction enzymes, bacterial and viral RNase H domains, and viral integrases. However, systematic studies of the structural preference/conservation of Mg2+-binding sites in nonhomologous proteins have not been reported; hence, no structural motifs of the Mg2+-binding sites have been extracted.
The aims of this work are to address the following intriguing questions: (1) Do Mg2+-binding sites exhibit any preference for certain local/secondary structures? If so, which types of local/secondary structures are favored and which are disfavored? (2) Even when the Mg2+-proteins share no significant sequence homology, do they share a similar Mg2+-binding site structure? (3) If structural motifs of the Mg2+-binding sites exist, do they map to specific protein functions? (4) Are the structural motifs Mg2+-specific? In particular, are they found in proteins that do not bind metal ions or in proteins that bind Ca2+, which, like Mg2+, is also a divalent "hard" ion, binding preferentially to "hard" oxygen-containing ligands?
To address the aforementioned questions, we have developed a general strategy for discovering 3D motifs that are hidden in the local structure of the active/binding site, based on the fact that the local structure is generally more evolutionarily conserved than the amino acid sequence. The 3D motifs of the metal-binding sites were obtained by encoding the 3D representation based on Cartesian coordinates into a 1D representation based on a 16-letter structural alphabet [6,7]. The structural alphabet represents recurring short structural prototypes and has been used to (i) compare/analyze 3D structures [8-10], (ii) predict protein 3D structures from amino acid sequences [6,11], (iii) reconstruct the protein backbone, and (iv) model loops. However, it has not been used to discover structural motifs of metal/ligand-binding sites in proteins. First, the structural-alphabet-based motif discovery approach was validated by using it to "rediscover" the structural motif of Cys4 Zn-finger domains, which are known to adopt a specific structure. Next, it was used to discover structural motifs of Mg2+-binding sites in a set of nonredundant Mg2+-proteins sharing 2+-binding sites, 4 Mg2+-structural motifs, and important relationships between these motifs and other features of the proteins. The specificity of the structural motifs discovered for certain Mg2+-binding sites was assessed by determining their occurrence in a set of nonredundant non-metal containing protein structures and in a set of nonredundant Ca2+-bound protein structures.
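As a rough illustration of the 3D-to-1D encoding and the specificity check described above, the following Python sketch assigns each local backbone fragment to its nearest prototype letter and then counts how often a candidate motif string occurs in a set of encoded structures. It is not the implementation used in this work. Several things are assumptions made purely for illustration: fragments are described here by a short vector of backbone dihedral angles and compared by root-mean-square angular deviation; the three-letter `TOY_PROTOTYPES` table holds made-up values and stands in for the published 16-letter alphabet; and the function names `encode` and `motif_frequency` are hypothetical.

```python
import math

# Hypothetical toy prototypes (letter -> dihedral angles in degrees).
# These are NOT the published 16-letter alphabet; the values are illustrative.
TOY_PROTOTYPES = {
    "a": (-60.0, -45.0),
    "b": (-120.0, 130.0),
    "c": (60.0, 40.0),
}


def angular_diff(x, y):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(x - y) % 360.0
    return min(d, 360.0 - d)


def rmsda(v, w):
    """Root-mean-square angular deviation between two angle vectors."""
    return math.sqrt(sum(angular_diff(x, y) ** 2 for x, y in zip(v, w)) / len(v))


def encode(fragments, prototypes=TOY_PROTOTYPES):
    """Map each fragment to the letter of its nearest prototype (1D string)."""
    return "".join(
        min(prototypes, key=lambda letter: rmsda(frag, prototypes[letter]))
        for frag in fragments
    )


def motif_frequency(motif, encoded_structures):
    """Fraction of encoded structures containing the motif at least once."""
    if not encoded_structures:
        return 0.0
    hits = sum(1 for s in encoded_structures if motif in s)
    return hits / len(encoded_structures)
```

Under these assumptions, a motif found in an encoded set of Mg2+-binding sites could be passed to `motif_frequency` together with an encoded non-metal or Ca2+-bound set; a low frequency in those reference sets would suggest that the motif is specific. Exact substring matching is the simplest possible choice here and ignores near-matches between structurally similar letters.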