Protein–DNA binding specificity predictions with structural models
Alexandre V. Morozov*, James J. Havranek¹, David Baker¹ and Eric D. Siggia
Center for Studies in Physics and Biology, The Rockefeller University, 1230 York Avenue, New York, NY 10021, USA; ¹Department of Biochemistry, University of Washington, Box 357350, Seattle, WA 98195-7350, USA
*To whom correspondence should be addressed. Tel: +1 212 327 8139; Fax: +1 212 327 8544; Email: firstname.lastname@example.org
Received July 13, 2005. Revised September 13, 2005. Accepted September 13, 2005.
Protein–DNA interactions play a central role in transcriptional regulation and other biological processes. Understanding the mechanisms that determine binding affinity and specificity in protein–DNA complexes is thus an important goal. Here we develop a simple physical energy function that models direct readout using electrostatics, solvation, hydrogen-bond and atom-packing terms, and models indirect readout of the DNA sequence by the bound protein using a sequence-specific DNA conformational energy. The predictive capability of this model is tested against a second model based only on knowledge of the consensus sequence and the number of contacts between amino acids and DNA bases. Both models are used to predict protein–DNA binding affinities, which are then compared with experimental measurements. Because protein–DNA interaction energies in our model are nearly additive, we can construct position-specific weight matrices by computing base pair probabilities independently for each position in the binding site. Our approach is less data intensive than knowledge-based models of protein–DNA interactions and is not limited to any specific family of transcription factors. However, native structures of protein–DNA complexes or of their close homologs are required as input to the model. Homology modeling can significantly extend the applicability of our approach, making it a useful tool for studying regulatory pathways in many organisms and cell types.
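As an illustration of how additive energies could yield a position-specific weight matrix, the following minimal Python sketch converts per-position, per-base binding energies into base probabilities with Boltzmann weights. It is not the authors' implementation: the function names (`weight_matrix`, `sequence_probability`), the energy values, the kcal/mol units and the temperature are all assumptions chosen for illustration.

```python
import math

# Assumption: energies are in kcal/mol and kT is taken at about 298 K.
KT = 0.593
BASES = "ACGT"


def weight_matrix(energies, kt=KT):
    """Turn per-position energies into per-position base probabilities.

    energies: list with one dict per binding-site position, mapping each
    base to a (hypothetical) binding energy. Lower energy = tighter binding.
    """
    pwm = []
    for position in energies:
        # Shift by the minimum energy for numerical stability; this does
        # not change the normalized probabilities.
        e_min = min(position[b] for b in BASES)
        weights = {b: math.exp(-(position[b] - e_min) / kt) for b in BASES}
        z = sum(weights.values())
        pwm.append({b: w / z for b, w in weights.items()})
    return pwm


def sequence_probability(pwm, seq):
    """Probability of a full site, assuming positions are independent."""
    p = 1.0
    for column, base in zip(pwm, seq):
        p *= column[base]
    return p


# Made-up energies for a three-position site (illustrative only).
example_energies = [
    {"A": 0.0, "C": 1.2, "G": 0.8, "T": 1.5},
    {"A": 1.0, "C": 0.0, "G": 1.4, "T": 0.9},
    {"A": 0.7, "C": 0.9, "G": 0.0, "T": 0.2},
]

pwm = weight_matrix(example_energies)
consensus = "".join(max(column, key=column.get) for column in pwm)
```

Treating each position independently is what makes the matrix cheap to build: the cost grows linearly with site length rather than with the number of possible sequences. The sketch holds only to the extent that the additivity assumption does.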
Nucleic Acids Research 2005 33(18):5781-5798. Published by Oxford University Press. The online version of this article has been published under an open access model.