Nucleic Acids Research Advance Access published online on October 1, 2008
Nucleic Acids Research, doi:10.1093/nar/gkn589
Computational Biology
Predicting transcription factor specificity with all-atom models
1Department of Physics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA, 2Staudinger Weg 7, Institut für Physik, 55099 Mainz, Germany and 3Harvard-MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
*To whom correspondence should be addressed. Tel: +49 6131 392 3646; Fax: +49 6131 392 5441; Email: virnau@uni-mainz.de
Received May 4, 2008. Revised August 29, 2008. Accepted September 2, 2008.
The binding of a transcription factor (TF) to a DNA operator site can initiate or repress the expression of a gene. Computational prediction of the sites recognized by a TF has traditionally relied on knowledge of several cognate sites rather than on an ab initio approach. Here, we examine the possibility of using structure-based energy calculations that require no knowledge of bound sites but instead start from the structure of a protein–DNA complex. We study the Escherichia coli TF PurR and explore to what extent atomistic models of protein–DNA complexes can be used to distinguish between cognate and noncognate DNA sites. Particular emphasis is placed on systematic evaluation of this approach, both by comparing its performance with bioinformatic methods and by testing it against random decoys and sites of homologous TFs. We also examine a set of experimental mutations in both the DNA and the protein. Using our explicit estimates of energy, we show that the specificity of PurR is dominated by direct protein–DNA interactions and is only weakly influenced by DNA bending.