Nucleic Acids Research Advance Access originally published online on January 3, 2007
Nucleic Acids Research 2007 35(3):e20; doi:10.1093/nar/gkl1062
Nucleic Acids Research, 2007, Vol. 35, No. 3 e20
© 2006 The Author(s).
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Methods Online
Positional clustering improves computational binding site detection and identifies novel cis-regulatory sites in mammalian GABAA receptor subunit genes
1 Bioinformatics Program, Boston University, 24 Cummington Street, Boston, MA 02215, USA
2 Laboratory of Molecular Neurobiology, Department of Pharmacology and Experimental Therapeutics, Boston University School of Medicine, 715 Albany St., Boston, MA 02118, USA
3 Program in BioMedical Neuroscience, Boston University, 44 Cummington Street, Boston, MA 02215, USA
4 Biomedical Engineering, Boston University, 44 Cummington Street, Boston, MA 02215, USA
*To whom correspondence should be addressed. Tel: +1 617 353 1122; Fax: +1 617 353 3333; Email: delisi{at}bu.edu
Received June 13, 2006. Revised October 18, 2006. Accepted November 20, 2006.
Understanding transcription factor (TF)-mediated control of gene expression remains a major challenge at the interface of computational and experimental biology. Computational techniques for predicting TF-binding site (TFBS) specificity are frequently unreliable, while comprehensive experimental validation is difficult and time-consuming. We introduce a simple strategy that dramatically improves the robustness and accuracy of computational binding site prediction. First, we evaluate the rate of recurrence of computational TFBS predictions made by commonly used sampling procedures and find that the vast majority of results are biologically meaningless. However, clustering the results by nucleotide position improves predictive power. Positional clustering also increases robustness to long or imperfectly selected input sequences, and it can be used to integrate results from multiple sampling approaches, improving accuracy over any one approach alone. Finally, we predict and validate regulatory sequences partially responsible for transcriptional control of the mammalian type A γ-aminobutyric acid receptor (GABAAR) subunit genes. Positional clustering is therefore useful for improving computational binding site predictions, with potential application to understanding mammalian gene expression. In particular, the predicted regulatory mechanisms in the mammalian GABAAR subunit gene family may open new avenues of research into this pharmacologically important neurotransmitter receptor system.
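The idea of grouping predictions by nucleotide position can be illustrated with a minimal sketch. The abstract does not specify the clustering procedure, so everything here is an assumption made for illustration: predictions are taken to be half-open `(start, end)` intervals on a single input sequence, overlapping intervals are merged greedily, and each cluster is scored by the number of predictions supporting it. The names `cluster_by_position`, `min_overlap`, and `support` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Assumed representation: a predicted site is a half-open interval
# (start, end) on one input sequence.
Site = Tuple[int, int]


def cluster_by_position(sites: List[Site], min_overlap: int = 1) -> List[Dict[str, int]]:
    """Merge predictions that overlap on the sequence (illustrative only).

    Sites are visited in order of start coordinate. A site joins the
    current cluster if it overlaps that cluster by at least `min_overlap`
    nucleotides; otherwise it starts a new cluster. Clusters are returned
    sorted by how many predictions support them, most supported first.
    """
    clusters: List[Dict[str, int]] = []
    for start, end in sorted(sites):
        # Because sites are sorted by start, the overlap with the current
        # cluster is min(end, cluster_end) - start.
        if clusters and min(end, clusters[-1]["end"]) - start >= min_overlap:
            current = clusters[-1]
            current["end"] = max(current["end"], end)
            current["support"] += 1
        else:
            clusters.append({"start": start, "end": end, "support": 1})
    return sorted(clusters, key=lambda c: c["support"], reverse=True)


# Hypothetical predictions pooled from several sampler runs (made-up numbers).
predictions = [(100, 112), (102, 114), (101, 113), (250, 262), (400, 412), (103, 115)]
ranked = cluster_by_position(predictions)
```

In this toy input, four of the six predictions fall on nearly the same stretch of sequence and collapse into one well-supported cluster, while the two isolated predictions remain singletons. Pooling the outputs of different sampling programs into the same `sites` list is one plausible way to combine multiple approaches, again under the assumptions stated above.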
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.