© 2006 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
VOMBAT: prediction of transcription factor binding sites using variable order Bayesian trees
Institute of Computer Science, University Halle, 06099 Halle (Saale), Germany
1 Department of Industrial Engineering, Tel-Aviv University, Tel-Aviv 69978, Israel
2 Leibniz-Institute of Plant Genetics and Crop Plant Research (IPK), 06466 Gatersleben, Germany
*To whom correspondence should be addressed. Tel: +49 39482 5755; Fax: +49 39482 5357; Email: grosse@ipk-gatersleben.de
Received February 15, 2006. Revised March 24, 2006. Accepted March 24, 2006.
Variable order Markov models and variable order Bayesian trees have been proposed for the recognition of transcription factor binding sites, and they have been shown to outperform traditional models such as position weight matrices, Markov models and Bayesian trees. We have developed a web server for the recognition of DNA binding sites based on variable order Markov models and variable order Bayesian trees. It offers the following functionality: (i) given datasets with annotated binding sites and genomic background sequences, variable order Markov models and variable order Bayesian trees can be trained; (ii) given a set of trained models, putative DNA binding sites can be predicted in a given set of genomic sequences; and (iii) given a dataset with annotated binding sites and a dataset with genomic background sequences, cross-validation experiments for different model combinations with different parameter settings can be performed. Several of the offered services are computationally demanding, such as genome-wide predictions of DNA binding sites in mammalian genomes or sets of 10^4-fold cross-validation experiments for different model combinations based on problem-specific datasets. To execute these jobs and to serve multiple users at the same time, the web server is attached to a Linux cluster with 150 processors. VOMBAT is available at http://pdw-24.ipk-gatersleben.de:8080/VOMBAT/.
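The train-then-predict workflow in (i) and (ii) can be illustrated with a minimal sketch. It is not VOMBAT's implementation: it uses the simplest of the traditional models named above, a position weight matrix, together with a single-base background model, rather than variable order Markov models or variable order Bayesian trees. All function names, the pseudocount and the threshold are hypothetical choices made for illustration only.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def train_pwm(sites, pseudocount=1.0):
    """Estimate per-position base probabilities from aligned, equal-length sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all binding sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for pos in range(width):
        counts = Counter(site[pos] for site in sites)
        pwm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return pwm


def train_background(sequences, pseudocount=1.0):
    """Estimate position-independent base probabilities from background sequences."""
    counts = Counter(b for seq in sequences for b in seq if b in ALPHABET)
    total = sum(counts.values()) + pseudocount * len(ALPHABET)
    return {b: (counts[b] + pseudocount) / total for b in ALPHABET}


def scan(sequence, pwm, background, threshold=0.0):
    """Return (start, window, log-odds score) for each window scoring above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(b not in ALPHABET for b in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(
            math.log(pwm[i][b] / background[b]) for i, b in enumerate(window)
        )
        if score > threshold:
            hits.append((start, window, score))
    return hits


# Toy data, invented for this example only.
pwm = train_pwm(["TATAAT", "TATAAT", "TATAAT", "TATAAT"])
background = train_background(["ACGTACGTACGT"])
for start, window, score in scan("GGGTATAATGGG", pwm, background):
    print(start, window, round(score, 3))
```

A variable order model would replace the per-position base probabilities with conditional probabilities whose context length varies by position. The scoring step, a log-likelihood ratio between the binding site model and the background model, would keep the same shape.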