Nucleic Acids Research Advance Access originally published online on March 25, 2009
Nucleic Acids Research 2009 37(10):3276-3287; doi:10.1093/nar/gkp120
Nucleic Acids Research, 2009, Vol. 37, No. 10 3276-3287
© 2009 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Computational Biology
Prediction of novel microRNA genes in cancer-associated genomic regions—a combined computational and experimental approach
1 Institute of Molecular Biology and Biotechnology-FORTH, Heraklion, 2 Department of Biology, University of Crete, Heraklion, 3 Institute of Computer Science-FORTH, Heraklion, 4 Department of Computer Science, University of Crete, Heraklion, Crete and 5 Institute of Oncology, BSRC Alexander Fleming, Athens, Greece
*To whom correspondence should be addressed. Tel: +30 2810 391 139; Fax: +30 2810 391 101; Email: poirazi{at}imbb.forth.gr
Received November 20, 2008. Revised February 10, 2009. Accepted February 11, 2009.
The majority of existing computational tools rely on sequence homology and/or structural similarity to identify novel microRNA (miRNA) genes. More recently, supervised algorithms have been applied to this problem, taking into account sequence, structure and comparative genomics information. In these studies, however, miRNA gene predictions are rarely supported by experimental evidence, and prediction accuracy remains uncertain. In this work, we present a new computational tool (SSCprofiler) that uses a probabilistic method based on Profile Hidden Markov Models to predict novel miRNA precursors. By simultaneously integrating biological features such as sequence, structure and conservation, SSCprofiler achieves 88.95% sensitivity and 84.16% specificity on a large set of human miRNA genes. The trained classifier is used to identify novel miRNA gene candidates located within cancer-associated genomic regions and to rank the resulting predictions using expression information from a full genome tiling array. Finally, four of the top-scoring predictions are verified experimentally using northern blot analysis. Our work combines analytical and experimental techniques to show that SSCprofiler is a highly accurate tool that can be used to identify novel miRNA gene candidates in the human genome. SSCprofiler is freely available as a web service at http://www.imbb.forth.gr/SSCprofiler.html.
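For readers unfamiliar with the two performance figures quoted in the abstract, the following minimal Python sketch shows how sensitivity and specificity are conventionally computed from binary predictions. It is an illustration only and not SSCprofiler's evaluation code: the function name `sensitivity_specificity` and the toy labels are assumptions made for this example and do not come from the paper's data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Label 1 stands for a true miRNA precursor, label 0 for a negative
    (non-miRNA) hairpin. This helper is hypothetical, not part of SSCprofiler.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")

    sensitivity = tp / (tp + fn)  # fraction of true miRNAs that are recovered
    specificity = tn / (tn + fp)  # fraction of negatives correctly rejected
    return sensitivity, specificity


# Toy labels invented for illustration (NOT data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```

Under these definitions, the reported 88.95% sensitivity is the fraction of known miRNA genes that the classifier recovers, and the 84.16% specificity is the fraction of negative examples that it correctly rejects.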