Nucleic Acids Research Advance Access originally published online on January 7, 2009
Nucleic Acids Research 2009 37(5):1378-1386; doi:10.1093/nar/gkn987
Nucleic Acids Research, 2009, Vol. 37, No. 5 1378-1386
© 2009 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Computational Biology
RNA structure prediction from evolutionary patterns of nucleotide composition
1Centre for Integrative Bioinformatics VU (IBIVU), Vrije Universiteit, 1081 HV Amsterdam, 2Centre for Medical Systems Biology, Niels Bohrweg 2, 2300 RA Leiden, The Netherlands and 3Department of Chemistry and Biochemistry, University of Colorado, Boulder CO 80309, USA
*To whom correspondence should be addressed. Tel: +31 020 5983714; Fax: +31 020 5987653; Email: S.Smit{at}few.vu.nl
Received September 11, 2008. Revised November 21, 2008. Accepted November 21, 2008.
Structural elements in RNA molecules have a distinct nucleotide composition, which changes gradually over evolutionary time. We discovered certain features of these compositional patterns that are shared between all RNA families. Based on this information, we developed a structure prediction method that evaluates candidate structures for a set of homologous RNAs on their ability to reproduce the patterns exhibited by biological structures. The method is named SPuNC, for Structure Prediction using Nucleotide Composition. In a performance test on a diverse set of RNA families, we demonstrate that the SPuNC algorithm succeeds in selecting the most realistic structures in an ensemble. The average accuracy of top-scoring structures is significantly higher than the average accuracy of all ensemble members (improvements of more than 20% were observed). In addition, a consensus structure that includes the most reliable base pairs gleaned from a set of top-scoring structures is generally more accurate than a consensus derived from the full structural ensemble. Our method achieves better accuracy than existing methods on several RNA families, including novel riboswitches and ribozymes. The results clearly show that nucleotide composition can be used to reveal the quality of RNA structures, and thus the presented technique should be added to the set of prediction tools.
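As an illustration only, and not the SPuNC scoring scheme itself, the two ingredients named in the abstract (the nucleotide composition of structural elements, and a consensus of base pairs drawn from a set of structures) can be sketched in Python. The dot-bracket input format, the function names, and the `min_fraction` threshold are all assumptions made for this sketch.

```python
from collections import Counter


def parse_pairs(dot_bracket):
    """Return the set of (i, j) base pairs in a dot-bracket string (0-based)."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def composition(seq, dot_bracket, paired=True):
    """Nucleotide frequencies over the paired (or unpaired) positions."""
    wanted = "()" if paired else "."
    counts = Counter(b for b, s in zip(seq, dot_bracket) if s in wanted)
    total = sum(counts.values())
    return {b: counts[b] / total for b in "ACGU"} if total else {}


def consensus_pairs(structures, min_fraction=0.5):
    """Base pairs present in at least min_fraction of the given structures.

    Hypothetical stand-in for 'the most reliable base pairs'; the
    threshold is an assumption, not a value taken from the paper.
    """
    counts = Counter(p for s in structures for p in parse_pairs(s))
    n = len(structures)
    return {p for p, c in counts.items() if c / n >= min_fraction}
```

For example, `composition("GGAACC", "((..))")` reports how often each nucleotide occurs in the helix, and `consensus_pairs` applied to a few top-ranked candidate structures keeps only the pairs that most of them agree on.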