Nucleic Acids Research Advance Access originally published online on September 1, 2009
Nucleic Acids Research 2009 37(18):6184-6193; doi:10.1093/nar/gkp600
Nucleic Acids Research, 2009, Vol. 37, No. 18 6184-6193
© The Author(s) 2009. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
RNA
Accurate and efficient reconstruction of deep phylogenies from structured RNAs
1 Zoologisches Forschungsmuseum Alexander Koenig, Bonn, 2 Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, 3 UHH Biozentrum Grindel & Zoologisches Museum, Hamburg, 4 Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, D-04103 Leipzig, 5 RNomics Group, Fraunhofer Institut für Zelltherapie und Immunologie, Deutscher Platz 5e, D-04103 Leipzig, Germany, 6 Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria and 7 Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
*To whom correspondence should be addressed. Tel: +49 341 97 16686; Fax: +49 341 97 16679; Email: jana{at}bioinf.uni-leipzig.de
Correspondence may also be addressed to Roman R. Stocsits. Tel: +49 228 9122 352; Fax: +49 228 9122 295; Email: roman.stocsits{at}gmail.com
Received March 25, 2009. Revised June 29, 2009. Accepted July 1, 2009.
Ribosomal RNA (rRNA) genes are probably the most frequently used data source in phylogenetic reconstruction. Because rRNA secondary structures are highly conserved, the individual columns of rRNA alignments are not independent. Unless explicitly taken into account, these correlations can distort the phylogenetic signal and/or lead to gross overestimates of tree stability. Maximum likelihood and Bayesian approaches are amenable to RNA-specific substitution models that treat conserved base pairs appropriately, but they require accurate secondary structure models as input. So far, however, no accurate and easy-to-use tool has been available for computing structure-aware alignments and consensus structures that can deal with the large rRNAs. The RNAsalsa approach is designed to fill this gap. Capitalizing on the improved accuracy of pairwise consensus structures and informed by a priori knowledge of group-specific structural constraints, the tool provides both alignments and consensus structures that are of sufficient accuracy for routine phylogenetic analysis based on RNA-specific substitution models. The power of the approach is demonstrated using two rRNA data sets: a mitochondrial rRNA set of 26 Mammalia, and a collection of 28S nuclear rRNAs representative of the five major echinoderm groups.
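To illustrate why a consensus structure is needed as input for RNA-specific substitution models, the following minimal sketch splits alignment columns into paired (stem) and unpaired (loop) positions. It assumes the consensus structure is given in dot-bracket notation without pseudoknots; the function names `paired_columns` and `partition_columns` and the example string are hypothetical and are not taken from RNAsalsa.

```python
def paired_columns(structure):
    """Return sorted (i, j) column pairs from a dot-bracket string.

    Assumption: '(' and ')' mark paired columns, any other
    character marks an unpaired column; no pseudoknots.
    """
    stack = []
    pairs = []
    for col, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(col)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at column {col}")
            # The most recent open bracket pairs with this column.
            pairs.append((stack.pop(), col))
    if stack:
        raise ValueError(f"unmatched '(' at column {stack[-1]}")
    return sorted(pairs)


def partition_columns(structure):
    """Split columns into stem pairs and unpaired (loop) columns."""
    pairs = paired_columns(structure)
    in_stem = {col for pair in pairs for col in pair}
    loops = [col for col in range(len(structure)) if col not in in_stem]
    return pairs, loops


# Hypothetical consensus structure for a 15-column alignment.
consensus = "((..((...))..))"
stems, loops = partition_columns(consensus)
print("stem pairs:", stems)
print("loop columns:", loops)
```

In a phylogenetic analysis, the stem pairs would typically be assigned to a partition evaluated under a paired-site (doublet) model, while the loop columns would be evaluated under a standard single-nucleotide model. An inaccurate consensus structure misassigns columns between the two partitions, which is why the quality of the structure model matters.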