Published online 16 December 2005
Automatic assessment of alignment quality
Center for Genomics and Bioinformatics, Karolinska Institutet S-17177 Stockholm, Sweden
*To whom correspondence should be addressed. Tel: +46 8 5248 6372; Fax: +46 8 337983; Email: timo.lassmann{at}cgb.ki.se
Received September 8, 2005. Revised October 21, 2005. Accepted November 30, 2005.
Multiple sequence alignments play a central role in the annotation of novel genomes. Given the biological and computational complexity of this task, the automatic generation of high-quality alignments remains challenging. Since multiple alignments are usually employed at the very start of data analysis pipelines, it is crucial to ensure high alignment quality. We describe a simple, yet elegant, solution to assess the biological accuracy of alignments automatically. Our approach is based on the comparison of several alignments of the same sequences. We introduce two functions to compare alignments: the average overlap score and the multiple overlap score. The former identifies difficult alignment cases by expressing the similarity among several alignments, while the latter estimates the biological correctness of individual alignments. We implemented both functions in the MUMSA program and demonstrate their overall robustness and accuracy on three large benchmark sets.
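The abstract does not give formulas for the two scores, so the following Python sketch only illustrates the general idea of comparing several alignments of the same sequences. It assumes that two alignments are compared by the aligned residue pairs they share, that the average overlap score is the mean pairwise overlap across all alignments, and that the multiple overlap score of one alignment is the average support its residue pairs receive from the other alignments. These definitions and all function names are assumptions made for illustration; they are not taken from the MUMSA program.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of aligned residue pairs in one alignment.

    `alignment` is a list of equal-length gapped strings ('-' marks a gap).
    A pair is ((seq_a, residue_index_a), (seq_b, residue_index_b)).
    """
    counters = [0] * len(alignment)  # next residue index per sequence
    pairs = set()
    for col in range(len(alignment[0])):
        present = []
        for seq, row in enumerate(alignment):
            if row[col] != "-":
                present.append((seq, counters[seq]))
                counters[seq] += 1
        # Every two residues sharing a column form one aligned pair.
        pairs.update(combinations(present, 2))
    return pairs


def overlap_score(a, b):
    """Assumed pairwise overlap: shared pairs over the mean pair count."""
    pa, pb = aligned_pairs(a), aligned_pairs(b)
    total = len(pa) + len(pb)
    return 2 * len(pa & pb) / total if total else 1.0


def average_overlap_score(alignments):
    """Mean overlap over all pairs of alignments (assumed definition)."""
    scores = [overlap_score(a, b) for a, b in combinations(alignments, 2)]
    return sum(scores) / len(scores)


def multiple_overlap_score(alignments, index):
    """Average support for the pairs of alignments[index] among the others
    (assumed definition)."""
    target = aligned_pairs(alignments[index])
    others = [aligned_pairs(a) for i, a in enumerate(alignments) if i != index]
    if not target or not others:
        return 0.0
    support = sum(sum(pair in other for other in others) for pair in target)
    return support / (len(target) * len(others))


if __name__ == "__main__":
    # Three toy alignments of the same two sequences (hypothetical data).
    toy = [
        ["AC-G", "ACTG"],
        ["ACG-", "ACTG"],
        ["AC-G", "ACTG"],
    ]
    print("average overlap:", average_overlap_score(toy))
    for i in range(len(toy)):
        print("multiple overlap", i, multiple_overlap_score(toy, i))
```

Under these assumed definitions, a low average overlap indicates that the alignment programs disagree and the case is difficult, while an alignment whose residue pairs are widely shared by the others receives a high multiple overlap score.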




