Nucleic Acids Research Advance Access originally published online on May 8, 2007
Nucleic Acids Research 2007 35(Web Server issue):W639-W644; doi:10.1093/nar/gkm275
Nucleic Acids Research, 2007, Vol. 35, No. suppl_2 W639-W644
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
RE-MuSiC: a tool for multiple sequence alignment with regular expression constraints
1Department of Computer Science, National Tsing Hua University, Hsinchu 300, Taiwan, 2Institute of Bioinformatics, National Chiao Tung University, Hsinchu 300, Taiwan and 3Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan
*To whom correspondence should be addressed. Tel: +886-3-5712121; Fax: +886-3-5729288; Email: cllu{at}mail.nctu.edu.tw
Received January 31, 2007. Revised April 6, 2007. Accepted April 11, 2007.
RE-MuSiC is a web-based multiple sequence alignment tool that can incorporate biological knowledge about the structure, function, or conserved patterns of the sequences of interest. It takes amino acid or nucleic acid sequences and a set of constraints as input. The constraints are pattern descriptions rather than exact positions of the fragments to be aligned together. The output is an alignment in which, for each pattern (constraint), an occurrence in each sequence is aligned with the occurrences in the other sequences, such that the overall alignment is optimized. Its predecessor, MuSiC, has been found useful by researchers since its release in 2004. However, applications have shown that the pattern formulation adopted in MuSiC, namely plain strings allowing mismatches, is not expressive or flexible enough. The constraint formulation in RE-MuSiC has therefore been extended to regular expressions, which are convenient for expressing many biologically significant patterns, such as those collected in the PROSITE database, or structural consensuses that often involve variable ranges between conserved parts. Experiments demonstrate that RE-MuSiC can be used to help predict important residues and locate phylogenetically conserved structural elements. RE-MuSiC is available online at http://140.113.239.131/RE-MUSIC.
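To make the idea of a regular expression constraint concrete, the following minimal Python sketch translates a PROSITE-style pattern into a regular expression and lists where it occurs in each input sequence. It is an illustration only, not RE-MuSiC's algorithm: the tool chooses one occurrence per sequence so that the overall alignment is optimized, whereas this sketch merely enumerates candidate occurrences. The function names `prosite_to_regex` and `find_occurrences`, the example pattern, and the toy sequences are all hypothetical, and the constraint syntax that the RE-MuSiC server itself accepts is not assumed here.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles 'x' (any residue), [ABC] (allowed set), {ABC} (forbidden set),
    repeat counts such as (3) or (2,4), and the '<' / '>' terminal anchors.
    """
    pattern = pattern.strip().rstrip(".")  # drop the optional trailing period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchored at the start of the sequence
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchored at the end of the sequence
            suffix, element = "$", element[:-1]
        # Separate the residue specification from an optional repeat count.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"cannot parse pattern element: {element!r}")
        core, lo, hi = m.groups()
        if core == "x":
            core_re = "."
        elif core.startswith("["):
            core_re = core  # already a valid character class
        elif core.startswith("{"):
            core_re = "[^" + core[1:-1] + "]"  # negated character class
        else:
            core_re = re.escape(core)
        if lo is None:
            repeat = ""
        elif hi is None:
            repeat = "{" + lo + "}"
        else:
            repeat = "{" + lo + "," + hi + "}"
        parts.append(prefix + core_re + repeat + suffix)
    return "".join(parts)


def find_occurrences(sequences: dict, regex: str) -> dict:
    """Return, per sequence name, the (start, end) spans of non-overlapping matches."""
    compiled = re.compile(regex)
    return {
        name: [(m.start(), m.end()) for m in compiled.finditer(seq)]
        for name, seq in sequences.items()
    }


# Illustrative pattern with variable ranges between conserved residues.
constraint = prosite_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H")

# Toy sequences (hypothetical, for demonstration only).
toy_sequences = {
    "seq1": "MKCAACAAALAAAAAAAAHAAAH",
    "seq2": "GGGCAAACAAAFAAAAAAAAHAAAAHGG",
}
print(constraint)
print(find_occurrences(toy_sequences, constraint))
```

Note that `re.finditer` reports only non-overlapping matches, scanning from left to right. A constrained aligner would have to consider every possible occurrence of the pattern in each sequence, so this listing understates the search space.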