Nucleic Acids Research 2005 33(2):511-518; doi:10.1093/nar/gki198

Published online 20 January 2005

© 2005, the authors
Nucleic Acids Research, Vol. 33 No. 2 © Oxford University Press 2005; all rights reserved
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use permissions, please contact journals.permissions@oupjournals.org.


Article

MAFFT version 5: improvement in accuracy of multiple sequence alignment

Kazutaka Katoh1,*, Kei-ichi Kuma1, Hiroyuki Toh1 and Takashi Miyata2,3,4

1 Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
2 Biohistory Research Hall, Takatsuki, Osaka 569-1125, Japan
3 Department of Electrical Engineering and Bioscience, Science and Engineering, Waseda University, Tokyo 169-8555, Japan
4 Department of Biophysics, Graduate School of Science, Kyoto University, Kyoto 606-8502, Japan

*To whom correspondence should be addressed. Tel: +81 774 38 3119; Fax: +81 774 38 3059; Email: kkatoh@kuicr.kyoto-u.ac.jp

Received October 14, 2004. Revised November 16, 2004. Accepted December 29, 2004.

The accuracy of the multiple sequence alignment program MAFFT has been improved. The new version (5.3) of MAFFT offers new iterative refinement options, H-INS-i, F-INS-i and G-INS-i, in which pairwise alignment information is incorporated into the objective function. These new options showed higher accuracy than currently available methods, including TCoffee version 2 and CLUSTAL W, in benchmark tests consisting of alignments of >50 sequences. Like the previously available options, the new options can handle hundreds of sequences on a standard desktop computer. We also examined the effect of the number of homologues included in an alignment. For a multiple alignment consisting of ~8 sequences with low similarity, accuracy improved by 2–10 percentage points when the sequences were aligned together with dozens of their close homologues (E-value < 10^-5 to 10^-20) collected from a database. Such improvement was generally observed for most methods, but was remarkably large for the new options of MAFFT proposed here. We therefore wrote a Ruby script, mafftE.rb, which aligns the input sequences together with their close homologues collected from SwissProt using NCBI-BLAST.
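The homologue-collection step described above can be illustrated with a minimal Python sketch. It is not the authors' mafftE.rb script. It assumes the BLAST hits are available in the standard 12-column tabular format (E-value in the 11th column), and the function name `select_close_homologues`, the default threshold of 1e-5 (one end of the range quoted above) and the cap of 50 hits (standing in for "dozens") are all hypothetical choices made for illustration.

```python
from typing import Dict, Iterable, List


def select_close_homologues(
    blast_lines: Iterable[str],
    max_evalue: float = 1e-5,   # assumed threshold; the abstract quotes 10^-5 to 10^-20
    max_hits: int = 50,         # assumed cap standing in for "dozens" of homologues
) -> List[str]:
    """Return subject IDs whose best E-value is below max_evalue, best hit first."""
    best: Dict[str, float] = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a 12-column tabular hit line
        subject = fields[1]
        evalue = float(fields[10])
        if evalue < max_evalue:
            # keep only the smallest E-value seen for each subject sequence
            if subject not in best or evalue < best[subject]:
                best[subject] = evalue
    ranked = sorted(best, key=lambda s: best[s])
    return ranked[:max_hits]
```

The selected identifiers would then be used to fetch the homologous sequences, which are appended to the input sequences before the whole set is aligned. Tightening `max_evalue` towards 1e-20 restricts the set to closer homologues.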

