Nucleic Acids Research, 2002, Vol. 30, No. 11 2478-2483
© 2002 Oxford University Press
Fast algorithms for large-scale genome alignment and comparison
¹Department of Computer Science, Loyola College in Maryland, Baltimore, MD 21210, USA, ²Celera Genomics, 45 West Gude Drive, Rockville, MD 20850, USA, ³The Institute for Genomic Research, Rockville, MD 20850, USA and ⁴Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
We describe a suffix-tree algorithm that can align the entire genome sequences of eukaryotic and prokaryotic organisms with minimal use of computer time and memory. The new system, MUMmer 2, runs three times faster than the original MUMmer system while using one-third as much memory. It has been used successfully to align the entire human and mouse genomes to each other, and to align numerous smaller eukaryotic and prokaryotic genomes. A new module permits the alignment of multiple DNA sequence fragments, which has proven valuable in the comparison of incomplete genome sequences. We also describe a method to align more distantly related genomes by detecting protein sequence homology. This extension to MUMmer aligns two genomes after translating their sequences in all six reading frames, extracts all matching protein sequences and then clusters the matches together. This method has been applied to both incomplete and complete genome sequences in order to detect regions of conserved synteny, in which multiple proteins from one organism are found in the same order and orientation in another. The system code is being made freely available by the authors.
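As an illustration of the six-frame translation step mentioned in the abstract, the following is a minimal Python sketch. It is not the MUMmer code: the function names (`reverse_complement`, `translate`, `six_frame_translations`) are hypothetical, the standard genetic code is assumed, and any codon containing a non-ACGT character is reported as `X`. The matching and clustering steps are not shown.

```python
# Minimal sketch of six-frame translation (illustrative only, not MUMmer's code).
# Assumes the standard genetic code; names below are hypothetical.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Codons enumerated in TCAG order line up with the amino-acid string above.
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq: str) -> dict:
    """Map frame labels +1..+3 (forward) and -1..-3 (reverse) to proteins."""
    seq = seq.upper()
    rev = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rev[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in sorted(six_frame_translations("ATGGCCTAA").items()):
        print(frame, protein)
```

Translating both genomes this way yields six protein-level strings per sequence, so matches can be sought between amino-acid sequences, which tend to be better conserved than the underlying DNA in distantly related organisms.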
* To whom correspondence should be addressed at: The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA. Tel: +1 301 315 2537; Fax: +1 301 838 0208; Email: salzberg{at}tigr.org














