OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups
Departments of Chemistry, Biology and Genetics, Center for Bioinformatics, Penn Genomics Institute, University of Pennsylvania, Philadelphia, PA 19104-6018, USA
*To whom correspondence should be addressed. Tel: +1 215 898 2118; Fax: +1 215 746 6697; Email: droos{at}sas.upenn.edu
Received August 15, 2005. Revised October 20, 2005. Accepted October 20, 2005.
The OrthoMCL database (http://orthomcl.cbil.upenn.edu) houses ortholog group predictions for 55 species, including 16 bacterial and 4 archaeal genomes representing phylogenetically diverse lineages, and most currently available complete eukaryotic genomes: 24 unikonts (12 animals, 9 fungi, microsporidium, Dictyostelium, Entamoeba), 4 plants/algae and 7 apicomplexan parasites. OrthoMCL software was used to cluster proteins based on sequence similarity, using an all-against-all BLAST search of each species' proteome, followed by normalization of inter-species differences, and Markov clustering. A total of 511 797 proteins (81.6% of the total dataset) were clustered into 70 388 ortholog groups. The ortholog database may be queried based on protein or group accession numbers, keyword descriptions or BLAST similarity. Ortholog groups exhibiting specific phyletic patterns may also be identified, using either a graphical interface or a text-based Phyletic Pattern Expression grammar. Information for ortholog groups includes the phyletic profile, the list of member proteins and a multiple sequence alignment, a statistical summary and graphical view of similarities, and a graphical representation of domain architecture. OrthoMCL software, the entire FASTA dataset employed and clustering results are available for download. OrthoMCL-DB provides a centralized warehouse for orthology prediction among multiple species, and will be updated and expanded as additional genome sequence data become available.
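The Markov clustering step mentioned in the abstract can be illustrated with a minimal sketch on a toy similarity graph. This is not the OrthoMCL software: the function name `markov_cluster`, the protein labels, the edge weights, and the inflation value are all assumptions chosen for illustration, and the all-against-all BLAST search and the normalization of inter-species differences are omitted.

```python
def markov_cluster(nodes, edges, inflation=2.0, iterations=50, tol=1e-9):
    """Toy Markov clustering (expansion + inflation) on a weighted graph.

    nodes: list of labels; edges: list of (label_a, label_b, weight).
    All parameter defaults are illustrative assumptions.
    """
    n = len(nodes)
    idx = {name: i for i, name in enumerate(nodes)}

    # Symmetric weight matrix with self-loops on the diagonal.
    m = [[0.0] * n for _ in range(n)]
    for a, b, w in edges:
        m[idx[a]][idx[b]] = w
        m[idx[b]][idx[a]] = w
    for i in range(n):
        m[i][i] = 1.0

    def normalize(mat):
        # Rescale every column so that it sums to 1 (column-stochastic).
        for j in range(n):
            total = sum(mat[i][j] for i in range(n))
            for i in range(n):
                mat[i][j] /= total
        return mat

    m = normalize(m)
    for _ in range(iterations):
        # Expansion: square the matrix (two-step random walks).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n))
                     for j in range(n)] for i in range(n)]
        # Inflation: raise entries to a power, then renormalize columns,
        # which strengthens strong links and weakens weak ones.
        inflated = normalize([[v ** inflation for v in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break

    # Each row that retains weight marks an attractor; the columns it
    # covers are the members of that attractor's cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(nodes[j] for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical proteins from three species, with one weak cross-group link.
toy_nodes = ["hsa|A", "mmu|A", "dme|A", "hsa|B", "mmu|B"]
toy_edges = [
    ("hsa|A", "mmu|A", 1.0), ("hsa|A", "dme|A", 1.0), ("mmu|A", "dme|A", 1.0),
    ("hsa|B", "mmu|B", 1.0),
    ("dme|A", "hsa|B", 0.1),  # weak similarity between the two groups
]
toy_groups = markov_cluster(toy_nodes, toy_edges)
print(toy_groups)
```

In this toy graph the weak link between the two groups is suppressed by repeated inflation, so the proteins labeled A and B end up in separate groups. A real analysis would derive the edge weights from BLAST results across all proteomes rather than from hand-picked values.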