Nucleic Acids Research 2006 34(Database Issue):D504-D506; doi:10.1093/nar/gkj126
Nucleic Acids Research, 2006, Vol. 34, Database issue D504-D506
© The Author 2006. Published by Oxford University Press. All rights reserved
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions{at}oxfordjournals.org


Article

Pathguide: a Pathway Resource List

Gary D. Bader, Michael P. Cary and Chris Sander*

Computational Biology Center, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, Box 460, New York, NY 10021, USA

*To whom correspondence should be addressed. Tel: +1 646 735 8079; Fax: +1 646 735 0021; Email: pathguide{at}cbio.mskcc.org

Received October 20, 2005. Accepted October 21, 2005.


    ABSTRACT
Pathguide: the Pathway Resource List (http://pathguide.org) is a meta-database that provides an overview of more than 190 web-accessible biological pathway and network databases. These include databases on metabolic pathways, signaling pathways, transcription factor targets, gene regulatory networks, genetic interactions, protein–compound interactions, and protein–protein interactions. The listed databases are maintained by diverse groups in different locations and the information in them is derived either from the scientific literature or from systematic experiments. Pathguide is useful as a starting point for biological pathway analysis and for content aggregation in integrated biological information systems.


    MOTIVATION
Databases and methods for computational analysis and data mining are an increasingly integral part of biological research because they provide easy access to a wealth of biological information, which biologists require to support analyses and effectively answer research questions. While individual databases cover important information in certain areas of biological knowledge, integrated and reasonably comprehensive use of biological pathway and network datasets is severely hampered by the large number and fragmentation of available databases (1).


    CREATING THE PATHWAY RESOURCE LIST
As a step toward effective navigation and selective pathway data integration, we have collected information on over 190 published online cellular pathway and network databases in Pathguide. By providing this community resource we aim to promote the development of an integrated view of the cell (the ‘cell map’) (1). While much cell map data are only found in the literature, the databases in the list already represent a significant amount of relatively well-organized pathway data at varying levels of accessibility that can eventually be integrated and comprehensively accessed.


    TYPES OF DATA AND DATABASE CATEGORIES
Databases in the list are grouped into eight major categories based on the type of data made available, the data format and the biological focus (Table 1). The categories are approximate and a database can be in multiple categories if it contains multiple data types. Protein–protein interaction databases mainly store pairwise interactions or complexes between proteins and sometimes other molecular interaction types. Metabolic pathway databases generally store a series of biochemical reactions in pathways involved in metabolite conversions. Signaling pathway databases generally collect sets of molecular interactions and chemical modifications (such as post-translational protein modifications) as regulatory pathways. Gene regulation network databases capture transcription factors and the genes they regulate. Genetic pathway databases are composed of genetic interactions, such as epistasis and synthetic lethality, which occur when two mutations have a combined phenotypic effect that is not simply the sum of the effects caused by either mutation alone. Pathway diagram databases generally store hyperlinked pathway images; while it is difficult to extract computable information from these images, they are very useful for biologists as educational and quick reference tools. Essentially all databases listed focus on interactions or pathways/networks. However, we also include a number of protein-sequence databases that store pathway information as secondary information, e.g. the REBASE database of restriction enzymes contains information about catalytic events involving DNA.
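Because a database can sit in more than one category, the grouping is a many-to-many relation rather than a simple partition. The following is a minimal sketch of such an index, not the Pathguide implementation: the function name `group_by_category` and the entries `ExampleDB-A` to `ExampleDB-C` are placeholders invented for illustration.

```python
from collections import defaultdict


def group_by_category(memberships):
    """Map each category to the sorted names of the databases assigned to it.

    `memberships` maps a database name to the set of categories it belongs
    to; a database holding several data types appears under each of them.
    """
    index = defaultdict(list)
    for name, categories in memberships.items():
        for category in categories:
            index[category].append(name)
    return {category: sorted(names) for category, names in index.items()}


# Placeholder entries (hypothetical, not real Pathguide records).
example = {
    "ExampleDB-A": {"protein-protein interactions"},
    "ExampleDB-B": {"metabolic pathways", "signaling pathways"},
    "ExampleDB-C": {"protein-protein interactions", "signaling pathways"},
}
by_category = group_by_category(example)
```

With this layout, per-category counts are simply the lengths of the lists, and a database in two categories contributes to both counts.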


Table 1 Pathguide statistics

 

    COVERAGE
As one might expect, the categories of information captured in databases, so far, are biased by community biological interests and do not evenly cover the space of available pathway and interaction data (Figure 1). For instance, there are many protein–protein and protein–compound databases, plausibly because of technical developments in proteomics and interest in drug discovery, but there appear to be only two protein–RNA databases and none for RNA–compound interactions, although these categories are clearly of biological interest. Interactions and pathways define biological function at the molecular level. Therefore, pathway databases must grow to support the evolution of biological knowledge. There is still significant room for pathway database growth in underrepresented categories and areas of new biological discoveries, such as microRNA targets.



Figure 1 The 40 largest databases in Pathguide. Pathway databases are diverse, both in volume of data (Database size, vertical) and in the attention they appear to generate (Popularity Estimate, horizontal). Database size is the sum of interaction and pathway records (where available; some database statistics may be incomplete). Popularity Estimate is taken to be the number of web pages in Google that mention the database homepage URL. Five main database categories (see symbol legend) out of eight are represented in the top 40. Databases in multiple categories are only represented in one (arbitrarily chosen) category. See the Pathguide website for expanded database names.
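The size measure in the caption can be sketched as follows. This is an illustration only: the caption states that size is the sum of interaction and pathway records where available, but how missing counts are handled is an assumption here (a missing count is treated as zero, and a database with no counts at all is left out of the ranking). All names and numbers are placeholders.

```python
def database_size(interactions, pathways):
    """Sum interaction and pathway record counts.

    A missing count is passed as None and treated as 0 (assumption).
    Returns None when neither count is available.
    """
    if interactions is None and pathways is None:
        return None
    return (interactions or 0) + (pathways or 0)


def largest(stats, n):
    """Return the n largest (name, size) pairs, skipping databases without statistics."""
    sized = [(name, database_size(i, p)) for name, (i, p) in stats.items()]
    sized = [(name, size) for name, size in sized if size is not None]
    return sorted(sized, key=lambda item: item[1], reverse=True)[:n]


# Placeholder statistics: (interaction records, pathway records).
stats = {
    "ExampleDB-A": (1000, 50),
    "ExampleDB-B": (None, 300),
    "ExampleDB-C": (None, None),
    "ExampleDB-D": (20000, None),
}
top_two = largest(stats, 2)
```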

 

    META-DATA AND LINKS
Database names in Pathguide are linked to the database homepage and clicking on ‘more’ next to each database leads to a structured description of the database listing short name, full name, homepage Uniform Resource Locator (URL), last observed date, text description, sample data URL, availability (e.g. free to all users, license purchase required), PubMed links, Pathguide category, types of tools available, database statistics, organisms covered and a popularity measure. The popularity measure used is the number of web pages, as indexed by the Google Internet search engine, that mention a given pathway database homepage URL. A user can rank all databases by this measure by clicking ‘Order list by web popularity’ at the top of the Pathguide homepage. The measure is rough because not all websites mentioning a pathway database are relevant. Thus the exact ranking is likely not sound, but it is useful as an overview of the most popular set of databases in each category.
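The structured description and the popularity ordering could be modeled roughly as below. This is a sketch under stated assumptions, not the Pathguide schema: the class name `ResourceEntry`, its field names and the example entries are hypothetical, only a subset of the listed fields is shown, and placing entries without a popularity value last is a design choice made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ResourceEntry:
    """One resource description; field names are illustrative only."""
    short_name: str
    full_name: str
    homepage_url: str
    availability: str                      # e.g. "Free to all users"
    categories: List[str] = field(default_factory=list)
    organisms: List[str] = field(default_factory=list)
    popularity: Optional[int] = None       # pages mentioning the homepage URL


def order_by_popularity(entries):
    """Sort by popularity, highest first; entries with no value go last."""
    return sorted(entries, key=lambda e: (e.popularity is None, -(e.popularity or 0)))


# Placeholder entries (hypothetical).
entries = [
    ResourceEntry("A", "Example Database A", "http://a.example.org/", "Free to all users", popularity=10),
    ResourceEntry("B", "Example Database B", "http://b.example.org/", "License purchase required"),
    ResourceEntry("C", "Example Database C", "http://c.example.org/", "Free to all users", popularity=250),
]
ranked = [e.short_name for e in order_by_popularity(entries)]
```

Because the measure is rough, a ranking like this is better read as a coarse grouping of more and less visible resources than as an exact order.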


    STANDARDS
Many computational pathway analysis methods gain power as the biological network they operate on grows. A single new link can lead to a significant new biological discovery. Collecting as many high-quality links as possible for network analysis requires well-organized and convenient access to pathway databases. Databases that are freely accessible (open access) and support standard exchange formats are easier to distribute and use. To encourage this, databases are highlighted if they are ‘Free to all users’ and can be downloaded in a standard format, such as the Proteomics Standards Initiative Molecular Interaction (PSI-MI) (2) and BioPAX (www.biopax.org) pathway data exchange standards and the Systems Biology Markup Language (SBML) (3) and CellML (4) pathway simulation model exchange standards. Biologists looking for data that can be analyzed in tools supporting these standards and computational biologists wishing to integrate available pathway data for global analyses may find this helpful.
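The highlighting rule described above combines two conditions: free availability and download in a standard format. A minimal sketch, assuming availability is stored as a text label and download formats as a list of names (both representations, and the names `STANDARD_FORMATS` and `is_highlighted`, are hypothetical):

```python
# The four standards named in the text; the set itself is an illustrative constant.
STANDARD_FORMATS = {"PSI-MI", "BioPAX", "SBML", "CellML"}


def is_highlighted(availability, download_formats):
    """Return True if a database is free to all users and offers a standard format."""
    offers_standard = bool(STANDARD_FORMATS & set(download_formats))
    return availability == "Free to all users" and offers_standard


free_with_standard = is_highlighted("Free to all users", ["BioPAX", "tab-delimited"])
free_without_standard = is_highlighted("Free to all users", ["tab-delimited"])
licensed_with_standard = is_highlighted("License purchase required", ["SBML"])
```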


    IMPLEMENTATION
Pathguide is curated by the authors and regularly updated. Generation of web pages (HTML) is implemented using the scripting language PHP with a relational database (MySQL) backend, which stores all information in a structured manner. The Google ranking for a particular database is updated using a Perl script to query the Google SOAP Application Programming Interface for pages anywhere on the Internet that link to the database homepage address (URL), not counting links from the database site itself. We have designed Pathguide to facilitate research on biological pathways and to be complementary to existing database link resources, such as Michael Galperin's Molecular Biology Database Collection (5) and the UBiC Bioinformatics Links Directory (http://bioinformatics.ubc.ca/resources/links_directory).
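The step of not counting links from the database site itself can be illustrated without any search API. The sketch below covers only that filtering step and is not the authors' Perl script: it assumes a list of linking page URLs has already been obtained, treats a page as belonging to the database site when its host matches the homepage host exactly (ignoring a leading "www."), and uses reserved example domains as placeholders.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lower-cased host of a URL, ignoring a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def count_external_pages(homepage_url, linking_pages):
    """Count distinct linking pages hosted outside the database's own site."""
    own_host = host_of(homepage_url)
    external = {page for page in linking_pages if host_of(page) != own_host}
    return len(external)


# Placeholder URLs on reserved example domains.
pages = [
    "http://example.org/db/about.html",      # same site, ignored
    "http://lab.example.com/links.html",
    "http://example.net/resources",
    "http://lab.example.com/links.html",     # duplicate, counted once
]
popularity = count_external_pages("http://www.example.org/db/", pages)
```

Exact host matching is a simplification; a database spread over several subdomains would need a broader notion of "the same site".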


    FUTURE DEVELOPMENT
Future plans include search features (e.g. show all databases with human protein–protein interaction information), automatically updated database content statistics (where available), graphical Pathguide content summaries, improved links to PubMed, a distinction between primary (original content) and secondary (derived or predicted content) databases, automatic URL validation, homepage uptime statistics, an RSS feed to track updates and a section on pathway tools, a commonly requested feature. Comments, questions and information about missing pathway resources are most welcome.


    ACKNOWLEDGEMENTS
 
Thanks to Robert Hoffmann for input on the Google ranking, Emek Demir for manuscript comments, users who have submitted new pathway resources and the editors of Nucleic Acids Research for supporting open-access publications. Funding to pay the Open Access publication charges for this article was provided by Memorial Sloan-Kettering Cancer Center.

Conflict of interest statement. None declared.


    REFERENCES

  1. Cary, M.P., Bader, G.D., Sander, C. (2005) Pathway information for systems biology. FEBS Lett., 579, 1815–1820.

  2. Hermjakob, H., Montecchi-Palazzi, L., Bader, G., Wojcik, J., Salwinski, L., Ceol, A., Moore, S., Orchard, S., Sarkans, U., von Mering, C., et al. (2004) The HUPO PSI's molecular interaction format—a community standard for the representation of protein interaction data. Nat. Biotechnol., 22, 177–183.

  3. Hucka, M., Finney, A., Sauro, H.M., Bolouri, H., Doyle, J.C., Kitano, H., Arkin, A.P., Bornstein, B.J., Bray, D., Cornish-Bowden, A., et al. (2003) The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics, 19, 524–531.

  4. Lloyd, C.M., Halstead, M.D., Nielsen, P.F. (2004) CellML: its future, present and past. Prog. Biophys. Mol. Biol., 85, 433–450.

  5. Galperin, M.Y. (2005) The Molecular Biology Database Collection: 2005 update. Nucleic Acids Res., 33, D5–D24.


