Nucleic Acids Research, 2007, Vol. 35, Database issue D224-D228
© 2006 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
New developments in the InterPro database
1 EMBL Outstation, European Bioinformatics Institute, Hinxton, Cambridge, UK
2 Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
3 Faculty of Life Sciences and School of Computer Science, University of Manchester, Manchester, UK
4 Swiss Institute for Bioinformatics, Geneva, Switzerland
5 Department of Structural Biology and Bioinformatics, University of Geneva, Switzerland
6 Biocomputing Unit, EMBL, Heidelberg, Germany
7 Wellcome Trust Centre for Human Genetics, Oxford, UK
8 CNRS/INRA, Toulouse, France
9 Biochemistry and Molecular Biology Department, University College London, University of London, UK
10 Genomic Sciences Centre, RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Japan
11 The Institute for Genomic Research, Rockville, MD, USA
12 Laboratoire de Biométrie et Biologie Evolutive and INRIA HELIX Project, University Lyon 1, France
13 Evolutionary Systems Biology Group, SRI International, Menlo Park, CA, USA
14 MRC Laboratory of Molecular Biology, Cambridge, UK
15 Protein Information Resource, Georgetown University Medical Center, Washington, DC, USA
*To whom correspondence should be addressed. Tel: +44 1223 494 602; Fax: +44 1223 494 468; Email: mulder{at}ebi.ac.uk
Received September 5, 2006. Revised October 6, 2006. Accepted October 6, 2006.
| ABSTRACT |
|---|
InterPro is an integrated resource for protein families, domains and functional sites, which integrates the following protein signature databases: PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D and PANTHER. The latter two new member databases have been integrated since the last publication in this journal. There have been several new developments in InterPro, including an additional reading field, new database links, extensions to the web interface and additional match XML files. InterPro has always provided matches to UniProtKB proteins on the website and in the match XML file on the FTP site. Additional matches to proteins in UniParc (UniProt archive) are now available for download in the new match XML files only. The latest InterPro release (13.0) contains more than 13 000 entries, covering over 78% of all proteins in UniProtKB. The database is available for text- and sequence-based searches via a webserver (http://www.ebi.ac.uk/interpro), and for download by anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro). The InterProScan search tool is now also available via a web service at http://www.ebi.ac.uk/Tools/webservices/WSInterProScan.html.
| INTRODUCTION |
|---|
InterPro (1) incorporates the major protein signature databases into a single resource. These include: PROSITE (2), which uses regular expressions and profiles, PRINTS (3), which uses Position Specific Scoring Matrix-based (PSSM-based) fingerprints, ProDom (4), which uses automatic sequence clustering, and Pfam (5), SMART (6), TIGRFAMs (7), PIRSF (8), SUPERFAMILY (9), Gene3D (10) and PANTHER (11), all of which use hidden Markov models (HMMs). Table 1 shows the coverage of each of these member databases. Protein signatures from these databases that describe the same family or domain, in terms of sequence positions and protein coverage, are integrated into single InterPro entries, to which annotation and cross-references are added. Annotation includes an abstract, name, short name and GO terms (12) (where applicable). Cross-references are provided to specialized databases and protein structural information. All matches of the protein signatures contributed by member databases against the UniProt Knowledgebase (UniProtKB) (13) are calculated using the InterProScan software (14), which integrates the search algorithms from the member databases into a single package. The matches are available for viewing in various formats for each InterPro entry. The InterPro data are available for searching and retrieval via a web interface at http://www.ebi.ac.uk/interpro, and for download by anonymous FTP at ftp://ftp.ebi.ac.uk/pub/databases/interpro.
InterPro is constantly being updated to keep up with the changing face of bioinformatics. Two new member databases, PANTHER and Gene3D, have joined the InterPro consortium and their HMMs are being integrated. In addition, new database cross-references to CluSTr (15) and Pfam clans (5) have been included, and entries link to the IntAct molecular interaction database (16) where manually curated examples of domain–domain interactions are available. Proteins with 3D structures modelled by MODBASE (17) and SWISS-MODEL (18) have links to the structure predictions from the match graphical views. These links complement the experimentally determined structures in the Protein Data Bank (PDB). The web interface has been extended for more advanced searching capabilities, and a web service is now available, providing programmatic access to InterProScan. In addition to UniProtKB, InterPro now provides matches to all proteins in the UniProt archive, UniParc, and these are currently available in XML format on the FTP site. The match XML files are also indexed in SRS to allow users to query the data within the SRS interface. The new features of InterPro are described in more detail below.
| NEW FEATURES OF INTERPRO |
|---|
Annotation
Two new member databases have been integrated into InterPro, PANTHER and Gene3D. PANTHER (http://www.pantherdb.org) (11) HMMs define protein families and subfamilies modelled on the divergence of specific functions within the families, which permits more accurate association with function based on ontology terms and pathways, as well as inference of amino acids important for functional specificity. PANTHER currently has high coverage of all families that contain at least one metazoan protein, including homologous proteins from all taxa. Consequently, coverage is very high for proteins found in animals and less so for other groups, such as plants, fungi and bacteria. The addition of PANTHER HMMs to InterPro is facilitating more fine-grained annotation of functionally and evolutionarily related subfamilies. Gene3D (http://cathwww.biochem.ucl.ac.uk:8080/Gene3D/) (10) is a library of HMMs that represent all proteins of known structure. The seed alignments for the models are derived from the proteins found within the homologous superfamily (H-level) classification level in CATH, which groups together domains that are thought to share a common ancestor. Gene3D models are being integrated to complement the SUPERFAMILY models that are based on SCOP superfamilies.
To further extend the publications section of InterPro entries, we have introduced the additional reading field. This field lists any publications provided by the member databases for the methods associated with each InterPro entry, which are not directly referenced in the InterPro abstract. Additionally, a maximum of five references per entry are taken from the PDB when one or more of the proteins in the entry has had its structure determined. These references provide the user with additional publications to visit to find out more about the proteins in the entry, and also provide InterPro curators with a list of references to consult when updating abstracts.
The database links field has been extended to include new links to CluSTr and Pfam clans. Table 2 lists the databases cross-referenced in InterPro and the number of entries containing these links. CluSTr (http://www.ebi.ac.uk/clustr) (15) is a database containing protein clusters from more than 368 organisms with completely sequenced genomes. The clustering is based on pairwise comparisons between the protein sequences. InterPro entries are linked to protein clusters only where at least 70% of the CluSTr members occur in the InterPro entry. Links to Pfam clan pages are now available in the database links field where applicable. A clan contains two or more Pfam families that have arisen from a single evolutionary origin, based on evidence from structure, function, profile–profile comparisons and whether the sequences are matched by more than one HMM. Clans were introduced to resolve the issue of Pfam HMMs overlapping on a sequence, as this is forbidden in the Pfam database. Clan information is used in post-processing of matches to remove these overlaps. The link from InterPro entries to clans provides a popup display of the Pfam clan name and all Pfam clan members with their corresponding InterPro accession numbers. These InterPro entries will not necessarily be related to each other through parent/child or contains/found in relationships.
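The 70% rule for CluSTr links can be illustrated with a minimal sketch. It assumes that cluster membership and entry membership are both available as sets of UniProtKB accessions; the function name, the accessions and the handling of an empty cluster are hypothetical and are not taken from the InterPro production pipeline.

```python
def cluster_links_to_entry(cluster_members, entry_proteins, threshold=0.70):
    """Return True if at least `threshold` of the cluster's members occur in the entry.

    Both arguments are iterables of protein accessions (hypothetical inputs).
    """
    members = set(cluster_members)
    if not members:
        # Assumption: an empty cluster is never linked.
        return False
    shared = members & set(entry_proteins)
    return len(shared) / len(members) >= threshold


# Hypothetical accessions, for illustration only.
cluster = {"P00001", "P00002", "P00003", "P00004"}
entry = {"P00001", "P00002", "P00003", "P99999"}

# 3 of the 4 cluster members (75%) occur in the entry, so a link would be made.
linked = cluster_links_to_entry(cluster, entry)
print(linked)
```

Note that the fraction is computed relative to the cluster, not the entry: a small cluster fully contained in a large entry passes the test, whereas a large cluster that only partly overlaps the entry does not.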
Links to IntAct (http://www.ebi.ac.uk/intact/site/) (16), the molecular interaction database, have been incorporated into InterPro, providing manually curated examples of domain–domain interactions. IntAct incorporates protein–protein interaction data derived from the literature and direct submissions, and provides a query interface and modules to analyze the data. Links from InterPro to IntAct are provided at the level of individual UniProtKB accessions, and are restricted to 20 randomly chosen examples. There are currently 135 InterPro entries with links to 1180 IntAct entries, involving ~400 proteins. This number is likely to remain low, compared to the total number of interactions in IntAct, as these links are based on well curated domain interactions, rather than every protein–protein interaction. New positional links are available for UniProtKB proteins to MODBASE (http://modbase.compbio.ucsf.edu/modbase-cgi-new/index.cgi) (17) and SWISS-MODEL (http://swissmodel.expasy.org/) (18). MODBASE is a database of 3D protein models calculated by comparative modelling using ModPipe, an automated modelling pipeline relying on programs such as PSI-BLAST and MODELLER. MODBASE matches to protein sequences are shown in the detailed graphical view as yellow and white striped bars. SWISS-MODEL is a repository of annotated 3D protein structure models from the UniProtKB sequence database, and provides a protein structure homology modelling server. Matches to protein sequences are shown in the detailed graphical view as red and white striped bars. These cross-references, as well as the other links to more than 30 different databases, increase the value of InterPro with respect to its interoperability and integration with other data sources.
Protein matches
Protein matches in InterPro are pre-calculated using the InterProScan software (14). InterProScan is a tool that combines different protein signature recognition methods of the InterPro member databases into one resource, and provides the corresponding InterPro accession numbers and GO annotation in the results. InterProScan can be used via a web interface or email server, which allows searching of a sequence against InterPro, or it can be installed and run locally for bulk searches. A new development has been the establishment of a web service for running single or multiple sequences through InterProScan. More information about the web service and example clients in Perl and Java for accessing the service is available from http://www.ebi.ac.uk/Tools/webservices/WSInterProScan.html. This service provides programmatic access to the tool for users who want to run bulk searches or use InterProScan as part of a pipeline.
Over the past two years, additional protein matches have become available in InterPro. Previously, InterPro matches were available only for UniProtKB proteins, but now InterPro provides additional matches to alternative splice products and UniParc proteins. Matches to splice variant sequences associated with UniProtKB accession numbers can be accessed through the protein with splice variants link from the Matches field, and are available through the compact and detailed displays. The matches for the master sequence are shown at the top with the splice variant matches below them, so it is easy to identify where matches differ between isoforms. The splice variant sequences originate from UniProtKB, and of the 25 927 splice variants available, 24 268 have hits to a total of 3483 InterPro entries.
The UniProt archive (UniParc) is a repository of all protein sequences, with each unique sequence stored once. These sequences are then cross-referenced to the relevant databases, e.g. UniProtKB, and include data submitted from metagenomics projects. This repository contains ~7.5 million protein sequences, including UniProtKB proteins, and therefore the calculation of InterPro matches is slow. These calculations are ongoing, and the data provided incorporate the most up-to-date matches available at that point in time. Currently, there are just over 50 million InterPro matches to UniParc proteins. UniParc matches are not yet visible in InterPro entries, but are available in XML format from the FTP site and are searchable in SRS. An additional match XML file, match_complete.xml, is provided with each release, and contains UniProtKB sequence matches for all member database signatures, including those that have not yet been integrated into InterPro. This is to ensure that the public has access to all protein signature matches that have been calculated. All protein matches are updated on each major InterPro release (approximately every 3 months).
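Because the match XML files are large, they are best read as a stream rather than loaded whole. The sketch below shows one way to do this with the Python standard library. The element and attribute names (`protein`, `match`, `ipr`, `lcn`, `start`, `end`) and the embedded sample are assumptions made for illustration; the layout of the released files should be checked against the files themselves before this is used.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample in an ASSUMED layout; accessions and names are invented.
SAMPLE = b"""<interpromatch>
  <protein id="P00001" name="EXAMPLE_1" length="120">
    <match id="PF00001" name="example_family" dbname="PFAM">
      <ipr id="IPR000001" name="Example entry"/>
      <lcn start="5" end="80"/>
    </match>
    <match id="G3D_EXAMPLE" name="unintegrated_example" dbname="GENE3D">
      <lcn start="10" end="100"/>
    </match>
  </protein>
</interpromatch>"""


def iter_matches(source):
    """Yield (protein_id, signature_id, interpro_id_or_None, start, end) tuples.

    `source` is a filename or a binary file object. Signatures without an
    <ipr> child are treated as not yet integrated (interpro_id is None).
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "protein":
            continue
        for match in elem.findall("match"):
            ipr = match.find("ipr")
            ipr_id = ipr.get("id") if ipr is not None else None
            for lcn in match.findall("lcn"):
                yield (elem.get("id"), match.get("id"), ipr_id,
                       int(lcn.get("start")), int(lcn.get("end")))
        # Drop the children of the finished protein to keep memory use low.
        elem.clear()


rows = list(iter_matches(io.BytesIO(SAMPLE)))
for row in rows:
    print(row)
```

In this sketch an unintegrated signature is recognised simply by the absence of an InterPro accession, which mirrors the distinction the text draws between integrated and unintegrated signatures in match_complete.xml.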
Web interface
The web interface has been extended to provide additional searching options. From the text search page (http://www.ebi.ac.uk/interpro/search.html) the user can search within InterPro entries or protein matches. One can retrieve matches for a UniProtKB accession number by pasting the accession number in the search box and selecting Find protein matches. This returns the matches in a combination of formats. The protein match views can also be selected in the Matches section of an InterPro entry, which provides options for displaying the matches in different tabular or graphical views. From any of these views, the user can then select a set of proteins by UniProtKB accession number(s) or InterPro accession, and can refine the set to show splice variants or proteins with known structure or both. Alternatively, the user can filter the protein set by taxonomy using the tax ID. Once the protein set has been defined, the user can select the output display format from compact, detailed, architectures or table, and can specify the order of proteins in the display by UniProtKB accession or identifier.
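The refinement steps offered by the interface (restrict by taxonomy, by known structure, by splice variants, then choose an ordering) amount to a filter followed by a sort. The following sketch models that logic on an in-memory list; the `ProteinMatch` record, its field names and the sample proteins are hypothetical, and the taxonomy filter here matches a single tax ID exactly rather than a whole lineage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinMatch:
    """Hypothetical record for one protein in a match list."""
    accession: str
    identifier: str
    tax_id: int
    has_structure: bool = False
    has_splice_variants: bool = False


def refine(proteins, tax_id=None, structure_only=False,
           splice_only=False, order_by="accession"):
    """Filter a protein set and return it sorted by accession or identifier."""
    if order_by not in ("accession", "identifier"):
        raise ValueError("order_by must be 'accession' or 'identifier'")
    selected = [
        p for p in proteins
        if (tax_id is None or p.tax_id == tax_id)
        and (not structure_only or p.has_structure)
        and (not splice_only or p.has_splice_variants)
    ]
    return sorted(selected, key=lambda p: getattr(p, order_by))


# Invented proteins; 9606 and 10090 are the NCBI tax IDs for human and mouse.
proteins = [
    ProteinMatch("Q00002", "EXAMPLE2_HUMAN", 9606, has_structure=True),
    ProteinMatch("P00001", "EXAMPLE1_MOUSE", 10090),
    ProteinMatch("P00003", "EXAMPLE3_HUMAN", 9606, has_splice_variants=True),
]

human = refine(proteins, tax_id=9606)
with_structure = refine(proteins, structure_only=True)
by_identifier = refine(proteins, order_by="identifier")
print([p.accession for p in human])
```

Combining `structure_only` and `splice_only` corresponds to the "or both" option described above, since each flag narrows the set independently.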
In addition to links to complete match lists, each InterPro entry page contains a taxonomy wheel showing the taxonomic range of proteins matching the entry. The numbers on the wheel for each taxonomic group are now clickable. Clicking on a particular lineage returns only the protein matches for the selected taxonomy. In this view, the species are sorted and displayed alphabetically and the lineage is shown at the top. The numbers on the phylogeny show the number of proteins associated with each taxonomic group that match the entry.
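The numbers on the taxonomy wheel can be thought of as per-taxon counts over the lineages of the matching proteins. A minimal sketch of that counting follows, assuming each protein's lineage is available as a list of taxon names from root to species; the data structure and the sample proteins are hypothetical.

```python
from collections import Counter


def lineage_counts(protein_lineages):
    """Count, for every taxon, how many proteins have it in their lineage.

    `protein_lineages` maps a protein accession to its lineage, given as a
    list of taxon names ordered from the root down to the species.
    """
    counts = Counter()
    for lineage in protein_lineages.values():
        # set() guards against a taxon being counted twice for one protein.
        counts.update(set(lineage))
    return counts


# Invented proteins with abbreviated lineages, for illustration only.
protein_lineages = {
    "P00001": ["Eukaryota", "Metazoa", "Homo sapiens"],
    "P00002": ["Eukaryota", "Fungi", "Saccharomyces cerevisiae"],
    "P00003": ["Bacteria", "Escherichia coli"],
}

counts = lineage_counts(protein_lineages)
print(counts["Eukaryota"], counts["Metazoa"], counts["Bacteria"])
```

Selecting a lineage on the wheel then corresponds to keeping only the proteins whose lineage contains the chosen taxon.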
| DISCUSSION |
|---|
InterPro now integrates protein signatures from 10 different member databases, and links >20 additional resources, including UniProtKB, structural data and specialized protein family databases. It has proven its usefulness in the functional characterization of proteins, and is used by genome annotation projects (19–22) and individual researchers worldwide. In the last year, the InterPro website received ~3 million hits per month from up to 35 000 unique hosts. Through the mapping of InterPro entries to GO terms, InterPro contributes the majority of annotations of proteins to GO terms. Approximately 68% of all UniProtKB proteins are annotated with GO terms from a combination of manual annotation and the use of mappings, such as InterPro2GO, Swiss-Prot keyword2GO, etc. InterPro2GO alone provides GO annotations for 61% of UniProtKB proteins, thus accounting for a significant proportion of the total number of annotations currently available. These GO mappings are also available via InterProScan, which facilitates GO annotation of query proteins. The current release of InterPro contains more than 13 000 entries, with its signatures covering over 78% of UniProtKB proteins. The integration of new protein signatures from the existing and new member databases will continue to increase the coverage, as well as the depth, of InterPro. The InterPro database will continue to develop and increase its functionality. Future plans include the provision of protein match views for UniParc matches, facilitating the searching and browsing of InterPro entries by function, and the provision of data for unintegrated protein signatures via the InterPro web interface. Integration of signatures into InterPro entries and subsequent annotation of the entries is done manually and is thus of high quality, but is time-consuming. In order to make the signatures awaiting integration available to the public via the web interface, new entries will be created automatically for the unintegrated signatures and will be searchable by their member database accession numbers. The protein matches will be available in the same format as match views from InterPro entries so that the user can see how the new signature relates to existing entries. These new features will increase the usefulness of this already popular high-quality resource.
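The way an entry-to-GO mapping such as InterPro2GO yields protein annotations, and how a coverage figure follows from it, can be sketched as below. This is an illustration under stated assumptions rather than the procedure used to produce the percentages quoted above: the two dictionaries, the placeholder entry and GO identifiers, and the function names are all hypothetical.

```python
def transfer_go_terms(protein_to_entries, entry_to_go):
    """Give each protein the union of GO terms mapped to the entries it matches.

    Proteins that receive no GO term are left out of the result.
    """
    annotations = {}
    for protein, entries in protein_to_entries.items():
        terms = set()
        for entry in entries:
            terms.update(entry_to_go.get(entry, ()))
        if terms:
            annotations[protein] = terms
    return annotations


def coverage(annotations, all_proteins):
    """Fraction of proteins that received at least one GO term."""
    return len(annotations) / len(all_proteins)


# Placeholder identifiers, not real InterPro entries or GO terms.
entry_to_go = {
    "IPR_X1": {"GO:X0001"},
    "IPR_X2": {"GO:X0002", "GO:X0003"},
}
protein_to_entries = {
    "P00001": ["IPR_X1", "IPR_X2"],  # matches two mapped entries
    "P00002": ["IPR_X3"],            # matches an entry with no GO mapping
    "P00003": [],                    # no InterPro matches at all
}

annotations = transfer_go_terms(protein_to_entries, entry_to_go)
fraction = coverage(annotations, protein_to_entries)
print(sorted(annotations["P00001"]), round(fraction, 2))
```

The sketch also makes clear why GO coverage is necessarily lower than signature coverage: a protein can match an entry that carries no GO mapping.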
| ACKNOWLEDGEMENTS |
|---|
The authors would like to thank Dr Steffen Schulze-Kremer and the HLRN staff for their continued and valuable assistance. InterPro is funded in part by the MRC e-family grant number G0100305. A large proportion of HMMER-based calculations are performed on the IBM p690 Supercomputer at HLRN. Funding to pay the Open Access publication charges for this article was provided by the European Bioinformatics Institute.
Conflict of interest statement. None declared.
| Footnotes |
|---|
Present address: Julian Gough, Unite de Bioinformatique Structurale, Institut Pasteur, Paris, France
| REFERENCES |
|---|
1. Mulder, N.J., Apweiler, R., Attwood, T.K., Bairoch, A., Bateman, A., Binns, D., Bradley, P., Bork, P., Bucher, P., Cerutti, L., et al. (2005) InterPro, progress and status in 2005. Nucleic Acids Res., 33, D201–D205.
2. Hulo, N., Bairoch, A., Bulliard, V., Cerutti, L., De Castro, E., Langendijk-Genevaux, P.S., Pagni, M., Sigrist, C.J.A. (2006) The PROSITE database. Nucleic Acids Res., 34, D227–D230.
3. Attwood, T.K., Bradley, P., Flower, D.R., Gaulton, A., Maudling, N., Mitchell, A.L., Moulton, G., Nordle, A., Paine, K., Taylor, P., et al. (2003) PRINTS and its automatic supplement, prePRINTS. Nucleic Acids Res., 31, 400–402.
4. Bru, C., Courcelle, E., Carrere, S., Beausse, Y., Dalmar, S., Kahn, D. (2005) The ProDom database of protein domain families: more emphasis on 3D. Nucleic Acids Res., 33, D212–D215.
5. Finn, R.D., Mistry, J., Schuster-Bockler, B., Griffiths-Jones, S., Hollich, V., Lassmann, T., Moxon, S., Marshall, M., Khanna, A., Durbin, R., et al. (2006) Pfam: clans, web tools and services. Nucleic Acids Res., 34, D247–D251.
6. Letunic, I., Copley, R.R., Pils, B., Pinkert, S., Schultz, J., Bork, P. (2006) SMART 5: domains in the context of genomes and networks. Nucleic Acids Res., 34, D257–D260.
7. Haft, D.H., Selengut, J.D., White, O. (2003) The TIGRFAMs database of protein families. Nucleic Acids Res., 31, 371–373.
8. Wu, C.H., Nikolskaya, A., Huang, H., Yeh, L.S., Natale, D.A., Vinayaka, C.R., Hu, Z.Z., Mazumder, R., Kumar, S., Kourtesis, P., et al. (2004) PIRSF: family classification system at the Protein Information Resource. Nucleic Acids Res., 32, D112–D114.
9. Gough, J., Karplus, K., Hughey, R., Chothia, C. (2001) Assignment of homology to genome sequences using a library of Hidden Markov Models that represent all proteins of known structure. J. Mol. Biol., 313, 903–919.
10. Yeats, C., Maibaum, M., Marsden, R., Dibley, M., Lee, D., Addou, S., Orengo, C.A. (2006) Gene3D: modelling protein structure, function and evolution. Nucleic Acids Res., 34, D281–D284.
11. Mi, H., Lazareva-Ulitsky, B., Loo, R., Kejariwal, A., Vandergriff, J., Rabkin, S., Guo, N., Muruganujan, A., Doremieux, O., Campbell, M.J., et al. (2005) The PANTHER database of protein families, subfamilies, functions and pathways. Nucleic Acids Res., 33, D284–D288.
12. Harris, M.A., Clark, J., Ireland, A., Lomax, J., Ashburner, M., Foulger, R., Eilbeck, K., Lewis, S., Marshall, B., Mungall, C., et al. (2004) The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res., 32, D258–D261.
13. Wu, C.H., Apweiler, R., Bairoch, A., Natale, D.A., Barker, W.C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., et al. (2006) The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res., 34, D187–D191.
14. Quevillon, E., Silventoinen, V., Pillai, S., Harte, N., Mulder, N., Apweiler, R., Lopez, R. (2005) InterProScan: protein domains identifier. Nucleic Acids Res., 33, W116–W120.
15. Petryszak, P., Kretschmann, E., Wieser, D., Apweiler, R. (2005) The predictive power of the CluSTr database. Bioinformatics, 21, 3604–3609.
16. Hermjakob, H., Montecchi-Palazzi, L., Lewington, C., Mudali, S., Kerrien, S., Orchard, S., Vingron, M., Roechert, B., Roepstorff, P., Valencia, A., et al. (2004) IntAct: an open source molecular interaction database. Nucleic Acids Res., 32, D452–D455.
17. Pieper, U., Eswar, N., Braberg, H., Madhusudhan, M.S., Davis, F., Stuart, A.C., Mirkovic, N., Rossi, A., Marti-Renom, M.A., Fiser, A., et al. (2004) MODBASE, a database of annotated comparative protein structure models, and associated resources. Nucleic Acids Res., 32, D217–D222.
18. Kopp, J. and Schwede, T. (2006) The SWISS-MODEL Repository: new features and functionalities. Nucleic Acids Res., 34, D315–D318.
19. The International Human Genome Consortium. (2001) Initial sequencing and analysis of the human genome. Nature, 409, 860–921.
20. Kawaji, H., Schonbach, C., Matsuo, Y., Kawai, J., Okazaki, Y., Hayashizaki, Y., Matsuda, H. (2002) Exploration of novel motifs derived from mouse cDNA sequences. Genome Res., 12, 367–378.
21. Yu, J., Hu, S., Wang, J., Wong, G.K., Li, S., Liu, B., Deng, Y., Dai, L., Zhou, Y., Zhang, X., et al. (2002) A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science, 296, 79–92.
22. Rubin, G.M., Yandell, M.D., Wortman, J.R., Gabor Miklos, G.L., Nelson, C.R., Hariharan, I.K., Fortini, M.E., Li, P.W., Apweiler, R., Fleischmann, W., et al. (2000) Comparative genomics of the eukaryotes. Science, 287, 2204–2215.
||||
![]() |
K. Forslund and E. L. L. Sonnhammer Predicting protein function from domain content Bioinformatics, August 1, 2008; 24(15): 1681 - 1687. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. M. Turner, E. B. Chuong, and H. E. Hoekstra Comparative Analysis of Testis Protein Evolution in Rodents Genetics, August 1, 2008; 179(4): 2075 - 2089. [Abstract] [Full Text] [PDF] |
||||
![]() |
H.-M. Bourbon Comparative genomics supports a deep evolutionary origin for the large, four-module transcriptional mediator complex Nucleic Acids Res., July 1, 2008; 36(12): 3993 - 4008. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Loewenstein, E. Portugaly, M. Fromer, and M. Linial Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space Bioinformatics, July 1, 2008; 24(13): i41 - i49. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Vaughan, S.-Y. Chiu, G. Ramasamy, L. Li, M. J. Gardner, A. S. Tarun, S. H.I. Kappe, and X. Peng Assessment and improvement of the Plasmodium yoelii yoelii genome annotation through comparative analysis Bioinformatics, July 1, 2008; 24(13): i383 - i389. [Abstract] [Full Text] [PDF] |
||||
![]() |
L.-C. Tranchevent, R. Barriot, S. Yu, S. Van Vooren, P. Van Loo, B. Coessens, B. De Moor, S. Aerts, and Y. Moreau ENDEAVOUR update: a web resource for gene prioritization in multiple species Nucleic Acids Res., July 1, 2008; 36(suppl_2): W377 - W384. [Abstract] [Full Text] [PDF] |
||||
![]() |
F. Al-Shahrour, J. Carbonell, P. Minguez, S. Goetz, A. Conesa, J. Tarraga, I. Medina, E. Alloza, D. Montaner, and J. Dopazo Babelomics: advanced functional profiling of transcriptomics, proteomics and genomics experiments Nucleic Acids Res., July 1, 2008; 36(suppl_2): W341 - W346. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. Banerjee, M. Gretes, L. Basuino, N. Strynadka, and H. F. Chambers In Vitro Selection and Characterization of Ceftobiprole-Resistant Methicillin-Resistant Staphylococcus aureus Antimicrob. Agents Chemother., June 1, 2008; 52(6): 2089 - 2096. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Gotz, J. M. Garcia-Gomez, J. Terol, T. D. Williams, S. H. Nagaraj, M. J. Nueda, M. Robles, M. Talon, J. Dopazo, and A. Conesa High-throughput functional annotation and data mining with the Blast2GO suite Nucleic Acids Res., June 1, 2008; 36(10): 3420 - 3435. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Patient, D. Wieser, M. Kleen, E. Kretschmann, M. Jesus Martin, and R. Apweiler UniProtJAPI: a remote API for accessing UniProt data Bioinformatics, May 15, 2008; 24(10): 1321 - 1322. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Sczyrba, S. Konermann, and R. Giegerich Two interactive Bioinformatics courses at the Bielefeld University Bioinformatics Server Brief Bioinform, May 1, 2008; 9(3): 243 - 249. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Iida, M. Shionyu, and Y. Suso Alternative Splicing at NAGNAG Acceptor Sites Shares Common Properties in Land Plants and Mammals Mol. Biol. Evol., April 1, 2008; 25(4): 709 - 718. [Abstract] [Full Text] [PDF] |
||||
![]() |
I. V. Tetko, I. V. Rodchenkov, M. C. Walter, T. Rattei, and H.-W. Mewes Beyond the 'best' match: machine learning annotation of protein sequences by integration of different sources of information Bioinformatics, March 1, 2008; 24(5): 621 - 628. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. Salgado, G. Gimenez, F. Coulier, and C. Marcelle COMPARE, a multi-organism system for cross-species data comparison and transfer of information Bioinformatics, February 1, 2008; 24(3): 447 - 449. [Abstract] [Full Text] [PDF] |
||||
![]() |
J.-L. Faulon, M. Misra, S. Martin, K. Sale, and R. Sapra Genome scale enzyme metabolite and drug target interaction predictions using the signature molecular descriptor Bioinformatics, January 15, 2008; 24(2): 225 - 233. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Kaneko, N. Nakajima, S. Okamoto, I. Suzuki, Y. Tanabe, M. Tamaoki, Y. Nakamura, F. Kasai, A. Watanabe, K. Kawashima, et al. Complete Genomic Structure of the Bloom-forming Toxic Cyanobacterium Microcystis aeruginosa NIES-843 DNA Res, January 11, 2008; (2008) dsm026v1. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. Raghavachari, A. Tasneem, T. M. Przytycka, and R. Jothi DOMINE: a database of protein domain interactions Nucleic Acids Res., January 11, 2008; 36(suppl_1): D656 - D661. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. Saunders, S. Lyon, M. Day, B. Riley, E. Chenette, and S. Subramaniam The Molecule Pages database Nucleic Acids Res., January 11, 2008; 36(suppl_1): D700 - D706. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. M. R. Davila, P. N. Mendes, G. Wagner, D. A. Tschoeke, R. R. C. Cuadrat, F. Liberman, L. Matos, T. Satake, K. A. C. S. Ocana, O. Triana, et al. ProtozoaDB: dynamic visualization and exploration of protozoan genomes Nucleic Acids Res., January 11, 2008; 36(suppl_1): D547 - D552. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Matsuya, R. Sakate, Y. Kawahara, K. O. Koyanagi, Y. Sato, Y. Fujii, C. Yamasaki, T. Habara, H. Nakaoka, F. Todokoro, et al. Evola: Ortholog database of all human genes in H-InvDB with manual curation of phylogenetic trees Nucleic Acids Res., January 11, 2008; 36(suppl_1): D787 - D792. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Liang, P. Jaiswal, C. Hebbard, S. Avraham, E. S. Buckler, T. Casstevens, B. Hurwitz, S. McCouch, J. Ni, A. Pujar, et al. Gramene: a growing plant comparative genomics resource Nucleic Acids Res., January 11, 2008; 36(suppl_1): D947 - D953. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Heger, E. Korpelainen, T. Hupponen, K. Mattila, V. Ollikainen, and L. Holm PairsDB atlas of protein sequence space Nucleic Acids Res., January 11, 2008; 36(suppl_1): D276 - D280. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Pagel, M. Oesterheld, O. Tovstukhina, N. Strack, V. Stumpflen, and D. Frishman DIMA 2.0 predicted and known domain interactions Nucleic Acids Res., January 11, 2008; 36(suppl_1): D651 - D655. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. Gajria, A. Bahl, J. Brestelli, J. Dommer, S. Fischer, X. Gao, M. Heiges, J. Iodice, J. C. Kissinger, A. J. Mackey, et al. ToxoDB: an integrated Toxoplasma gondii database resource Nucleic Acids Res., January 11, 2008; 36(suppl_1): D553 - D556. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Yeats, J. Lees, A. Reid, P. Kellam, N. Martin, X. Liu, and C. Orengo Gene3D: comprehensive structural and functional annotation of genomes Nucleic Acids Res., January 11, 2008; 36(suppl_1): D414 - D418. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Lechat, L. Hummel, S. Rousseau, and I. Moszer GenoList: an integrated environment for comparative analysis of microbial genomes Nucleic Acids Res., January 11, 2008; 36(suppl_1): D469 - D474. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Rattei, P. Tischler, R. Arnold, F. Hamberger, J. Krebs, J. Krumsiek, B. Wachinger, V. Stumpflen, and W. Mewes SIMAP structuring the network of protein similarities Nucleic Acids Res., January 11, 2008; 36(suppl_1): D289 - D292. [Abstract] [Full Text] [PDF] |
||||
![]() |
Genome Information Integration Project And H-Invit The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts Nucleic Acids Res., January 11, 2008; 36(suppl_1): D793 - D799. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Jung, M. Staton, T. Lee, A. Blenda, R. Svancara, A. Abbott, and D. Main GDR (Genome Database for Rosaceae): integrated web-database for Rosaceae genomics and genetics data Nucleic Acids Res., January 11, 2008; 36(suppl_1): D1034 - D1040. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Brilli, R. Fani, and P. Lio Current trends in the bioinformatic sequence analysis of metabolic pathways in prokaryotes Brief Bioinform, January 1, 2008; 9(1): 34 - 45. [Abstract] [Full Text] [PDF] |
||||
![]() |
F. Birzele, R. Kuffner, F. Meier, F. Oefinger, C. Potthast, and R. Zimmer ProSAS: a database for analyzing alternative splicing in the context of protein structures Nucleic Acids Res., January 1, 2008; 36(suppl_1): D63 - D68. [Abstract] [Full Text] [PDF] |
||||