Nucleic Acids Research, 2003, Vol. 31, No. 1 315-318
© 2003 Oxford University Press
The InterPro Database, 2003 brings increased coverage and new features
1 EMBL Outstation - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
2 Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
3 School of Biological Sciences and Department of Computer Science, The University of Manchester, Manchester, UK
4 Swiss Institute for Bioinformatics, Geneva, Switzerland
5 ViaLactia Biosciences, Newmarket Auckland, New Zealand
6 Biocomputing Unit EMBL, Heidelberg, Germany
7 Swiss Institute for Experimental Cancer Research, Lausanne, Switzerland
8 Wellcome Trust Centre for Human Genetics, Oxford, UK
9 CNRS/INRA, Toulouse, France
10 The Institute for Genomic Research, MD, USA
11 MRC Functional Genetics Unit, Department of Human Anatomy and Genetics, University of Oxford, UK
12 EMBL, Heidelberg, Germany
*To whom correspondence should be addressed. Tel: +44 1223 494602; Fax: +44 1223 494468; Email: mulder{at}ebi.ac.uk
Received September 16, 2002; Revised and Accepted October 2, 2002
ABSTRACT
InterPro, an integrated documentation resource of protein families, domains and functional sites, was created in 1999 as a means of amalgamating the major protein signature databases into one comprehensive resource. PROSITE, Pfam, PRINTS, ProDom, SMART and TIGRFAMs have been manually integrated and curated and are available in InterPro for text- and sequence-based searching. The results are provided in a single format that rationalises the results that would be obtained by searching the member databases individually. The latest release of InterPro contains 5629 entries describing 4280 families, 1239 domains, 95 repeats and 15 post-translational modifications. Currently, the combined signatures in InterPro cover more than 74% of all proteins in SWISS-PROT and TrEMBL, an increase of nearly 15% since the inception of InterPro. New features of the database include improved searching capabilities and enhanced graphical user interfaces for visualisation of the data. The database is available via a webserver (http://www.ebi.ac.uk/interpro) and anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro).
BACKGROUND
Protein signature databases, based on several different methods, have evolved with the need for efficient automatic methods of protein sequence classification and characterisation. In 1999, the major signature databases PROSITE (1), PRINTS (2), Pfam (3) and ProDom (4) formed a Consortium and agreed to integrate their data into a new database that became known as InterPro (5). Subsequently SMART (6) and TIGRFAMs (7) have joined the Consortium. The Consortium has agreed on the free availability and distribution of the data and protein sequence search methods, and free, efficient flow of information between the member databases and InterPro, as well as among themselves.
Signatures from the member databases are integrated manually at regular intervals by a team of biologists, whose role is also to annotate the new or existing entries. Each InterPro entry is described by one or more signatures, and corresponds to a biologically meaningful family, domain, repeat or post-translational modification (PTM). Two types of relationships can exist between InterPro entries: the parent/child and the contains/found in relationships. Parent/child relationships are used to describe a common ancestry between entries, whereas the contains/found in relationship generally refers to the presence of genetically mobile domains. All hits of the protein signatures in InterPro against a composite of the SWISS-PROT and TrEMBL databases (8) (SPTR) are precomputed. The matches are available for viewing in each InterPro entry in different formats including a match table, a detailed graphical view and a condensed graphical view.
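The two relationship types described above can be illustrated with a small data model. This is a minimal sketch only: the `Entry` class, its field names and the accessions used are hypothetical and do not reflect the actual InterPro schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    """Hypothetical in-memory model of an InterPro entry (not the real schema)."""
    accession: str
    entry_type: str  # "family", "domain", "repeat" or "ptm"
    signatures: List[str] = field(default_factory=list)
    # Relationship fields are excluded from repr/compare to avoid cycles.
    parent: Optional[Entry] = field(default=None, repr=False, compare=False)
    children: List[Entry] = field(default_factory=list, repr=False, compare=False)
    contains: List[Entry] = field(default_factory=list, repr=False, compare=False)
    found_in: List[Entry] = field(default_factory=list, repr=False, compare=False)


def add_child(parent: Entry, child: Entry) -> None:
    """Record a parent/child (common ancestry) relationship."""
    child.parent = parent
    parent.children.append(child)


def add_contains(outer: Entry, inner: Entry) -> None:
    """Record a contains/found in (domain composition) relationship."""
    outer.contains.append(inner)
    inner.found_in.append(outer)


def lineage(entry: Entry) -> List[str]:
    """Return accessions from the root ancestor down to the given entry."""
    chain = []
    node: Optional[Entry] = entry
    while node is not None:
        chain.append(node.accession)
        node = node.parent
    return list(reversed(chain))


# Placeholder accessions, chosen for illustration only.
family = Entry("ENTRY_FAMILY", "family", ["SIG_1", "SIG_2"])
subfamily = Entry("ENTRY_SUBFAMILY", "family", ["SIG_3"])
domain = Entry("ENTRY_DOMAIN", "domain", ["SIG_4"])

add_child(family, subfamily)
add_contains(family, domain)
```

Keeping the two relationship types in separate fields mirrors the distinction drawn in the text: `lineage` walks only the ancestry links, while `contains`/`found_in` record domain composition.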
There have been a number of improvements to the InterPro database since its inception, including increased coverage, additional features of the search tools, and a new-look web interface. These are described in more detail below.
MORE ENTRIES AND INCREASED COVERAGE
The first official release of InterPro in October 1999 contained 2990 entries and covered 60.2% of all SPTR protein sequences. The latest release of the database contains 5629 entries, an increase of 2639 entries, or nearly a doubling in just 3 years. A summary of the InterPro releases and of the coverage of the signatures in SPTR is shown in Table 1. On average, there has been an increase of 500–600 new entries per release, which does not necessarily correspond with the number of new signatures, since many may overlap with existing entries represented by other member databases.
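The growth figures quoted in this section and in the abstract can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative, and 74% is taken as a lower bound because the abstract says "more than 74%".

```python
# Figures quoted in the text.
entries_1999 = 2990
entries_latest = 5629
coverage_1999 = 60.2      # percent of SPTR sequences with at least one hit
coverage_latest = 74.0    # "more than 74%", so treated here as a lower bound
sptr_1999 = 279_794       # SPTR protein sequences at the first release
sptr_latest = 734_448     # SPTR protein sequences at the latest release

# Growth in the number of entries.
new_entries = entries_latest - entries_1999
growth_factor = entries_latest / entries_1999

# Change in coverage, expressed in percentage points.
coverage_gain_points = coverage_latest - coverage_1999

# Approximate number of SPTR sequences with at least one InterPro hit.
covered_1999 = round(sptr_1999 * coverage_1999 / 100)
covered_latest = round(sptr_latest * coverage_latest / 100)

print(f"new entries: {new_entries}")
print(f"growth factor: {growth_factor:.2f}")
print(f"coverage gain: at least {coverage_gain_points:.1f} percentage points")
print(f"covered sequences: about {covered_1999} -> at least about {covered_latest}")
```

Because SPTR itself grew substantially over the same period, the absolute number of covered sequences rises much faster than the percentage alone suggests.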
The coverage of SPTR by InterPro signatures has increased by nearly 15%, a significant figure considering that the SPTR databases themselves have increased from 279 794 to 734 448 protein sequences over the same period of time. There may be an overlap in coverage by entries which are children of or found in other entries, so a protein may hit several entries. The coverage of InterPro in complete proteomes ranges from 64% to 74% in eukaryotes, with a coverage of 73.5% of the non-redundant human proteome, and averages 66–68% in prokaryotes, with some having a coverage of up to 75%. In most cases a hit to InterPro provides useful functional information; however, there are 370 entries that describe proteins of unknown function and hence prevent inference of function. Nevertheless, these entries do group related proteins, and if one protein in the entry is biochemically characterised then this may shed light on the function of the related proteins.
NEW FEATURES
Several new features have been introduced into InterPro since the last publication in this journal in 2000. On the annotation side, InterPro entries have been mapped to Gene Ontology (GO) (10) terms where a term applies to all proteins matching that entry. Not all entries can be mapped, owing to low specificity in function or process, but for those that can this provides a powerful tool for automatic large-scale annotation of proteins to GO terms. Currently, 4102 InterPro entries have been mapped to 1899 unique GO terms, which results in automatic GO assignment to 405 684 unique proteins in SPTR.
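The automatic transfer of GO terms described above amounts to a lookup from each protein's matching entries to the terms mapped to those entries. The following is a minimal sketch under that assumption; the function name, the dictionary layout and all identifiers are hypothetical placeholders rather than real InterPro, GO or SPTR accessions.

```python
from typing import Dict, Iterable, Set


def transfer_go_terms(
    protein_matches: Dict[str, Iterable[str]],
    entry_to_go: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Assign to each protein the GO terms mapped to the entries it matches.

    Entries without a mapping contribute nothing, and proteins that end up
    with no terms are left out of the result.
    """
    annotations: Dict[str, Set[str]] = {}
    for protein, entries in protein_matches.items():
        terms: Set[str] = set()
        for entry in entries:
            terms.update(entry_to_go.get(entry, set()))
        if terms:
            annotations[protein] = terms
    return annotations


# Placeholder identifiers, for illustration only.
entry_to_go = {
    "ENTRY_A": {"GO_TERM_1", "GO_TERM_2"},
    "ENTRY_B": {"GO_TERM_2", "GO_TERM_3"},
    # "ENTRY_C" is unmapped, e.g. because its function is too unspecific.
}
protein_matches = {
    "PROT_1": ["ENTRY_A", "ENTRY_B"],
    "PROT_2": ["ENTRY_C"],
    "PROT_3": ["ENTRY_B", "ENTRY_C"],
}

go_annotations = transfer_go_terms(protein_matches, entry_to_go)
```

The transfer is only sound because, as stated in the text, a term is mapped to an entry only when it applies to all proteins matching that entry.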
A notable improvement in InterPro has been in the searching capabilities. The sequence search package, InterProScan (11), has been extended to include all new member databases and data, and the Perl stand-alone version has additional features, including allowance for GO annotation, and the potential to plug in the transmembrane and signal peptide prediction programs TMHMM (12) and SignalP (13) respectively. InterProScan is available for interactive as well as email sequence submissions. Additional files, for example a list of all InterPro entries, a list of InterPro to GO mappings and a summary of all protein matches are now available on the FTP site. The text search capabilities have been extended to both a simple text search and an SRS-based (14) search facility for more complex queries.
InterPro has developed an improved user interface for visualisation of the protein matches in a condensed graphical view derived from the ProDom graphical interface (4). The consensus domain boundaries are computed, and the resulting protein matches are combined rather than each signature being displayed (Fig. 1A,B). Parent/child related InterPro entries are collapsed into one line, while domain entries are shown on separate lines, thereby providing a simple view of family and domain composition. From this view, all proteins sharing a common domain architecture can be grouped, and the sequences aligned and visualised using Jalview (http://www.ebi.ac.uk/~michele/jalview/) or DisplayFam (15). Recently, the general web interface for InterPro has been updated, with changes that reflect style changes to the EBI web server. A useful addition to the pages is the option to display them as simple HTML, a printer-friendly version, XML and the default view with or without the menu.
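The idea of combining signature matches into a condensed view can be sketched as follows. The text does not specify how the consensus domain boundaries are computed, so this example substitutes a simple union of overlapping match ranges per entry as a stand-in; the function names, coordinates and identifiers are all hypothetical.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # inclusive (start, end) positions on the protein


def merge_matches(matches: List[Range]) -> List[Range]:
    """Merge overlapping ranges into their union, sorted by start position."""
    merged: List[Range] = []
    for start, end in sorted(matches):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend it if necessary.
            previous_start, previous_end = merged[-1]
            merged[-1] = (previous_start, max(previous_end, end))
        else:
            merged.append((start, end))
    return merged


def condensed_view(matches_by_entry: Dict[str, List[Range]]) -> Dict[str, List[Range]]:
    """Combine the signature matches of each entry into one line of ranges."""
    return {entry: merge_matches(ranges) for entry, ranges in matches_by_entry.items()}


# Two signatures of the same (placeholder) entry hit nearly the same region;
# a second entry hits elsewhere on the protein.
matches_by_entry = {
    "ENTRY_X": [(15, 118), (10, 120)],
    "ENTRY_Y": [(200, 260), (300, 340)],
}

view = condensed_view(matches_by_entry)
```

A real consensus calculation might instead average or vote on the boundaries reported by individual signatures; the union used here is only the simplest choice that yields one combined match per region.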
DISCUSSION
The amalgamation of the major protein signature databases into InterPro has proven to be an enormous success, and has produced a powerful tool for protein sequence analysis and characterisation. The tools and data have numerous applications described in more detail elsewhere (16), and InterPro has been the tool of choice for the annotation of new genomes, including the human genome (17). Future plans involve integration of the next database, PIR superfamilies (18), which facilitate protein family information retrieval, identification of domain and family relationships and classification of multi-domain proteins. In addition, there are plans for expansion into the field of protein secondary and tertiary structure. Protein structure information is vital in understanding protein function and evolutionary relationships. A project has been initiated to rationalise the data of SCOP (Structural Classification of Proteins) (19), CATH (Class, Architecture, Topology, Homology) (20), and SWISS-MODEL 3D structure homology models (21) with that of InterPro. This integration will enhance the capability of the database in the field of protein classification and characterisation and make the database a truly integrated resource for complete protein sequence and structure information.
The InterPro database is available via a webserver (http://www.ebi.ac.uk/interpro) and anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro).
SUPPLEMENTARY MATERIAL
Supplementary Material is available at NAR Online.
ACKNOWLEDGEMENT
The InterPro project is supported by the ProFuSe grant (no. QLG2-CT-2000-00517) of the European Commission.
REFERENCES
1. Falquet,L., Pagni,M., Bucher,P., Hulo,N., Sigrist,C.J.A., Hofmann,K. and Bairoch,A. (2002) The PROSITE database, its status in 2002. Nucleic Acids Res., 30, 235–238.
2. Attwood,T.K., Blythe,M.J., Flower,D.R., Gaulton,A., Mabey,J.E., Maudling,N., McGregor,L., Mitchell,A.L., Moulton,G., Paine,K. et al. (2002) PRINTS and PRINTS-S shed light on protein ancestry. Nucleic Acids Res., 30, 239–241.
3. Bateman,A., Birney,E., Cerruti,L., Durbin,R., Etwiller,L., Eddy,S.R., Griffiths-Jones,S., Howe,K.L., Marshall,M. and Sonnhammer,E.L.L. (2002) The Pfam Protein Families Database. Nucleic Acids Res., 30, 276–280.
4. Corpet,F., Servant,F., Gouzy,J. and Kahn,D. (2000) ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons. Nucleic Acids Res., 28, 267–269.
5. Apweiler,R., Attwood,T.K., Bairoch,A., Bateman,A., Birney,E., Biswas,M., Bucher,P., Cerutti,L., Corpet,F., Croning,M.D. et al. (2001) The InterPro database, an integrated documentation resource for protein families, domains and functional sites. Nucleic Acids Res., 29, 37–40.
6. Letunic,I., Goodstadt,L., Dickens,N.J., Doerks,T., Schultz,J., Mott,R., Ciccarelli,F., Copley,R.R., Ponting,C.P. and Bork,P. (2002) Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res., 30, 242–244.
7. Haft,D.H., Loftus,B.J., Richardson,D.L., Yang,F., Eisen,J.A., Paulsen,I.T. and White,O. (2001) TIGRFAMs: a protein family resource for the functional identification of proteins. Nucleic Acids Res., 29, 41–43.
8. Bairoch,A. and Apweiler,R. (2000) The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res., 28, 45–48.
9. Doerks,T., Copley,R.R., Schultz,J., Ponting,C.P. and Bork,P. (2002) Systematic identification of novel protein domain families associated with nuclear functions. Genome Res., 12, 47–56.
10. The Gene Ontology Consortium (2001) Creating the gene ontology resource: design and implementation. Genome Res., 11, 1425–1433.
11. Zdobnov,E.M. and Apweiler,R. (2001) InterProScan - an integration platform for the signature-recognition methods in InterPro. Bioinformatics, 17, 847–848.
12. Krogh,A., Larsson,B., von Heijne,G. and Sonnhammer,E.L. (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol., 305, 567–580.
13. Nielsen,H., Engelbrecht,J., Brunak,S. and von Heijne,G. (1997) A neural network method for identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Int. J. Neural Syst., 8, 581–599.
14. Etzold,T., Ulyanov,A. and Argos,P. (1996) SRS: information retrieval system for molecular biology data banks. Methods Enzymol., 266, 114–128.
15. Corpet,F., Gouzy,J. and Kahn,D. (1999) Browsing protein families via the Rich Family Description format. Bioinformatics, 15, 1020–1027.
16. Biswas,M., O'Rourke,J.F., Camon,E., Fraser,G., Kanapin,A., Karavidopoulou,Y., Kersey,P., Kriventseva,E., Mittard,V., Mulder,N. et al. (2002) Applications of InterPro in protein annotation and genome analysis. Brief. Bioinform., 3, 285–295.
17. The International Human Genome Consortium (2001) Initial sequencing and analysis of the human genome. Nature, 409, 860–921.
18. Wu,C.H., Xiao,C., Hou,Z., Huang,H. and Barker,W.C. (2001) iProClass: an integrated, comprehensive and annotated protein classification database. Nucleic Acids Res., 29, 52–54.
19. Lo Conte,L., Brenner,S.E., Hubbard,T.J., Chothia,C. and Murzin,A.G. (2002) SCOP database in 2002: refinements accommodate structural genomics. Nucleic Acids Res., 30, 264–267.
20. Pearl,F.M., Lee,D., Bray,J.E., Buchan,D.W., Shepherd,A.J. and Orengo,C.A. (2002) The CATH extended protein-family database: providing structural annotations for genome sequences. Protein Sci., 11, 233–244.
21. Guex,N. and Peitsch,M.C. (1997) SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modelling. Electrophoresis, 18, 2714–2723.
||||
![]() |
A. Heger, C. A. Wilton, A. Sivakumar, and L. Holm ADDA: a domain database with global coverage of the protein universe Nucleic Acids Res., January 1, 2005; 33(suppl_1): D188 - D191. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Abhiman and E. L. L. Sonnhammer FunShift: a database of function shift analysis on protein subfamilies Nucleic Acids Res., January 1, 2005; 33(suppl_1): D197 - D200. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Bru, E. Courcelle, S. Carrere, Y. Beausse, S. Dalmar, and D. Kahn The ProDom database of protein domain families: more emphasis on 3D Nucleic Acids Res., January 1, 2005; 33(suppl_1): D212 - D215. [Abstract] [Full Text] [PDF] |
||||
![]() |
N. Kaplan, O. Sasson, U. Inbar, M. Friedlich, M. Fromer, H. Fleischer, E. Portugaly, N. Linial, and M. Linial ProtoNet 4.0: A hierarchical classification of one million protein sequences Nucleic Acids Res., January 1, 2005; 33(suppl_1): D216 - D218. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Vlahovicek, L. Kajan, V. Agoston, and S. Pongor The SBASE domain sequence resource, release 12: prediction of protein domain-architecture using support vector machines Nucleic Acids Res., January 1, 2005; 33(suppl_1): D223 - D225. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Meinel, A. Krause, H. Luz, M. Vingron, and E. Staub The SYSTERS Protein Family Database in 2005 Nucleic Acids Res., January 1, 2005; 33(suppl_1): D226 - D229. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Velankar, P. McNeil, V. Mittard-Runte, A. Suarez, D. Barrell, R. Apweiler, and K. Henrick E-MSD: an integrated data resource for bioinformatics Nucleic Acids Res., January 1, 2005; 33(suppl_1): D262 - D265. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Mi, B. Lazareva-Ulitsky, R. Loo, A. Kejariwal, J. Vandergriff, S. Rabkin, N. Guo, A. Muruganujan, O. Doremieux, M. J. Campbell, et al. The PANTHER database of protein families, subfamilies, functions and pathways Nucleic Acids Res., January 1, 2005; 33(suppl_1): D284 - D288. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Kersey, L. Bower, L. Morris, A. Horne, R. Petryszak, C. Kanz, A. Kanapin, U. Das, K. Michoud, I. Phan, et al. Integr8 and Genome Reviews: integrated views of complete genomes and proteomes Nucleic Acids Res., January 1, 2005; 33(suppl_1): D297 - D302. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Stothard, G. Van Domselaar, S. Shrivastava, A. Guo, B. O'Neill, J. Cruz, M. Ellison, and D. S. Wishart BacMap: an interactive picture atlas of annotated bacterial genomes Nucleic Acids Res., January 1, 2005; 33(suppl_1): D317 - D320. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Wang, Q. Xia, X. He, M. Dai, J. Ruan, J. Chen, G. Yu, H. Yuan, Y. Hu, R. Li, et al. SilkDB: a knowledgebase for silkworm biology and genomics Nucleic Acids Res., January 1, 2005; 33(suppl_1): D399 - D402. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Wang, X. He, J. Ruan, M. Dai, J. Chen, Y. Zhang, Y. Hu, C. Ye, S. Li, L. Cong, et al. ChickVD: a sequence variation database for the chicken genome Nucleic Acids Res., January 1, 2005; 33(suppl_1): D438 - D441. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Hubbard, D. Andrews, M. Caccamo, G. Cameron, Y. Chen, M. Clamp, L. Clarke, G. Coates, T. Cox, F. Cunningham, et al. Ensembl 2005 Nucleic Acids Res., January 1, 2005; 33(suppl_1): D447 - D453. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Parkinson, U. Sarkans, M. Shojatalab, N. Abeygunawardena, S. Contrino, R. Coulson, A. Farne, G. Garcia Lara, E. Holloway, M. Kapushesky, et al. ArrayExpress--a public repository for microarray gene expression data at the EBI Nucleic Acids Res., January 1, 2005; 33(suppl_1): D553 - D555. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Rudd openSputnik--a database to ESTablish comparative plant genomics using unsaturated sequence collections Nucleic Acids Res., January 1, 2005; 33(suppl_1): D622 - D627. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Ito, K. Arikawa, B. A. Antonio, I. Ohta, S. Naito, Y. Mukai, A. Shimano, M. Masukawa, M. Shibata, M. Yamamoto, et al. Rice Annotation Database (RAD): a contig-oriented database for map-based rice genomics Nucleic Acids Res., January 1, 2005; 33(suppl_1): D651 - D655. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. D. Gonzales, E. Archuleta, A. Farmer, K. Gajendran, D. Grant, R. Shoemaker, W. D. Beavis, and M. E. Waugh The Legume Information System (LIS): an integrated information resource for comparative legume biology Nucleic Acids Res., January 1, 2005; 33(suppl_1): D660 - D665. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Pinter, D. D. Lent, and N. J. Strausfeld Memory consolidation and gene expression in Periplaneta americana Learn. Mem., January 1, 2005; 12(1): 30 - 38. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Galani, T. A. Nissan, E. Petfalski, D. Tollervey, and E. Hurt Rea1, a Dynein-related Nuclear AAA-ATPase, Is Involved in Late rRNA Processing and Nuclear Export of 60 S Subunits J. Biol. Chem., December 31, 2004; 279(53): 55411 - 55418. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Mashiguchi, I. Yamaguchi, and Y. Suzuki Isolation and Identification of Glycosylphosphatidylinositol-Anchored Arabinogalactan Proteins and Novel {beta}-Glucosyl Yariv-Reactive Proteins from Seeds of Rice (Oryza sativa) Plant Cell Physiol., December 15, 2004; 45(12): 1817 - 1829. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Sauter, K. A. Cornell, S. Beszteri, and G. Rzewuski Functional Analysis of Methylthioribose Kinase Genes in Plants Plant Physiology, December 1, 2004; 136(4): 4061 - 4071. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. J. McElwee, E. Schuster, E. Blanc, J. H. Thomas, and D. Gems Shared Transcriptional Signature in Caenorhabditis elegans Dauer Larvae and Long-lived daf-2 Mutants Implicates Detoxification System in Longevity Assurance J. Biol. Chem., October 22, 2004; 279(43): 44533 - 44543. [Abstract] [Full Text] [PDF] |
||||
![]() |
J.-M. Tsai, H.-C. Wang, J.-H. Leu, H.-H. Hsiao, A. H.-J. Wang, G.-H. Kou, and C.-F. Lo Genomic and Proteomic Analysis of Thirty-Nine Structural Proteins of Shrimp White Spot Syndrome Virus J. Virol., October 15, 2004; 78(20): 11360 - 11370. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||


























