Nucleic Acids Research 2005 33(Database Issue):D201-D205; doi:10.1093/nar/gki106

Nucleic Acids Research, 2005, Vol. 33, Database issue D201-D205
© 2005, the authors
Nucleic Acids Research, Vol. 33, Database issue © Oxford University Press 2005; all rights reserved

InterPro, progress and status in 2005

Nicola J. Mulder1,*, Rolf Apweiler1, Teresa K. Attwood3, Amos Bairoch4, Alex Bateman2, David Binns1, Paul Bradley1,3, Peer Bork5, Phillip Bucher6, Lorenzo Cerutti6, Richard Copley7, Emmanuel Courcelle8, Ujjwal Das1, Richard Durbin2, Wolfgang Fleischmann1, Julian Gough9, Daniel Haft10, Nicola Harte1, Nicolas Hulo4, Daniel Kahn8, Alexander Kanapin1, Maria Krestyaninova1, David Lonsdale1, Rodrigo Lopez1, Ivica Letunic5, Martin Madera11, John Maslen1, Jennifer McDowall1, Alex Mitchell1,3, Anastasia N. Nikolskaya12, Sandra Orchard1, Marco Pagni6, Chris P. Ponting13, Emmanuel Quevillon1, Jeremy Selengut10, Christian J. A. Sigrist4, Ville Silventoinen1, David J. Studholme2, Robert Vaughan1 and Cathy H. Wu12

1 EMBL Outstation—European Bioinformatics Institute and 2 Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK, 3 School of Biological Sciences and Department of Computer Science, The University of Manchester, Manchester, UK, 4 Swiss Institute for Bioinformatics, Geneva, Switzerland, 5 Biocomputing Unit EMBL, Heidelberg, Germany, 6 Swiss Institute for Experimental Cancer Research, Lausanne, Switzerland, 7 Wellcome Trust Centre for Human Genetics, Oxford, UK, 8 CNRS/INRA, Toulouse, France, 9 Genomic Sciences Centre, RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Japan, 10 The Institute for Genomic Research, MD, USA, 11 MRC Laboratory of Molecular Biology, Cambridge, UK, 12 Protein Information Resource, Georgetown University Medical Center, Washington, DC, USA and 13 MRC Functional Genetics Unit, Department of Human Anatomy and Genetics, University of Oxford, Oxford, UK

* To whom correspondence should be addressed. Tel: +44 0 1223 494 602; Fax: +44 0 1223 494 468; Email: mulder{at}ebi.ac.uk

Received September 20, 2004; Revised and Accepted October 18, 2004


    ABSTRACT
 
InterPro, an integrated documentation resource of protein families, domains and functional sites, was created to integrate the major protein signature databases. Currently, it includes PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF and SUPERFAMILY. Signatures are manually integrated into InterPro entries that are curated to provide biological and functional information. Annotation is provided in an abstract, Gene Ontology mapping and links to specialized databases. New features of InterPro include extended protein match views, taxonomic range information and protein 3D structure data. One of the new match views is the InterPro Domain Architecture view, which shows the domain composition of protein matches. Two new entry types were introduced to better describe InterPro entries: these are active site and binding site. PIRSF and the structure-based SUPERFAMILY are the latest member databases to join InterPro, and CATH and PANTHER are soon to be integrated. InterPro release 8.0 contains 11 007 entries, representing 2573 domains, 8166 families, 201 repeats, 26 active sites, 21 binding sites and 20 post-translational modification sites. InterPro covers over 78% of all proteins in the Swiss-Prot and TrEMBL components of UniProt. The database is available for text- and sequence-based searches via a webserver (http://www.ebi.ac.uk/interpro), and for download by anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro).


    INTRODUCTION
 
The genome sequencing centres are generating raw sequence data at an alarming rate, creating a need for automated sequence analysis methods. The automatic analysis of protein sequences is possible through the use of ‘protein signatures’, which are methods for diagnosing a domain or characteristic region of a protein family in a protein sequence. A number of protein signature databases have been developed, each using a variation on the handful of signature methods available, which include patterns, profiles and hidden Markov models (HMMs). These databases are most effective when used together, rather than in isolation. InterPro (1) integrates into one resource the major protein signature databases: PROSITE (2), which uses regular expressions and profiles, PRINTS (3), which uses position-specific scoring matrix-based (PSSM-based) fingerprints, ProDom (4), which uses automatic sequence clustering, and Pfam (5), SMART (6), TIGRFAMs (7), PIRSF (also known as PIR SuperFamily) (8) and SUPERFAMILY (9), all of which use HMMs.
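To make the pattern-based signature method concrete, the following minimal sketch translates a pattern written in PROSITE-style syntax into a Python regular expression and scans a sequence with it. The function name, the example pattern and the toy sequence are illustrative assumptions rather than part of InterPro, and the translation covers only the common syntax elements (`x`, repeat counts, `[...]`, `{...}` and the `<`/`>` anchors outside brackets).

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Hypothetical helper for illustration; handles only the common elements.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P}: any residue except P
        element = element.replace("x", ".")                     # x: any residue
        element = element.replace("(", "{").replace(")", "}")   # x(2,4): repeat 2 to 4 times
        element = element.replace("<", "^").replace(">", "$")   # sequence start / end anchors
        parts.append(element)
    return "".join(parts)

# Illustrative pattern in PROSITE syntax and a made-up toy sequence.
pattern = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"
regex = prosite_to_regex(pattern)
sequence = "AA" + "C" + "AA" + "C" + "AAA" + "L" + "A" * 8 + "H" + "AAA" + "H"
hit = re.search(regex, sequence)
```

Profiles and HMMs score every position instead of requiring an exact match, which is why they tend to detect more divergent family members than a pattern of this kind.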

Signatures from the member databases are integrated manually as they are developed. A team of biologists has this responsibility, as well as that of annotating the new or existing entries. Each InterPro entry is described by one or more signatures, and corresponds to a biologically meaningful family, domain, repeat or site, e.g. post-translational modification (PTM). Not every entry will contain a signature from each member database; only those signatures that correspond to each other are united. Entries are assigned a type to describe what they represent, which may be family, domain, repeat, PTM, active site or binding site. The last two are new entry types, which were introduced to better describe the signatures in some of the entries. Entries may be related to each other through two different relationships: parent/child and contains/found in. Parent/child relationships are used to describe a common ancestry between entries, whereas the contains/found in relationship generally refers to the presence of genetically mobile domains. InterPro entries are annotated with a name, an abstract, mapping to Gene Ontology (GO) terms and links to specialized databases. InterPro groups all protein sequences matching related signatures into entries. All hits of the protein signatures in InterPro against a composite of the Swiss-Prot and TrEMBL components of UniProt (10) are precomputed. The matches are available for viewing in each InterPro entry in different formats.

The number of entries in InterPro and its coverage of protein space continue to grow. The beta release of InterPro in 1999 contained 2423 entries, while the latest release of the database contains 11 007 entries, representing a roughly 4.5-fold increase in 5 years. In its infancy, InterPro covered ~66% of all proteins in Swiss-Prot and TrEMBL, and this has increased to over 90% for Swiss-Prot, 76% for TrEMBL and 78% for UniProt (Swiss-Prot and TrEMBL). A number of new features have been added to the InterPro database since its publication in Nucleic Acids Research in 2003. These include additional protein match views, the InterPro Domain Architectures Viewer, taxonomic range information, additional database links and protein 3D structural information. The new member databases that have been integrated are the full-length sequence-based PIRSF database and the structure-based SUPERFAMILY. These are described in more detail below.
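The release figures quoted in the abstract and in the paragraph above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Entry counts by type for InterPro release 8.0, as given in the abstract.
entries_by_type = {
    "domain": 2573,
    "family": 8166,
    "repeat": 201,
    "active site": 26,
    "binding site": 21,
    "post-translational modification site": 20,
}

total_entries = sum(entries_by_type.values())

# Growth from the 1999 beta release (2423 entries) to release 8.0 (11 007 entries).
beta_entries = 2423
fold_increase = total_entries / beta_entries
```

The per-type counts add up to the stated total of 11 007 entries, and the growth since the beta release works out at about 4.5-fold.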


    NEW FEATURES OF INTERPRO
 
Protein match views
For each protein signature, a list of the UniProt proteins that it matches is precomputed. This list is updated when new proteins enter UniProt or when the signatures themselves change. The match lists may be viewed in a number of formats, including a table view, a detailed view and an overview. There are new options for the ordering of proteins within these views. For example, all of the views can be ordered by Swiss-Prot ID or restricted to proteins of known structure. The overview and detailed view can also be ordered by UniProt accession number, and the overview can be ordered by taxonomy too. In the overview, clicking on the protein accession number takes the user to the detailed view for that protein. Searching for a protein accession number in the InterPro text search with the ‘find protein matches’ option returns the overview of matches for the protein. Similarly, the detailed view is retrieved through the accession number link (see Figure 1). For the graphical views, a mouse-over displays the actual positions of the matches on the sequence.



Figure 1. Illustration of the detailed view for protein Q06124, the human protein-tyrosine phosphatase, non-receptor type 11. From an InterPro entry page, clicking on a protein accession number in the ‘Examples’ field takes you to this view for that protein. The oval shapes at the top of the figure display the InterPro Domain Architecture (IDA) view for this protein, which represents its domain composition. Each oval shape contains the domain name and, if greater than one, the number of consecutive occurrences of the domain. The InterPro detailed view represents the protein sequence as a series of lines, one for each protein signature hit. The bars are colour coded according to the member database. A separate view below the signature matches displays the structural domains from SCOP and CATH as white-striped bars. This view provides a complete picture of the protein domain composition and shows where sequence-based domains correspond to known structures.

 
Where structures are available for proteins, there is a link from the graphical views to the corresponding Protein Data Bank (PDB) structures and a separate line in the display, below the InterPro matches, showing the hyperlinked SCOP (11), CATH (12) and PDB (13) matches on the sequence as white-striped bars (see Figure 1). This shows where the protein signatures correspond with structural chains. An Astex icon is available for structures; clicking on it loads the AstexViewer™ Java applet page, which displays the PDB structure with the residues included in the CATH or SCOP domain definition highlighted on the PDB chain.

InterPro Domain Architecture viewer
The InterPro Domain Architecture (IDA) viewer is a graphical representation of protein domain architecture, where the domain architecture of a protein sequence is displayed as a series of non-overlapping domains (see Figure 1). These domains are calculated by a method that identifies a subset of InterPro entries/methods, representing non-overlapping domains within proteins. If two domains overlap slightly, their centres are used to order the domains, and domain boundaries are discarded to enhance a comparison of various architectures. If a parent/child hierarchy exists between InterPro domains, matches with the children are represented as those from the parent entry. For each InterPro entry, a graphical representation of unique IDA(s) is provided and each kind of IDA is displayed with an example protein and total number of proteins, sharing this architecture, next to it. Clicking on the count of proteins retrieves all proteins sharing a common architecture. Although domains should not overlap, inserted domains (e.g. nested domains) are still shown in the IDA viewer, as this provides more accurate comparison between IDAs.
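The ordering and collapsing steps described above can be sketched as follows. This is a minimal illustration, not the InterPro implementation: it assumes the subset of non-overlapping domain matches has already been selected, and the `Match` class, the function names and the entry labels are hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby

@dataclass(frozen=True)
class Match:
    entry: str   # entry label (hypothetical placeholders, not real accessions)
    start: int   # first matched residue
    end: int     # last matched residue

def domain_architecture(matches, parent_of):
    """Order matches by their centres and report each under its top-level parent.

    Domain boundaries are discarded; only the order of the domains is kept.
    """
    def top_parent(entry):
        while entry in parent_of:
            entry = parent_of[entry]
        return entry

    ordered = sorted(matches, key=lambda m: (m.start + m.end) / 2)
    return [top_parent(m.entry) for m in ordered]

def collapse_repeats(architecture):
    """Merge consecutive occurrences of a domain into (name, count) pairs."""
    return [(name, len(list(group))) for name, group in groupby(architecture)]

# Toy protein with two SH2-like matches followed by a phosphatase-like match.
matches = [
    Match("PTPase", 247, 521),
    Match("SH2_child", 6, 102),
    Match("SH2", 112, 216),
]
parent_of = {"SH2_child": "SH2"}  # child entry -> parent entry

architecture = domain_architecture(matches, parent_of)
ovals = collapse_repeats(architecture)
```

Two proteins can then be compared by testing their collapsed architectures for equality, which is one way to group proteins that share a common architecture.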

Taxonomy viewer
A new feature in InterPro is the ‘Taxonomy’ field, which aims to provide an ‘at a glance’ view of the taxonomic range of the sequences associated with each InterPro entry. This is represented as a circular display with the taxonomy-tree root as its centre. The lineages populating the nodes were selected to provide a view of the major groups of organisms, with the model organisms on the outermost circle. Nodes of the taxonomy tree are placed on the inner circles, and radial lines lead to the description for each node. No significance is attached to the position of a node on a particular inner circle, although some attempt has been made to group nodes. The nodes themselves are either true taxonomy nodes or artificial nodes, of which there are three: ‘Unclassified’, ‘Other Eukaryota (Non-Metazoa)’ and the ‘Plastid Group’. The number of sequences associated with each lineage is displayed, and clicking the number retrieves the graphical overview for proteins within that taxonomic group.

Database cross-references
In addition to cross-referencing the member database signatures and GO (14) terms, there is a separate field in InterPro entries, ‘Database Links’, to provide cross-references to other databases. These include cross-references to corresponding Blocks accession numbers, PROSITE documentation, the CArbohydrate-Active EnZymes (CAZy) website and the Enzyme Commission (EC) Database. New links have been added to the IUPHAR Receptor Database, the MEROPS Peptidase Database (15) and COMe. The bioinorganic motif database, COMe, is an attempt to classify metalloproteins and some other complex proteins using the concept of bioinorganic motifs.

3D Structural information
A separate field, called ‘Structural links’, provides information on curated structure links. Structural domains from SCOP (11) and CATH (12) are made up of one or more protein chains in a PDB entry. These may include the full chain or region(s) of chain(s). The links to the curated structural domains in this field of InterPro entries are based on the correspondence between the proteins matching the InterPro entry and those proteins of known structure belonging to SCOP or CATH superfamilies. In addition, they include only those links where the structural domains overlap considerably with one or more of the InterPro signatures on the protein sequence. The structural domains are also displayed at the protein level in the graphical views, as described above. Here, all the representative domains at the SCOP/CATH family level are displayed, showing the location of the structural domain(s) in the protein. This enables the user to directly access the SCOP/CATH classification for that particular domain from the protein's detailed graphical view. Mapping between UniProt and PDB entries can be many-to-many, so the ‘Structure’ link displays all the PDB entries associated with that particular protein. The user is able to view the residue-by-residue mapping between UniProt and a PDB chain of interest. This is a useful and unique tool for showing such relationships in a compact way.
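The requirement that a structural domain overlap considerably with a signature match can be illustrated with a simple interval calculation. The text does not state how the overlap is measured or what threshold is applied, so the measure used here (intersection divided by the length of the shorter region), the function name and the example coordinates are all assumptions.

```python
def overlap_fraction(region_a, region_b):
    """Fraction of the shorter region covered by the overlap of the two regions.

    Regions are (start, end) residue positions, inclusive at both ends.
    """
    overlap = min(region_a[1], region_b[1]) - max(region_a[0], region_b[0]) + 1
    if overlap <= 0:
        return 0.0
    shorter = min(region_a[1] - region_a[0] + 1, region_b[1] - region_b[0] + 1)
    return overlap / shorter

# Made-up coordinates: a signature match and a structural domain on one sequence.
signature_match = (1, 100)
structural_domain = (51, 150)
fraction = overlap_fraction(signature_match, structural_domain)
```

A link would then be kept only when the fraction exceeds whatever cutoff the curators consider "considerable".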

New member databases
Two of the newest member databases to join the InterPro Consortium are PIRSF (8) and the structure-based SUPERFAMILY (9). PIRSF is a network classification system that accommodates a flexible number of levels from superfamily to subfamily to reflect varying degrees of sequence conservation. Members of a PIRSF homeomorphic family share full-length sequence similarity with a common domain architecture (homeomorphic) and have common evolutionary origin (monophyletic). PIRSF HMMs are designed to cover the full length of a protein sequence, and thus to include all domains within the sequence. In this way, PIRSF homeomorphic families tend to encompass one or more of the existing InterPro domain entries and show the domain composition of UniProt sequences. Classification based on full-length proteins allows annotation of both generic biochemical and specific biological functions, identification of domain and family relationships, and classification of multidomain proteins. SUPERFAMILY is the first member database that is based solely on structural protein families rather than sequence-based protein families. SUPERFAMILY is a collection of HMMs built from members of SCOP structural superfamilies. This facilitates comparison of protein families based on structure and sequence and adds a new dimension to InterPro entries. Many of the SUPERFAMILY HMMs actually correspond to Pfam HMMs, but also, unsurprisingly, to the structural links in InterPro generated from the SCOP and CATH links to proteins in the PDB.


    DISCUSSION
 
The amalgamation of PROSITE, PRINTS, Pfam, ProDom, SMART, TIGRFAMs, PIRSF and SUPERFAMILY into InterPro has provided a useful tool for protein sequence analysis and characterization. InterPro has a number of applications and databases dependent on its continued success. It is the tool of choice for the annotation of new genomes and is used extensively for the automatic annotation of TrEMBL entries. The mappings of InterPro to GO terms (14) provide a means of large-scale mapping of proteins onto GO terms. This accounts for the bulk of the UniProt proteins that are mapped to GO terms. In addition, InterPro is used for the Proteome Analysis Database (16), to provide statistical analyses of whole proteomes for the completely sequenced genomes. For each proteome, the database provides tables of all the InterPro matches ordered by the number of proteins matching the entries, the top 30 InterPro hits, the top 200 hits, the 15 most common families, etc. A tool is also available to perform proteome comparisons between two or more organisms of choice through InterPro analyses.

The InterPro database is available via a webserver (http://www.ebi.ac.uk/interpro) and anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro). The webserver facilitates text and sequence searches, and the FTP site provides regular releases of an XML file for downloading. Future plans for InterPro involve the integration of the next two member databases, CATH HMMs and the PANTHER database. In addition, SWISS-MODEL 3D structure homology models (17) will be displayed in protein graphical views to provide predicted structural information for proteins whose structures have not been solved. InterPro is growing along with its member databases, and has increased its coverage of the UniProt protein sequence database. The resource continues to expand and to provide up-to-date data and new features, which increases its usefulness to the scientific community as a powerful protein classification tool.
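As an illustration of how the downloadable XML release might be processed, the sketch below counts entries by type using the Python standard library. The element and attribute names (`interprodb`, `interpro`, `id`, `type`, `name`) and the sample content are assumptions made for this example; the actual layout should be checked against the file and documentation distributed on the FTP site.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical stand-in for the release file; identifiers are placeholders.
SAMPLE_XML = """<interprodb>
  <interpro id="IPR_EXAMPLE_1" type="Domain"><name>Example domain</name></interpro>
  <interpro id="IPR_EXAMPLE_2" type="Family"><name>Example family</name></interpro>
  <interpro id="IPR_EXAMPLE_3" type="Domain"><name>Another domain</name></interpro>
</interprodb>"""

def count_entry_types(xml_text):
    """Count entries by their 'type' attribute (assumed attribute name)."""
    root = ET.fromstring(xml_text)
    return Counter(entry.get("type") for entry in root.iter("interpro"))

type_counts = count_entry_types(SAMPLE_XML)
```

For a full release file, streaming the document with `xml.etree.ElementTree.iterparse` would avoid loading it into memory in one piece.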


    ACKNOWLEDGEMENTS
 
The InterPro project is supported by the ProFuSe grant (number QLG2-CT-2000-00517) and the Integr8 grant (number QLRI-CT-2001000015) of the European Commission.


    Notes
 
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use permissions, please contact journals.permissions{at}oupjournals.org.


    REFERENCES
 

  1. Apweiler,R., Attwood,T.K., Bairoch,A., Bateman,A., Birney,E., Biswas,M., Bucher,P., Cerutti,L., Corpet,F., Croning,M.D. et al. (2001) The InterPro database, an integrated documentation resource for protein families, domains and functional sites. Nucleic Acids Res., 29, 37–40.

  2. Hulo,N., Sigrist,C.J., Le Saux,V., Langendijk-Genevaux,P.S., Bordoli,L., Gattiker,A., De Castro,E., Bucher,P. and Bairoch,A. (2004) The PROSITE database, its status in 2002. Nucleic Acids Res., 30, 235–238.

  3. Attwood,T.K., Bradley,P., Flower,D.R., Gaulton,A., Maudling,N., Mitchell,A.L., Moulton,G., Nordle,A., Paine,K., Taylor,P., Uddin,A. and Zygouri,C. (2003) PRINTS and its automatic supplement pre-PRINTS. Nucleic Acids Res., 31, 400–402.

  4. Servant,F., Bru,C., Carrere,S., Courcelle,E., Gouzy,J., Peyruc,D. and Kahn,D. (2002) ProDom: automated clustering of homologous domains. Brief Bioinformatics, 3, 246–251.

  5. Bateman,A., Coin,L., Durbin,R., Finn,R.D., Hollich,V., Griffiths-Jones,S., Khanna,A., Marshall,M., Moxon,S., Sonnhammer,E.L., Studholme,D.J., Yeats,C. and Eddy,S.R. (2004) The Pfam protein families database. Nucleic Acids Res., 32, 138–141.

  6. Letunic,I., Copley,R.R., Schmidt,S., Ciccarelli,F.D., Doerks,T., Schultz,J., Ponting,C.P. and Bork,P. (2004) SMART 4.0: towards genomic data integration. Nucleic Acids Res., 32, 142–144.

  7. Haft,D.H., Selengut,J.D. and White,O. (2003) The TIGRFAMs database of protein families. Nucleic Acids Res., 31, 371–373.

  8. Wu,C.H., Nikolskaya,A., Huang,H., Yeh,L.S., Natale,D.A., Vinayaka,C.R., Hu,Z.Z., Mazumder,R., Kumar,S., Kourtesis,P., Ledley,R.S., Suzek,B.E., Arminski,L., Chen,Y., Zhang,J., Cardenas,J.L., Chung,S., Castro-Alvear,J., Dinkov,G. and Barker,W.C. (2004) PIRSF: family classification system at the Protein Information Resource. Nucleic Acids Res., 32, 112–114.

  9. Madera,M., Vogel,C., Kummerfeld,S.K., Chothia,C. and Gough,J. (2004) The SUPERFAMILY database in 2004: additions and improvements. Nucleic Acids Res., 32, 235–239.

  10. Apweiler,R., Bairoch,A., Wu,C.H., Barker,W.C., Boeckmann,B., Ferro,S., Gasteiger,E., Huang,H., Lopez,R., Magrane,M., Martin,M.J., Natale,D.A., O'Donovan,C., Redaschi,N. and Yeh,L.S. (2004) UniProt: the Universal Protein knowledgebase. Nucleic Acids Res., 32, 115–119.

  11. Andreeva,A., Howorth,D., Brenner,S.E., Hubbard,T.J., Chothia,C. and Murzin,A.G. (2004) SCOP database in 2004: refinements integrate structure and sequence family. Nucleic Acids Res., 32, 226–229.

  12. Orengo,C.A., Pearl,F.M. and Thornton,J.M. (2003) The CATH domain structure database. Methods Biochem. Anal., 44, 249–271.

  13. Berman,H., Henrick,K. and Nakamura,H. (2003) Announcing the worldwide Protein Data Bank. Nature Struct. Biol., 10, 980.

  14. Harris,M.A., Clark,J., Ireland,A., Lomax,J., Ashburner,M., Foulger,R., Eilbeck,K., Lewis,S., Marshall,B., Mungall,C., Richter,J., Rubin,G.M., Blake,J.A., Bult,C., Dolan,M., Drabkin,H., Eppig,J.T., Hill,D.P., Ni,L., Ringwald,M., Balakrishnan,R., Cherry,J.M., Christie,K.R., Costanzo,M.C., Dwight,S.S., Engel,S., Fisk,D.G., Hirschman,J.E., Hong,E.L., Nash,R.S., Sethuraman,A., Theesfeld,C.L., Botstein,D., Dolinski,K., Feierbach,B., Berardini,T., Mundodi,S., Rhee,S.Y., Apweiler,R., Barrell,D., Camon,E., Dimmer,E., Lee,V., Chisholm,R., Gaudet,P., Kibbe,W., Kishore,R., Schwarz,E.M., Sternberg,P., Gwinn,M., Hannick,L., Wortman,J., Berriman,M., Wood,V., de la Cruz,N., Tonellato,P., Jaiswal,P., Seigfried,T. and White,R. (2004) The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res., 32, 258–261.

  15. Rawlings,N.D., Tolle,D.P. and Barrett,A.J. (2004) MEROPS: the peptidase database. Nucleic Acids Res., 32, 160–164.

  16. Pruess,M., Fleischmann,W., Kanapin,A., Karavidopoulou,Y., Kersey,P., Kriventseva,E., Mittard,V., Mulder,N., Phan,I., Servant,F. and Apweiler,R. (2003) The Proteome Analysis database: a tool for the in silico analysis of whole proteomes. Nucleic Acids Res., 31, 414–417.

  17. Kopp,J. and Schwede,T. (2004) The SWISS-MODEL Repository of annotated three-dimensional protein structure homology models. Nucleic Acids Res., 32, 230–234.


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?


This article has been cited by other articles:


Home page
Nucleic Acids ResHome page
T. Davidsen, E. Beck, A. Ganapathy, R. Montgomery, N. Zafar, Q. Yang, R. Madupu, P. Goetz, K. Galinsky, O. White, et al.
The comprehensive microbial resource
Nucleic Acids Res., November 5, 2009; (2009) gkp912v1.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
V. M. Markowitz, I-M. A. Chen, K. Palaniappan, K. Chu, E. Szeto, Y. Grechkin, A. Ratner, I. Anderson, A. Lykidis, K. Mavromatis, et al.
The integrated microbial genomes system: an expanding comparative analysis resource
Nucleic Acids Res., October 28, 2009; (2009) gkp887v1.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
J. A. Encinar, G. Fernandez-Ballester, I. E. Sanchez, E. Hurtado-Gomez, F. Stricher, P. Beltrao, and L. Serrano
ADAN: a database for prediction of protein-protein interaction of modular domains mediated by linear motifs
Bioinformatics, September 15, 2009; 25(18): 2418 - 2424.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
Y. Benita, H. Kikuchi, A. D. Smith, M. Q. Zhang, D. C. Chung, and R. J. Xavier
An integrative genomics approach identifies Hypoxia Inducible Factor-1 (HIF-1)-target genes that form the core response to hypoxia
Nucleic Acids Res., August 1, 2009; 37(14): 4587 - 4602.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
S. Thongjuea, V. Ruanjaichon, R. Bruskiewich, and A. Vanavichit
RiceGeneThresher: a web-based application for mining genes underlying QTL in rice genome
Nucleic Acids Res., January 1, 2009; 37(suppl_1): D996 - D1000.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
A. Czerwoniec, S. Dunin-Horkawicz, E. Purta, K. H. Kaminska, J. M. Kasprzak, J. M. Bujnicki, H. Grosjean, and K. Rother
MODOMICS: a database of RNA modification pathways. 2008 update
Nucleic Acids Res., January 1, 2009; 37(suppl_1): D118 - D121.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
C. Aurrecoechea, J. Brestelli, B. P. Brunk, J. Dommer, S. Fischer, B. Gajria, X. Gao, A. Gingle, G. Grant, O. S. Harb, et al.
PlasmoDB: a functional genomic database for malaria parasites
Nucleic Acids Res., January 1, 2009; 37(suppl_1): D539 - D543.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
V. G. Tarcea, T. Weymouth, A. Ade, A. Bookvich, J. Gao, V. Mahavisno, Z. Wright, A. Chapman, M. Jayapandian, A. Ozgur, et al.
Michigan molecular interactions r2: from interacting proteins to pathways
Nucleic Acids Res., January 1, 2009; 37(suppl_1): D642 - D646.
[Abstract] [Full Text] [PDF]


Home page
Genome ResHome page
K. Xia, Z. Fu, L. Hou, and J.-D. J. Han
Impacts of protein-protein interaction domains on organism and network complexity
Genome Res., September 1, 2008; 18(9): 1500 - 1508.
[Abstract] [Full Text] [PDF]


Home page
J. Bacteriol.Home page
H. Takarada, M. Sekine, H. Kosugi, Y. Matsuo, T. Fujisawa, S. Omata, E. Kishi, A. Shimizu, N. Tsukatani, S. Tanikawa, et al.
Complete Genome Sequence of the Soil Actinomycete Kocuria rhizophila
J. Bacteriol., June 15, 2008; 190(12): 4139 - 4146.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
M. N. Wass and M. J. E. Sternberg
ConFunc--functional annotation in the twilight zone
Bioinformatics, March 15, 2008; 24(6): 798 - 806.
[Abstract] [Full Text] [PDF]


Home page
MicrobiologyHome page
B. Jerg and U. Gerischer
Relevance of nucleotides of the PcaU binding site from Acinetobacter baylyi
Microbiology, March 1, 2008; 154(3): 756 - 766.
[Abstract] [Full Text] [PDF]


Home page
J. Bacteriol.Home page
E. S. Rangarajan, A. Asinas, A. Proteau, C. Munger, J. Baardsnes, P. Iannuzzi, A. Matte, and M. Cygler
Structure of [NiFe] Hydrogenase Maturation Protein HypE from Escherichia coli and Its Interaction with HypF
J. Bacteriol., February 15, 2008; 190(4): 1447 - 1458.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
S. M.C. Robb, E. Ross, and A. S. Alvarado
SmedGD: the Schmidtea mediterranea genome database
Nucleic Acids Res., January 11, 2008; 36(suppl_1): D599 - D606.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
F. Birzele, J. E. Gewehr, and R. Zimmer
AutoPSI: a database for automatic structural classification of protein sequences and structures
Nucleic Acids Res., January 11, 2008; 36(suppl_1): D398 - D401.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
V. M. Markowitz, E. Szeto, K. Palaniappan, Y. Grechkin, K. Chu, I-M. A. Chen, I. Dubchak, I. Anderson, A. Lykidis, K. Mavromatis, et al.
The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions
Nucleic Acids Res., January 11, 2008; 36(suppl_1): D528 - D533.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
J. Park, B. Park, K. Jung, S. Jang, K. Yu, J. Choi, S. Kong, J. Park, S. Kim, H. Kim, et al.
CFGP: a web-based, comparative fungal genomics platform
Nucleic Acids Res., January 11, 2008; 36(suppl_1): D562 - D571.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
E. L. Hong, R. Balakrishnan, Q. Dong, K. R. Christie, J. Park, G. Binkley, M. C. Costanzo, S. S. Dwight, S. R. Engel, D. G. Fisk, et al.
Gene Ontology annotations at SGD: new data sources and annotation methods
Nucleic Acids Res., January 11, 2008; 36(suppl_1): D577 - D581.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
M. G. Conte, S. Gaillard, N. Lanau, M. Rouard, and C. Perin
GreenPhylDB: a database for plant comparative genomics
Nucleic Acids Res., January 11, 2008; 36(suppl_1): D991 - D998.
[Abstract] [Full Text] [PDF]


Home page
Plant CellHome page
G. H. Dean, H. Zheng, J. Tewari, J. Huang, D. S. Young, Y. T. Hwang, T. L. Western, N. C. Carpita, M. C. McCann, S. D. Mansfield, et al.
The Arabidopsis MUM2 Gene Encodes a {beta}-Galactosidase Required for the Production of Seed Coat Mucilage with Correct Hydration Properties
PLANT CELL, December 1, 2007; 19(12): 4007 - 4021.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
A. A. Rodriguez, T. Bompada, M. Syed, P. K. Shah, and N. Maltsev
Evolutionary analysis of enzymes using Chisel
Bioinformatics, November 15, 2007; 23(22): 2961 - 2968.
[Abstract] [Full Text] [PDF]


Home page
Brief Funct Genomic ProteomicHome page
G. Spudich, X. M. Fernandez-Suarez, and E. Birney
Genome browsing with Ensembl: a practical overview
Brief Funct Genomic Proteomic, October 29, 2007; (2007) elm025v1.
[Abstract] [Full Text] [PDF]


Home page
CirculationHome page
R. J.A. Frost and S. Engelhardt
A Secretion Trap Screen in Yeast Identifies Protease Inhibitor 16 as a Novel Antihypertrophic Protein Secreted From the Heart
Circulation, October 16, 2007; 116(16): 1768 - 1775.
[Abstract] [Full Text] [PDF]


Home page
BioinformaticsHome page
F. Beaussart, J. Weiner 3rd, and E. Bornberg-Bauer
Automated Improvement of Domain ANnotations using context analysis of domain arrangements (AIDAN)
Bioinformatics, July 15, 2007; 23(14): 1834 - 1836.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
F. Al-Shahrour, P. Minguez, J. Tarraga, I. Medina, E. Alloza, D. Montaner, and J. Dopazo
FatiGO +: a functional profiling tool for genomic data. Integration of functional annotation, regulatory motifs and interaction data with microarray experiments
Nucleic Acids Res., July 13, 2007; 35(suppl_2): W91 - W96.
[Abstract] [Full Text] [PDF]


Home page
Nucleic Acids ResHome page
M. A. Marti-Renom, U. Pieper, M. S. Madhusudhan, A. Rossi, N. Eswar, F. P. Davis, F. Al-Shahrour, J. Dopazo, and A. Sali
DBAli tools: mining the protein structure space
Nucleic Acids Res., July 13, 2007; 35(suppl_2): W393 - W397.


B. Lee, T. Hong, S. J. Byun, T. Woo, and Y. J. Choi
ESTpass: a web-based server for processing and annotating expressed sequence tag (EST) sequences
Nucleic Acids Res., July 13, 2007; 35(suppl_2): W159 - W162.


M. Pagni, V. Ioannidis, L. Cerutti, M. Zahn-Zabal, C. V. Jongeneel, J. Hau, O. Martin, D. Kuznetsov, and L. Falquet
MyHits: improvements to an interactive resource for analyzing protein sequences
Nucleic Acids Res., July 13, 2007; 35(suppl_2): W433 - W437.


K. A. Romer, G.-R. Kayombya, and E. Fraenkel
WebMOTIFS: automated discovery, filtering and scoring of DNA sequence motifs using multiple programs and Bayesian approaches
Nucleic Acids Res., July 13, 2007; 35(suppl_2): W217 - W220.


K. M. Hufford, P. Canaran, D. H. Ware, M. D. McMullen, and B. S. Gaut
Patterns of Selection and Tissue-Specific Expression among Maize Domestication and Crop Improvement Loci
Plant Physiology, July 1, 2007; 144(3): 1642 - 1653.


M. Sauermann, F. Hahne, C. Schmidt, M. Majety, H. Rosenfelder, S. Bechtel, W. Huber, A. Poustka, D. Arlt, and S. Wiemann
High-Throughput Flow Cytometry-Based Assay to Identify Apoptosis-Inducing Proteins
J Biomol Screen, June 1, 2007; 12(4): 510 - 520.


D. Sulakhe, M. D'Souza, M. Syed, A. Rodriguez, Y. Zhang, E. M. Glass, M. F. Romine, and N. Maltsev
GNARE--a grid-based server for the analysis of user submitted genomes
Nucleic Acids Res., May 25, 2007; (2007) gkm366v1.


B. Palenik, J. Grimwood, A. Aerts, P. Rouze, A. Salamov, N. Putnam, C. Dupont, R. Jorgensen, E. Derelle, S. Rombauts, et al.
The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation
PNAS, May 1, 2007; 104(18): 7705 - 7710.


S. Savas, I. W. Taylor, J. L. Wrana, and H. Ozcelik
Functional nonsynonymous single nucleotide polymorphisms from the TGF-β protein interaction network
Physiol Genomics, April 24, 2007; 29(2): 109 - 117.


N. Salamat-Miller, J. Fang, C. W. Seidel, Y. Assenov, M. Albrecht, and C. R. Middaugh
A Network-based Analysis of Polyanion-binding Proteins Utilizing Human Protein Arrays
J. Biol. Chem., April 6, 2007; 282(14): 10153 - 10163.


F.-C. Chen, S.-S. Wang, S.-M. Chaw, Y.-T. Huang, and T.-J. Chuang
Plant Gene and Alternatively Spliced Variant Annotator. A Plant Genome Annotation Pipeline for Rice Gene and Alternatively Spliced Variant Identification with Cross-Species Expressed Sequence Tag Conservation from Seven Plant Species
Plant Physiology, March 1, 2007; 143(3): 1086 - 1095.


N. J. Mulder, R. Apweiler, T. K. Attwood, A. Bairoch, A. Bateman, D. Binns, P. Bork, V. Buillard, L. Cerutti, R. Copley, et al.
New developments in the InterPro database
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D224 - D228.


E. R. Jefferson, T. P. Walsh, T. J. Roberts, and G. J. Barton
SNAPPI-DB: a database and API of Structures, iNterfaces and Alignments for Protein-Protein Interactions
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D580 - D589.


T. Kulikova, R. Akhtar, P. Aldebert, N. Althorpe, M. Andersson, A. Baldwin, K. Bates, S. Bhattacharyya, L. Bower, P. Browne, et al.
EMBL Nucleotide Sequence Database in 2006
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D16 - D20.


M. E. Higgins, M. Claremont, J. E. Major, C. Sander, and A. E. Lash
CancerGenes: a gene selection resource for cancer genome projects
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D721 - D726.


W. Zhang, Y. Zhang, H. Zheng, C. Zhang, W. Xiong, J. G. Olyarchuk, M. Walker, W. Xu, M. Zhao, S. Zhao, et al.
SynDB: a Synapse protein DataBase based on synapse ontology
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D737 - D741.


D. Wilson, M. Madera, C. Vogel, C. Chothia, and J. Gough
The SUPERFAMILY database in 2007: families and functions
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D308 - D313.


E. Portugaly, N. Linial, and M. Linial
EVEREST: a collection of evolutionary conserved protein domains
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D241 - D246.


R. Nash, S. Weng, B. Hitz, R. Balakrishnan, K. R. Christie, M. C. Costanzo, S. S. Dwight, S. R. Engel, D. G. Fisk, J. E. Hirschman, et al.
Expanded protein information at SGD: new pages and proteome browser
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D468 - D471.


C. M. Smith, J. H. Finger, T. F. Hayamizu, I. J. McCright, J. T. Eppig, J. A. Kadin, J. E. Richardson, and M. Ringwald
The mouse Gene Expression Database (GXD): 2007 update
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D618 - D623.


M. Jayapandian, A. Chapman, V. G. Tarcea, C. Yu, A. Elkiss, A. Ianni, B. Liu, A. Nandi, C. Santos, P. Andrews, et al.
Michigan Molecular Interactions (MiMI): putting the jigsaw puzzle together
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D566 - D571.


H. Mi, N. Guo, A. Kejariwal, and P. D. Thomas
PANTHER version 6: protein sequence and function evolution data with expanded representation of biological pathways
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D247 - D252.


A. Kutchma, N. Quayum, and J. Jensen
GeneSpeed: protein domain organization of the transcriptome
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D674 - D679.


L. E. Ulrich and I. B. Zhulin
MiST: a microbial signal transduction database
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D386 - D390.


S. Kerrien, Y. Alam-Faruque, B. Aranda, I. Bancarz, A. Bridge, C. Derow, E. Dimmer, M. Feuermann, A. Friedrichsen, R. Huntley, et al.
IntAct--open source resource for molecular interaction data
Nucleic Acids Res., January 12, 2007; 35(suppl_1): D561 - D565.


F. H. Lessner, B. J. Venters, and K. C. Keiler
Proteolytic Adaptor for Transfer-Messenger RNA-Tagged Proteins from α-Proteobacteria
J. Bacteriol., January 1, 2007; 189(1): 272 - 275.


F.-C. Chen, C.-J. Chen, W.-H. Li, and T.-J. Chuang
Human-specific insertions and deletions inferred from mammalian genome sequences
Genome Res., January 1, 2007; 17(1): 16 - 22.


N. Salamat-Miller, J. Fang, C. W. Seidel, A. M. Smalter, Y. Assenov, M. Albrecht, and C. R. Middaugh
A Network-based Analysis of Polyanion-binding Proteins Utilizing Yeast Protein Arrays
Mol. Cell. Proteomics, December 1, 2006; 5(12): 2263 - 2278.


C. Huttenhower, M. Hibbs, C. Myers, and O. G. Troyanskaya
A scalable method for integration and functional analysis of multiple microarray datasets
Bioinformatics, December 1, 2006; 22(23): 2890 - 2897.


D. L. Maeder, I. Anderson, T. S. Brettin, D. C. Bruce, P. Gilna, C. S. Han, A. Lapidus, W. W. Metcalf, E. Saunders, R. Tapia, et al.
The Methanosarcina barkeri Genome: Comparative Analysis with Methanosarcina acetivorans and Methanosarcina mazei Reveals Extensive Rearrangement within Methanosarcinal Genomes
J. Bacteriol., November 15, 2006; 188(22): 7922 - 7931.


R. A. George, J. Y. Liu, L. L. Feng, R. J. Bryson-Richardson, D. Fatkin, and M. A. Wouters
Analysis of protein sequence and interaction data for candidate disease gene prediction
Nucleic Acids Res., November 14, 2006; 34(19): e130 - e130.


A. E. Vinogradov
'Genome design' model and multicellular complexity: golden middle
Nucleic Acids Res., November 6, 2006; 34(20): 5906 - 5914.


N. Kaplan and M. Linial
ProtoBee: Hierarchical classification and annotation of the honey bee proteome
Genome Res., November 1, 2006; 16(11): 1431 - 1438.


J. Espadaler, E. Querol, F. X. Aviles, and B. Oliva
Identification of function-associated loop motifs and application to protein function prediction
Bioinformatics, September 15, 2006; 22(18): 2237 - 2243.


I. Friedberg
Automated protein function prediction--the genomic challenge
Brief Bioinform, September 1, 2006; 7(3): 225 - 242.


K. Xie, C. Wu, and L. Xiong
Genomic Organization, Differential Expression, and Interaction of SQUAMOSA Promoter-Binding-Like Transcription Factors and microRNA156 in Rice
Plant Physiology, September 1, 2006; 142(1): 280 - 293.


Y. Hasegawa, S. Fukuda, K. Shimokawa, S. Kondo, N. Maeda, and Y. Hayashizaki
A RecA-mediated exon profiling method
Nucleic Acids Res., August 8, 2006; 34(13): e97 - e97.


M.-N. Hung, E. Rangarajan, C. Munger, G. Nadeau, T. Sulea, and A. Matte
Crystal Structure of TDP-Fucosamine Acetyltransferase (WecD) from Escherichia coli, an Enzyme Required for Enterobacterial Common Antigen Synthesis.
J. Bacteriol., August 1, 2006; 188(15): 5606 - 5617.


J.-H. Hung, H.-D. Huang, and T.-Y. Lee
ProKware: integrated software for presenting protein structural properties in protein tertiary structures.
Nucleic Acids Res., July 1, 2006; 34(Web Server issue): W89 - W94.


C. Yamasaki, H. Kawashima, F. Todokoro, Y. Imamizu, M. Ogawa, M. Tanino, T. Itoh, T. Gojobori, and T. Imanishi
TACT: Transcriptome Auto-annotation Conducting Tool of H-InvDB.
Nucleic Acids Res., July 1, 2006; 34(Web Server issue): W345 - W349.


E. de Castro, C. J. A. Sigrist, A. Gattiker, V. Bulliard, P. S. Langendijk-Genevaux, E. Gasteiger, A. Bairoch, and N. Hulo
ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins.
Nucleic Acids Res., July 1, 2006; 34(Web Server issue): W362 - W365.

