Nucleic Acids Research Advance Access published online on November 27, 2007
Nucleic Acids Research, doi:10.1093/nar/gkm895
Database Issue |
The Universal Protein Resource (UniProt)
The EMBL Outstation, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, Protein Information Resource, Georgetown University Medical Center, 3300 Whitehaven St. NW, Suite 1200, Washington, DC 20007, USA and Swiss Institute of Bioinformatics, Centre Medical Universitaire 1 rue Michel Servet, 1211 Geneva 4, Switzerland
*To whom correspondence should be addressed. Tel: +44 1223 494435; Fax: +44 1223 494468; Email: apweiler{at}ebi.ac.uk
Received September 17, 2007. Accepted October 3, 2007.
| ABSTRACT |
|---|
The Universal Protein Resource (UniProt) provides a stable, comprehensive, freely accessible, central resource on protein sequences and functional annotation. The UniProt Consortium is a collaboration between the European Bioinformatics Institute (EBI), the Protein Information Resource (PIR) and the Swiss Institute of Bioinformatics (SIB). The core activities include manual curation of protein sequences assisted by computational analysis, sequence archiving, development of a user-friendly UniProt website, and the provision of additional value-added information through cross-references to other databases. UniProt comprises four major components, each optimized for different uses: the UniProt Knowledgebase, the UniProt Reference Clusters, the UniProt Archive and the UniProt Metagenomic and Environmental Sequences database. UniProt is updated and distributed every three weeks, and can be accessed online for searches or download at http://www.uniprot.org.
| INTRODUCTION |
|---|
The rapid and ongoing accumulation of predicted protein sequences from high-throughput genome sequencing of numerous and increasingly diverse organisms, the expansion of large-scale proteomics (e.g. gene expression profiling and protein–protein interactions) and the advent of structural genomics have combined to provide a wealth of data to analyze and use. There is a widely recognized need for a centralized repository of protein sequences with comprehensive coverage and a systematic approach to protein annotation, incorporating, integrating and standardizing data from these various sources.
UniProt is the central resource for storing and interconnecting information from large and disparate sources, and the most comprehensive catalog of protein sequence and functional annotation. It has four components optimized for different uses. The UniProt Knowledgebase (UniProtKB) is an expertly curated database, a central access point for integrated protein information with cross-references to multiple sources. The UniProt Archive (UniParc) is a comprehensive sequence repository, reflecting the history of all protein sequences (1). UniProt Reference Clusters (UniRef) merge closely related sequences based on sequence identity to speed up searches. The UniProt Metagenomic and Environmental Sequences (UniMES) database is a repository specifically developed for the newly expanding area of metagenomic and environmental data. UniProt is built upon the extensive bioinformatics infrastructure and scientific expertise at European Bioinformatics Institute (EBI), Protein Information Resource (PIR) and Swiss Institute of Bioinformatics (SIB). It is freely and easily accessible to researchers.
| CONTENT |
|---|
The UniProt Knowledgebase (UniProtKB)
UniProtKB consists of two sections, UniProtKB/Swiss-Prot and UniProtKB/TrEMBL. The former contains manually annotated high quality records with information extracted from literature and curator-evaluated computational analysis. Sequences for which novel functional, structural and/or biochemical data have been published are assigned priority. To achieve accuracy, annotations are performed by biologists with specific expertise. In UniProtKB, annotation consists of the description of the following: function(s), enzyme-specific information, biologically relevant domains and sites, post-translational modifications, subcellular location(s), tissue specificity, developmentally specific expression, structure, interactions, splice isoform(s), diseases associated with deficiencies or abnormalities, etc. Another important part of the annotation process involves the merging of different reports for a single protein. After a careful inspection of the sequences, the annotator selects the reference sequence, does the corresponding merging, and lists the splice and genetic variants along with disease information when available. Any discrepancies between the different sequence sources are also annotated. Cross-references are provided to the underlying nucleotide sequence sources as well as to many other useful databases including organism-specific, domain, family and disease databases. UniProtKB/TrEMBL contains computationally analyzed records enriched with automatic annotation and classification. The computer-assisted annotation is created using automatically generated rules as in Spearmint (2), or manually curated rules based on protein families, including HAMAP family rules (3), RuleBase rules (4) and PIRSF classification-based name rules and site rules (5,6). 
UniProtKB/TrEMBL contains the translations of all coding sequences (CDS) present in the EMBL/GenBank/DDBJ Nucleotide Sequence Databases, the sequences of PDB structures and data derived from amino acid sequences that are directly submitted to the UniProt Knowledgebase or scanned from the literature. We exclude some types of data such as pseudogenes, small nucleotide fragments, synthetic sequences, most non-germline immunoglobulins and T-cell receptors, most patent sequences, some highly over-represented data and open reading frames (ORFs) which have been wrongly predicted to code for proteins. Records are selected for full manual annotation and integration into UniProtKB/Swiss-Prot according to defined annotation priorities.
The UniProt Reference Clusters (UniRef)
UniRef provides clustered sets of all sequences from the UniProt Knowledgebase (including splice forms as separate entries) and selected UniProt Archive records to obtain complete coverage of sequence space at resolutions of 100%, 90% and 50% identity while hiding redundant sequences (7). The UniRef clusters provide a hierarchical set of sequence clusters where each individual member sequence can exist in only one UniRef cluster at each resolution and have only one parent or child cluster at another resolution. The UniRef100 database combines identical sequences and sub-fragments into a single UniRef entry. UniRef90 is built from UniRef100 clusters and UniRef50 is built from UniRef90 clusters. UniRef100, UniRef90 and UniRef50 yield a database size reduction of 10%, 40% and 70%, respectively. Each cluster record contains the source database, protein name and taxonomy information for each member sequence, but is represented by a single selected representative protein sequence and name; the number of members and the highest common taxonomy node for the membership are also included. UniRef100 is the most comprehensive and non-redundant protein sequence dataset available. The reduced size of the UniRef90 and UniRef50 datasets provides for faster sequence similarity searches and reduces the research bias in similarity searches by providing a more even sampling of sequence space. UniRef is currently being used for a broad range of applications in the areas of automated genome annotation, family classification, systems biology, structural genomics, phylogenetic analysis and mass spectrometry (7). The UniRef clusters are updated with every release of UniProtKB.
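The UniRef100 step, in which identical sequences and sub-fragments are folded into a single entry, can be illustrated with a toy sketch. This is not the UniRef production pipeline: the function name, the identifiers and the plain substring test are assumptions made for illustration only, and the real clustering is described in reference (7).

```python
def cluster_identical_and_fragments(sequences):
    """Toy UniRef100-style grouping (illustrative only).

    Identical sequences and sub-fragments are folded into the cluster of
    the longest sequence that contains them.
    """
    clusters = []  # each cluster: {"representative": str, "members": [ids]}
    # Process the longest sequences first so that fragments meet their parent.
    ordered = sorted(sequences.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for seq_id, seq in ordered:
        for cluster in clusters:
            if seq in cluster["representative"]:  # identical or sub-fragment
                cluster["members"].append(seq_id)
                break
        else:
            clusters.append({"representative": seq, "members": [seq_id]})
    return clusters


# Hypothetical identifiers and sequences.
toy = {"A1": "MKTAYIAKQR", "A2": "MKTAYIAKQR", "A3": "TAYIAK", "B1": "GGSGGS"}
clusters = cluster_identical_and_fragments(toy)
for cluster in clusters:
    print(cluster["representative"], cluster["members"])
```

Each member ends up in exactly one cluster, which mirrors the property stated above that a sequence can exist in only one UniRef cluster at a given resolution.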
The UniProt Archive (UniParc)
UniParc is the main sequence storehouse and is a comprehensive repository that reflects the history of all protein sequences (1). UniParc houses all new and revised protein sequences from various sources to ensure that complete coverage is available at a single site. It includes not only UniProtKB but also translations from the EMBL-Bank/DDBJ/GenBank Nucleotide Sequence Databases, the Ensembl database of eukaryotic genomes, the H-Invitational Database (H-Inv), the International Protein Index (IPI), the Protein Data Bank (PDB), Protein Research Foundation (PRF), NCBI's Reference Sequence Collection (RefSeq), model organism databases FlyBase, SGD, TAIR Arabidopsis thaliana and WormBase, TROME and protein sequences from the European, American and Japanese Patent Offices. To avoid redundancy, sequences are handled as strings—all sequences 100% identical over the entire length are merged, regardless of source organism. New and updated sequences are loaded on a daily basis, cross-referenced to the source database accession number, and provided with a sequence version that increments upon changes to the underlying sequence. The basic information stored within each UniParc entry is the identifier, the sequence, cyclic redundancy check number, source database(s) with accession and version numbers, and a time stamp. If a UniParc entry does not have a cross-reference to a UniProtKB entry, the reason for the exclusion of that sequence from UniProtKB is provided (e.g. pseudogene). In addition, each source database accession number is tagged with its status in that database, indicating if the sequence still exists or has been deleted in the source database and cross-references to NCBI GI and TaxId if appropriate. UniParc records are designed to be without annotation since the annotation will be only true in the real biological context of the sequence: proteins with the same sequence may have different functions depending on species, tissue, developmental stage, etc.
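The string-based merging and cross-reference versioning described above can be sketched as a small in-memory archive. This is a minimal sketch under stated assumptions, not UniParc's implementation: the class and field names, the `ARC` identifier scheme and the use of `zlib.crc32` as the checksum are all hypothetical choices for illustration.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class ArchiveEntry:
    identifier: str
    sequence: str
    checksum: int  # stand-in checksum; the real algorithm is not specified here
    # (database, accession) -> {"version": int, "active": bool}
    xrefs: dict = field(default_factory=dict)


class SequenceArchive:
    """Toy archive: one entry per distinct sequence string."""

    def __init__(self):
        self._by_sequence = {}
        self._current = {}  # (database, accession) -> sequence currently linked

    def load(self, database, accession, sequence):
        key = (database, accession)
        previous = self._current.get(key)
        if previous == sequence:
            return self._by_sequence[sequence]  # nothing changed
        version = 1
        if previous is not None:
            # The source record changed: retire the old link, bump the version.
            old = self._by_sequence[previous].xrefs[key]
            old["active"] = False
            version = old["version"] + 1
        entry = self._by_sequence.get(sequence)
        if entry is None:
            identifier = f"ARC{len(self._by_sequence) + 1:010d}"
            checksum = zlib.crc32(sequence.encode("ascii"))
            entry = ArchiveEntry(identifier, sequence, checksum)
            self._by_sequence[sequence] = entry
        entry.xrefs[key] = {"version": version, "active": True}
        self._current[key] = sequence
        return entry


archive = SequenceArchive()
first = archive.load("SourceA", "ACC_1", "MKTAYIAK")
second = archive.load("SourceB", "ACC_2", "MKTAYIAK")   # same string, same entry
revised = archive.load("SourceA", "ACC_1", "MKTAYIAKQ")  # sequence revised
```

Because entries are keyed on the sequence string alone, two sources that submit the same sequence share one entry regardless of organism, and no annotation is attached to the entry itself.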
The UniProt Metagenomic and Environmental Sequences database (UniMES)
The UniProt Knowledgebase contains entries with a known taxonomic source. A new development in sequence production, namely the availability of metagenomic data, has necessitated the creation of a separate database, the UniProt Metagenomic and Environmental Sequences database (UniMES). Metagenomics is the large-scale genomic analysis of microbes recovered from environmental samples as opposed to laboratory-grown organisms, which represent only a small proportion of the microbial world. UniMES currently contains data from the Global Ocean Sampling Expedition (GOS) (8), which were originally submitted to the International Nucleotide Sequence Databases (INSDC) (9). The initial GOS dataset is composed of 25 million DNA sequences, primarily from oceanic microbes, and predicts nearly 6 million proteins. By combining the predicted protein sequences with automatic classification by InterPro, the integrated resource for protein families, domains and functional sites, UniMES uniquely provides free access to the array of genomic information gathered from the sampling expeditions, enhanced by links to further analytical resources. The environmental sample data contained within this database are not present in the UniProt Knowledgebase or the UniProt Reference Clusters but are integrated into UniParc. UniMES is available on the ftp site in FASTA format, together with a file of UniMES matches to InterPro methods.
| NEW DEVELOPMENTS |
|---|
Recent format changes
- (i) Introduction of the new line type PE (Protein Existence)
Most protein sequences are derived from translations of gene predictions. Some of them exhibit strong sequence similarity to known proteins in closely related species. For other proteins, there is experimental evidence such as Edman sequencing, clear identification by mass spectrometry (MSI), X-ray or NMR structure, detection by antibodies, etc. To indicate these different levels of evidence for the existence of a protein, we have introduced the PE (Protein Existence) line to all UniProtKB entries. The criteria for assigning a particular PE level are described in the document pe_criteria.txt, available both by ftp and on the website. Note that the PE line does not describe the accuracy or correctness of a sequence displayed in UniProtKB but the evidence for the existence of a protein. The protein sequence may therefore not be entirely accurate, especially when it is derived from gene predictions from genomic sequences.
The PE line appears between the DR and KW lines of the UniProtKB entries and the format is:
PE Level: Evidence;
With the following values:
- 1:Evidence at protein level;
- 2:Evidence at transcript level;
- 3:Inferred from homology;
- 4:Predicted;
- 5:Uncertain;
- (ii) The format of the ID line was changed to better reflect the annotation status of an entry. The STANDARD and PRELIMINARY data classes were replaced by Reviewed (entries that have been manually reviewed and annotated by UniProtKB curators) and Unreviewed (computer-annotated entries that have not been reviewed by UniProtKB curators), respectively.
- (iii) The feature key INIT_MET is used to indicate that the initiator methionine has been cleaved off. Previously, the initiator methionine was not included in the sequence of a UniProtKB entry in such a case and the INIT_MET sequence coordinates were therefore 0. The initiator methionine has now been added back to such protein sequences and the sequence coordinates of the feature key INIT_MET accordingly changed to 1.
- (iv) A new CC line topic SEQUENCE CAUTION was introduced to specifically describe reported sequences that differ substantially from that which is displayed, and where the underlying cause of this difference cannot be clearly described in FT CONFLICT lines. Typical examples of such sequence discrepancies may include frameshifts, erroneous gene model predictions and the presence of contaminating vector sequence or sequence of unknown origin. This type of information was previously reported in the CC line topic CAUTION, together with other types of warnings that are unrelated to sequence differences between the submitted sequences contained in the entry.
- (v) Evidence tags in UniProtKB XML
In UniProtKB/TrEMBL, the evidence attribute and the evidence element are used to indicate the source of an annotation. This has now been extended to UniProtKB/Swiss-Prot. In the initial phase, automatic procedures are used to infer the evidence from the existing data (mainly the contents of the scope element). It will gradually also become part of the manual curation process. Retrofitting existing UniProtKB/Swiss-Prot entries with evidence information will be an ongoing process. The evidence is also visible on the UniProt website.
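The PE line format introduced in item (i) above lends itself to a simple parser. The following is a minimal sketch, assuming the line code `PE` is followed by whitespace and that the level and its description are separated by a colon with optional spacing; the function name and regular expression are illustrative and not part of any UniProt tool.

```python
import re

# The five levels listed in item (i).
PE_LEVELS = {
    1: "Evidence at protein level",
    2: "Evidence at transcript level",
    3: "Inferred from homology",
    4: "Predicted",
    5: "Uncertain",
}

# Assumed layout: "PE", whitespace, level digit, ":", description, ";".
PE_LINE = re.compile(r"^PE\s+([1-5]):\s*(.+?);\s*$")


def parse_pe_line(line):
    """Return (level, description) for a PE line, or raise ValueError."""
    match = PE_LINE.match(line)
    if match is None:
        raise ValueError(f"not a PE line: {line!r}")
    level = int(match.group(1))
    description = match.group(2)
    if PE_LEVELS[level] != description:
        raise ValueError(f"level {level} does not match {description!r}")
    return level, description


parsed = parse_pe_line("PE   1: Evidence at protein level;")
```

Checking the description against the level catches entries in which the two have drifted apart, which a parser that reads only the digit would miss.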
Forthcoming format change
The protein names contained in the description (DE) lines of reviewed UniProtKB entries are widely used by scientists to unambiguously identify a protein and provide a definitive source nomenclature for the annotation of homologous proteins in new genomic sequences. It is hence essential to provide description lines from which the recommended name(s) of a given protein, and all known synonyms, may be easily identified and extracted. To achieve this, the DE line format has been revised to clearly distinguish the recommended name of the protein as well as any commonly used synonyms, abbreviations, EC numbers for proteins with enzymatic activity, and to indicate obsolete or erroneous names.
This also allowed UniProt to undertake a major clean-up of the content of these lines and to implement strict protein-naming guidelines. Whenever possible, the official nomenclature defined by the appropriate expert international committees is used, although UniProt also attempts to create standard nomenclature for proteins where no such recommendations currently exist and to establish naming conventions that can be consistently applied across the widest possible spectrum of species. The guidelines are described in the document nameprot.txt, available both by ftp and on the website. In addition, the subset of these guidelines pertinent to microbial organisms has been ratified for use by the American Society for Microbiology, and discussions with the Joint Commission on Biochemical Nomenclature of IUPAC and IUBMB (JCBN) are underway for likely endorsement as an official protein nomenclature document.
Previous format:
DE GDP-fucose protein O-fucosyltransferase 1 precursor (EC 2.4.1.221)
DE (Peptide-O-fucosyltransferase) (O-fucosyltransferase 1) (O-FucT-1)
DE (Neurotic protein).
New format:
DE RecName: Full=GDP-fucose protein O-fucosyltransferase 1;
DE EC=2.4.1.221;
DE AltName: Full=Peptide-O-fucosyltransferase;
DE AltName: Full=O-fucosyltransferase 1;
DE Short=O-FucT-1;
DE AltName: Full=Neurotic protein;
DE Flags: Precursor;
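The new DE format shown above makes the recommended name and its synonyms straightforward to extract. The sketch below is one possible reading of the example, assuming that a `RecName:` or `AltName:` prefix opens a new name group and that lines without such a prefix (for instance `EC=` or `Short=`) belong to the most recent group; the function name and the returned structure are hypothetical.

```python
def parse_de_lines(lines):
    """Group new-format DE lines into name records and flags (illustrative)."""
    names, flags, current = [], [], None
    for raw in lines:
        if not raw.startswith("DE"):
            continue  # ignore other line types
        text = raw[2:].strip().rstrip(";")
        if text.startswith("Flags:"):
            flags += [f.strip() for f in text[len("Flags:"):].split(";") if f.strip()]
            continue
        category, sep, rest = text.partition(": ")
        if sep and category in ("RecName", "AltName"):
            current = {"category": category}  # start a new name group
            names.append(current)
            text = rest
        if current is None:
            raise ValueError(f"continuation line before any name: {raw!r}")
        key, _, value = text.partition("=")
        current.setdefault(key, []).append(value)
    return names, flags


example = [
    "DE RecName: Full=GDP-fucose protein O-fucosyltransferase 1;",
    "DE EC=2.4.1.221;",
    "DE AltName: Full=Peptide-O-fucosyltransferase;",
    "DE AltName: Full=O-fucosyltransferase 1;",
    "DE Short=O-FucT-1;",
    "DE AltName: Full=Neurotic protein;",
    "DE Flags: Precursor;",
]
names, flags = parse_de_lines(example)
recommended = names[0]
```

With the previous parenthesized format, telling an EC number from a synonym or an abbreviation required guessing from the text; here each value carries its own key.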
UniProtKB cross-references
The UniProt Knowledgebase cross-references 118 external databases, 92 with explicit links and 26 via implicit links. These resources provide additional or complementary information to what is available at UniProt and can be valuable for biological discovery. Further documentation is present in dbxref.txt, available both by ftp and on the website. Recently, a collaboration between the UniProt and National Center for Biotechnology Information (NCBI) teams was established to provide bi-directional cross-references between Entrez Gene/RefSeq and UniProtKB; this was achieved in our databases in September 2007.
ID Mapping enhancements
UniProt provides a mapping service to convert common gene IDs and protein IDs to UniProtKB AC/ID and vice versa. Mappings are either inherited from cross-references within UniProtKB entries, or make use of cross-references obtained from the iProClass database (10). This service is available at http://www.uniprot.org/search/idmapping.shtml, where users can map between UniProtKB and about 100 other data sources, such as NCBI (e.g. gi numbers, RefSeq accession numbers, Entrez Gene IDs, PubMed IDs), GO (www.geneontology.org/), PFAM (www.sanger.ac.uk/Software/Pfam/) and PIRSF (pir.georgetown.edu/pirsf.shtml). Users can enter a set of IDs or the name of an ID file to retrieve the mappings. An HTTP client is also provided for programmatic access. In addition, users can download selected mappings in the form of a tab-delimited table. To facilitate large-scale proteomic and gene expression data analysis, the ID mapping services will be further enhanced to: (i) provide downloadable files for mappings between UniProtKB AC/IDs and commonly used IDs (such as an NCBI gi number); (ii) provide downloadable mapping files for the most commonly used organisms (such as model organisms); (iii) provide a SOAP web service for programmatic access.
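A downloaded tab-delimited mapping table can be loaded into a two-way lookup with a few lines of code. The column layout assumed here (UniProtKB accession, ID type, external ID) is hypothetical, as are the accessions and IDs in the sample; the actual layout of the downloadable table should be checked before adapting this sketch.

```python
import csv
import io
from collections import defaultdict


def load_mapping(handle):
    """Read an assumed three-column mapping table into two lookups."""
    to_external = defaultdict(set)  # (accession, id_type) -> external IDs
    to_uniprot = defaultdict(set)   # (id_type, external_id) -> accessions
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        accession, id_type, external_id = row
        to_external[(accession, id_type)].add(external_id)
        to_uniprot[(id_type, external_id)].add(accession)
    return to_external, to_uniprot


# Made-up sample data standing in for a downloaded file.
sample = io.StringIO(
    "ACC0001\tGeneID\t1111\n"
    "ACC0001\tRefSeq\tXP_TEST1\n"
    "ACC0002\tGeneID\t1111\n"
)
to_external, to_uniprot = load_mapping(sample)
print(sorted(to_uniprot[("GeneID", "1111")]))
```

Sets are used on both sides because a mapping need not be one-to-one: several UniProtKB entries may share a gene ID, and one entry may carry several external IDs of the same type.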
Enhancement of bibliography information
The UniProtKB bibliography information provides protein entries with additional and up-to-date curated literature from several other sources including GeneRIF, SGD, MGI and more recently GAD (Genetic Association Database) (11). New sources of curated literature information continue to be added to the protein bibliography from model organism databases (MOD) such as ZFIN, RGD, WormBase, dictyBase and FlyBase. The bibliography information provides not only the source attribution of each reference, but also succinct information about the reference annotated by the source database. The bibliography information is available via the website.
UniProt website developments
The UniProt website (http://www.uniprot.org/) has been thoroughly redeveloped and the individual mirrors (http://www.ebi.uniprot.org, http://www.expasy.uniprot.org, http://www.pir.uniprot.org and parts of http://www.expasy.org) are no longer maintained. User feedback and the analysis of the use of our previous sites have led us to put more emphasis on supporting the most frequently used functionalities. Database searches with simple (and sometimes less simple) queries that often consist of only a few terms have been enhanced by a good scoring system and a suggestion mechanism. Searching with ontology terms is assisted by auto-completion, and there is also the possibility of using ontologies to browse search results. The viewing of database entries was improved with configurable views, a simplified terminology and a better integration of documentation. Medium-to-large sized result sets can now be retrieved directly on the site, so users no longer need to be referred to commercial, third-party services. We have also simplified access to the most common bioinformatics tools: sequence similarity searches, multiple sequence alignments, batch retrieval and a database identifier mapping tool can now be launched directly from any page, and the output of these tools can be combined, filtered and browsed like normal database searches. Programmatic access to all data and results is possible via simple HTTP (REST) requests (http://www.uniprot.org/help/technical). In addition to the existing formats that we support for our different data sets (e.g. plain text, FASTA and XML for UniProtKB), we now also provide (configurable) tab-delimited, RSS and GFF downloads where possible, and all data is available in RDF (http://www.w3.org/RDF/), a W3C standard for publishing data on the Semantic Web.
Annotation developments
Subcellular location annotation
To facilitate the standardization of the CC line (Comment) topic SUBCELLULAR LOCATION, the free text content of these lines was analyzed. It was decided to describe concepts for the location, the topology and the orientation with respect to a membrane with a controlled vocabulary. This controlled vocabulary is described in the new document subcell.txt, available both by ftp and on the website. For a given concept, the preferred term for the controlled vocabulary is provided with a precise definition, its synonyms and other relevant information. To allow optimal use of this controlled vocabulary, the format of the SUBCELLULAR LOCATION comment has been modified. The line starts with the controlled vocabulary term(s), optionally followed by a Note= containing additional relevant information. If there are multiple isoforms/peptides/chains for which location information is available, they are described in distinct SUBCELLULAR LOCATION comments. An integral part of developing this controlled vocabulary was a collaboration with the Gene Ontology Annotation Database (GOA) (12) to ensure mapping of our terminologies. This collaboration is ongoing to ensure synchronization and the further development of the controlled vocabularies. As of September 2007, subcell.txt contains 314 location, 10 topology and 8 orientation terms, and there are 158596 SUBCELLULAR LOCATION comments in UniProtKB/Swiss-Prot with 3329 specific annotations for isoforms/peptides/chains.
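The revised comment layout, controlled vocabulary terms first and an optional Note= afterwards, can be split mechanically. The sketch below assumes details that the text does not spell out: that each location statement ends with a full stop and that terms within a statement are separated by semicolons. The function name and the sample comment are illustrative only.

```python
def parse_subcellular_location(comment):
    """Split a SUBCELLULAR LOCATION comment body into terms and a note.

    Assumed layout: "Term; term. Term. Note=free text."
    """
    body, _, note = comment.partition("Note=")
    locations = []
    for chunk in body.split("."):
        terms = [term.strip() for term in chunk.split(";") if term.strip()]
        if terms:
            locations.append(terms)
    note = note.strip().rstrip(".") or None
    return locations, note


locations, note = parse_subcellular_location(
    "Cell membrane; Single-pass type I membrane protein. Nucleus. "
    "Note=Shuttles upon stimulation."
)
```

Once the terms are isolated they can be checked against the preferred terms and synonyms listed in subcell.txt, which is the practical benefit of moving away from free text.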
Virus annotation
For the last three years, a special effort has been ongoing in UniProtKB/Swiss-Prot to annotate viruses in the framework of the new virus annotation program. Viruses are highly specialized organisms that often display unusual molecular functions. Their diversity is enormous with more than 73 families, each having a different replication cycle. Because of this, it became an annotation priority to integrate specific viral information into UniProtKB/Swiss-Prot. The initial focus is on important human pathogens such as HIV, Influenza, Hepatitis C, Rabies, SARS, Ebola, Dengue and Yellow fever viruses.
For each family, the taxonomy is standardized and updated according to recent publications and the International Committee for Taxonomy of Viruses (ICTV) guidelines. There is a huge number of sequences available for some viruses such as HIV and Hepatitis C (HIV is the organism with the most entries in the EMBL-Bank/DDBJ/GenBank nucleotide sequence databases). For these viruses, representative strains of each subtype or genotype are chosen for annotation in order to cover the whole diversity. Annotation for a well-studied strain (often called the type species) can then be propagated as appropriate to each of the related strains or isolates. Since all viral proteins are synthesized within an infected host, a new line was introduced in viral entries to display the host organism(s). Virus–host protein interactions are critical for virus replication. These interactions are now annotated in the virus protein entries as well as in the relevant host entries, and a specific keyword (Host–virus interaction) has been created to further facilitate access to this information.
Annotation propagation using PIRSF-based manually curated rules
PIRSF classification-based manually curated rules are used for annotation propagation (5,6). Specifically, family-specific site rules are created for annotating structural features such as active sites, binding sites, modified residues and other functionally important residues. This information is then propagated to the rest of the members of a given PIRSF based on a structure-guided multiple alignment and profile HMM created for each PIRSF. Only residues that satisfy the rule criteria are ultimately propagated to entries within the PIRSF that lack an experimentally derived structure. For curation of ligand-binding sites, a systematic ligand-centric approach is being followed. This involves selecting a biologically relevant ligand and then mapping all the structures bound to this ligand from the PDB on to PIRSFs. The liganded structure within each PIRSF serves as a template for curation of the binding sites. For example, all available structures bound to the ligand S-adenosyl-L-methionine are mapped to about 90 PIRSFs. Site rules have been created for each of these families and the information is being integrated into UniProtKB records. This systematic approach will enable proper naming and error-free propagation of these sites. This will eventually cover the ligand space and will enable the identification of conserved motifs and patterns.
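The propagation step, in which a site is transferred from a template to a family member only if the aligned residue satisfies the rule, can be sketched with a pairwise gapped alignment. This is a simplification made for illustration: the actual rules rely on a structure-guided multiple alignment and a profile HMM per PIRSF, and the function name, arguments and sequences here are hypothetical.

```python
def propagate_site(template_aln, target_aln, template_pos, allowed):
    """Map a 1-based template site onto the target through an alignment.

    Returns the 1-based target position if the aligned target residue is in
    `allowed`, otherwise None (gap in the target, or rule not satisfied).
    """
    if len(template_aln) != len(target_aln):
        raise ValueError("aligned sequences must have equal length")
    template_count = 0
    target_count = 0
    for template_res, target_res in zip(template_aln, target_aln):
        if template_res != "-":
            template_count += 1
        if target_res != "-":
            target_count += 1
        if template_res != "-" and template_count == template_pos:
            if target_res != "-" and target_res in allowed:
                return target_count
            return None
    return None  # template_pos lies beyond the template sequence


# Toy alignment: template "MKHDLS" against target "MHADLT".
template_aln = "MKH-DLS"
target_aln = "M-HADLT"
his_site = propagate_site(template_aln, target_aln, 3, {"H"})
ser_site = propagate_site(template_aln, target_aln, 6, {"S"})
```

In the toy alignment the histidine at template position 3 maps to target position 2 and is propagated, whereas the serine at template position 6 aligns with a threonine, so nothing is propagated for that site.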
| DATABASE ACCESS AND FEEDBACK |
|---|
UniProt is freely available for both commercial and non-commercial use. Please see http://www.uniprot.org/terms for details. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). New releases are published every three weeks except for UniMES, which is updated only when the underlying source data are updated. Statistics are available with each release at www.uniprot.org. We are constantly trying to improve our database in terms of accuracy and representation and hence consider your feedback extremely valuable. Please contact us if you have any questions (http://www.uniprot.org/support/helpdesk.shtml) or comments (http://www.uniprot.org/support/feedback.shtml). You can also subscribe to e-mail alerts (http://www.uniprot.org/support/alerts.shtml) for the latest information on UniProt databases.
| APPENDIX |
|---|
UniProt has been prepared by:
Amos Bairoch, Lydie Bougueleret, Severine Altairac, Valeria Amendolia, Andrea Auchincloss, Ghislaine Argoud Puy, Kristian Axelsen, Delphine Baratin, Marie-Claude Blatter, Brigitte Boeckmann, Laurent Bollondi, Emmanuel Boutet, Silvia Braconi Quintaje, Lionel Breuza, Alan Bridge, Virginie Bulliard-Le Saux, Edouard deCastro, Luciane Ciampina, Danielle Coral, Elisabeth Coudert, Isabelle Cusin, Fabrice David, Gwennaelle Delbard, Dolnide Dornevil, Paula Duek-Roggli, Severine Duvaud, Anne Estreicher, Livia Famiglietti, Nathalie Farriol-Mathis, Serenella Ferro, Marc Feuermann, Elisabeth Gasteiger, Alain Gateau, Sebastian Gehant, Vivienne Gerritsen, Arnaud Gos, Nadine Gruaz-Gumowski, Ursula Hinz, Chantal Hulo, Nicolas Hulo, Alessandro Innocenti, Janet James, Eric Jain, Silvia Jimenez, Florence Jungo, Vivien Junker, Guillaume Keller, Corinne Lachaize, Lydie Lane-Guermonprez, Petra Langendijk-Genevaux, Vicente Lara, Philippe Le Mercier, Damien Lieberherr, Tania de Oliveira Lima, Veronique Mangold, Xavier Martin, Karine Michoud, Madelaine Moinat, Anne Morgat, Marisa Nicolas, Salvo Paesano, Ivo Pedruzzi, David Perret, Isabelle Phan, Sandrine Pilbout, Violaine Pillet, Sylvain Poux, Monica Pozzato, Nicole Redaschi, Sorogini Reynaud, Catherine Rivoire, Bernd Roechert, Claudia Sapsezian, Michel Schneider, Christian Sigrist, Karin Sonesson, Sylvie Staehli, Andre Stutz, Shyamala Sundaram, Michael Tognolli, Laure Verbregue, Anne-Lise Veuthey, Claudia Vitorello, Lina Yip and Luiz Fernando Zuletta at the Swiss Institute of Bioinformatics (SIB) and the Medical Biochemistry Department of the University of Geneva.
Rolf Apweiler, Yasmin Alam-Faruque, Daniel Barrell, Lawrence Bower, Paul Browne, Wei Mun Chan, Louise Daugherty, Emilio Salazar Donate, Ruth Eberhardt, Alexander Fedotov, Rebecca Foulger, Gabriella Frigerio, John Garavelli, Renato Golin, Alan Horne, Julius Jacobsen, Michael Kleen, Paul Kersey, Kati Laiho, Duncan Legge, Michele Magrane, Maria Jesus Martin, Patricia Monteiro, Claire O'Donovan, Sandra Orchard, John O'Rourke, Samuel Patient, Manuela Pruess, Andrey Sitnov, Eleanor Whitfield, Daniela Wieser, Quan Lin, Mark Rynbeek, Giuseppe di Martino, Mike Donnelly and Pieter van Rensburg at the European Bioinformatics Institute (EBI).
Cathy Wu, Cecilia Arighi, Leslie Arminski, Winona Barker, Yongxing Chen, Daniel Crooks, Zhang-Zhi Hu, Hsing-Kuo Hua, Hongzhan Huang, Robel Kahsay, Raja Mazumder, Peter McGarvey, Darren Natale, Anastasia N. Nikolskaya, Natalia Petrova, Baris Suzek, Sona Vasudevan, C. R. Vinayaka, Lai Su Yeh, and Jian Zhang at the Protein Information Resource (PIR).
| ACKNOWLEDGEMENTS |
|---|
UniProt is mainly supported by the National Institutes of Health (NIH) grant 2 U01 HG02712-05. Additional support for the EBI's involvement in UniProt comes from the European Commission contract FELICS (021902RII3) and from the NIH grant 2 P41 HG002273-07. UniProtKB/Swiss-Prot activities at the SIB are supported by the Swiss Federal Government through the Federal Office of Education and Science, by the European Commission contract FELICS (021902RII3) and by the NIH/NIAID (HHSN 2662040035C ADB contract number N01-AI-40035). PIR activities are also supported by the NIH grants and contracts HHSN266200400061C, NCI-caBIG and 1R01GM080646-01, and the National Science Foundation (NSF) grant IIS-0430743. Funding to pay the Open Access publication charges for this article was provided by the National Institutes of Health (NIH) (grant 2 U01 HG02712-05).
Conflict of interest statement. None declared.
| REFERENCES |
|---|
1. Leinonen R, Diez FG, Binns D, Fleischmann W, Lopez R, Apweiler R. UniProt archive. Bioinformatics (2004) 20:3236–3237.
2. Wieser D, Kretschmann E, Apweiler R. Filtering erroneous protein annotation. Bioinformatics (2004) 20:i342–i347.
3. Gattiker A, Michoud K, Rivoire C, Auchincloss AH, Coudert E, Lima T, Kersey P, Pagni M, Sigrist CJ, et al. Automated annotation of microbial proteomes in SWISS-PROT. Comput. Biol. Chem. (2004) 27:49–58.
4. Fleischmann W, Moller S, Gateau A, Apweiler R. A novel method for automatic functional annotation of proteins. Bioinformatics (1999) 15:228–233.
5. Wu CH, Nikolskaya A, Huang H, Yeh LS, Natale DA, Vinayaka CR, Hu ZZ, Mazumder R, Kumar S, et al. PIRSF: family classification system at the Protein Information Resource. Nucleic Acids Res. (2004) 32:D112–D114.
6. Natale DA, Vinayaka CR, Wu CH. Large-scale, classification-driven, rule-based functional annotation of proteins. In: Subramaniam S, ed. Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics (2005) West Sussex, England: John Wiley & Sons, Ltd. Vol. 7, pp. 2993–3004.
7. Suzek BE, Huang H, McGarvey P, Mazumder R, Wu CH. UniRef: comprehensive and non-redundant UniProt reference clusters. Bioinformatics (2007) 23:1282–1288.
8. Yooseph S, Sutton G, Rusch DB, Halpern AL, Williamson SJ, Remington K, Eisen JA, Heidelberg KB, Manning G, et al. The Sorcerer II Global Ocean Sampling Expedition: expanding the universe of protein families. PLoS Biol. (2007) 5:e16.
9. Brunak S, Danchin A, Hattori M, Nakamura H, Shinozaki K, Matise T, Preuss D. Nucleotide Sequence Database Policies. Science (2002) 298:1333.
10. Wu CH, Huang H, Nikolskaya A, Hu Z, Barker WC. The iProClass integrated database for protein functional analysis. Comput. Biol. Chem. (2004) 28:87–96.
11. Becker KG, Barnes KC, Bright TJ, Wang SA. The genetic association database. Nat. Genet. (2004) 36:431–432.
12. Camon E, Magrane M, Barrell D, Lee V, Dimmer E, Maslen J, Binns D, Harte N, Lopez R, et al. The Gene Ontology Annotation (GOA) Database: sharing knowledge in UniProt with Gene Ontology. Nucleic Acids Res. (2004) 32:D262–D266.