Nucleic Acids Research, 2005, Vol. 33, Database issue D154-D159
© 2005, the authors
Nucleic Acids Research, Vol. 33, Database issue © Oxford University Press 2005; all rights reserved
The Universal Protein Resource (UniProt)
Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211 Geneva 4, Switzerland, 1 The EMBL Outstation, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, 2 Department of Biochemistry and Molecular Biology and 3 National Biomedical Research Foundation, Georgetown University Medical Center, 3900 Reservoir Road, NW, Box 571414, Washington, DC 20057-1414, USA
* To whom correspondence should be addressed: Tel: +44 0 1223 494435; Fax: +44 0 1223 494468; Email: apweiler{at}ebi.ac.uk
Received September 14, 2004; Revised and Accepted October 5, 2004
| ABSTRACT |
|---|
The Universal Protein Resource (UniProt) provides the scientific community with a single, centralized, authoritative resource for protein sequences and functional information. Formed by uniting the Swiss-Prot, TrEMBL and PIR protein database activities, the UniProt consortium produces three layers of protein sequence databases: the UniProt Archive (UniParc), the UniProt Knowledgebase (UniProt) and the UniProt Reference (UniRef) databases. The UniProt Knowledgebase is a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase with extensive cross-references. This centrepiece consists of two sections: UniProt/Swiss-Prot, with fully, manually curated entries; and UniProt/TrEMBL, enriched with automated classification and annotation. During 2004, tens of thousands of Knowledgebase records were manually annotated or updated; we introduced a new comment line topic, TOXIC DOSE, to store information on the acute toxicity of a toxin; the UniProt keyword list was augmented by additional keywords; we improved the documentation of the keywords and are continuously overhauling and standardizing the annotation of post-translational modifications. Furthermore, we introduced a new documentation file of the strains and their synonyms. Many new database cross-references were introduced and we started to make use of Digital Object Identifiers. In collaboration with the Macromolecular Structure Database group at EBI, we also improved the integration with structural databases by residue-level mapping of sequences from the Protein Data Bank entries onto corresponding UniProt entries. For convenient sequence searches we provide the UniRef non-redundant sequence databases. The comprehensive UniParc database stores the complete body of publicly available protein sequence data. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). New releases are published every two weeks.
| INTRODUCTION |
|---|
Previously, Swiss-Prot + TrEMBL (1) and PIR-PSD (2) coexisted as protein databases with differing sequence coverage and annotation priorities. In 2002, the Swiss-Prot + TrEMBL groups at the SIB (Swiss Institute of Bioinformatics) and EBI (European Bioinformatics Institute) and the PIR (Protein Information Resource) group at Georgetown University Medical Center and National Biomedical Research Foundation joined forces as the UniProt consortium (3).
The UniProt consortium maintains three database layers:
- The UniProt Archive (UniParc) provides a stable, comprehensive, non-redundant sequence collection by storing the complete body of publicly available protein sequence data.
- The UniProt Knowledgebase (UniProt) provides the central database of protein sequences with accurate, consistent and rich sequence and functional annotation.
- The UniProt Reference (UniRef) databases provide non-redundant data collections based on the UniProt Knowledgebase and UniParc in order to obtain complete coverage of sequence space at several resolutions.
| THE UniProt ARCHIVE |
|---|
Although most protein sequence data are derived from the translation of DDBJ/EMBL/GenBank (4) sequences, primary protein sequence data are also submitted directly to UniProt or appear in patent applications or in entries from the Protein Data Bank (PDB) (5). UniParc (6) is designed to capture all available protein sequence data, not just from the aforementioned databases, but also from sources such as Ensembl (7), the International Protein Index (IPI) (8), RefSeq (9), FlyBase (10) and WormBase (11). This combination of sources makes UniParc the most comprehensive publicly accessible, non-redundant protein sequence database available.
UniParc represents each protein sequence once and only once, assigning it a unique UniParc identifier. The UniParc release 2.6 from September 2004 contained 4 375 775 unique sequences from 11 978 094 original source records. UniParc cross-references the accession numbers of the source databases, using flags to indicate the status of the entry in the original source database, with active indicating that the entry is still present in the source database and obsolete indicating that the entry no longer exists in the source database. A UniParc sequence version is incremented each time the underlying sequence changes, making it possible to observe sequence changes in all source databases. A sample UniParc report can be found at http://www.uniprot.org/entry/UPI0000000C37. UniParc records carry no annotation, but this information can be found in the UniProt Knowledgebase or other underlying databases.
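The bookkeeping described above can be illustrated with a small sketch. It is a toy model, not the UniParc implementation: the class names, the identifier scheme (a counter rendered as ten hexadecimal digits after `UPI`) and the rule that a changed source record obtains a new cross-reference with an incremented version are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CrossReference:
    database: str        # source database name (hypothetical labels)
    accession: str       # accession number in the source database
    version: int = 1     # incremented when the source sequence changes
    active: bool = True  # False stands for the "obsolete" flag


@dataclass
class ArchiveEntry:
    upi: str
    sequence: str
    xrefs: list = field(default_factory=list)


class Archive:
    """Toy archive that stores every distinct sequence exactly once."""

    def __init__(self):
        self.entries = {}  # sequence -> ArchiveEntry
        self.current = {}  # (database, accession) -> (CrossReference, sequence)

    def load(self, database, accession, sequence):
        # Reuse the entry if this exact sequence is already known.
        entry = self.entries.get(sequence)
        if entry is None:
            upi = f"UPI{len(self.entries) + 1:010X}"  # assumed identifier scheme
            entry = ArchiveEntry(upi, sequence)
            self.entries[sequence] = entry

        key = (database, accession)
        previous = self.current.get(key)
        if previous is not None:
            old_xref, old_sequence = previous
            if old_sequence == sequence:
                return entry  # source record unchanged, nothing to do
            old_xref.active = False  # old link becomes obsolete
            version = old_xref.version + 1
        else:
            version = 1

        xref = CrossReference(database, accession, version)
        entry.xrefs.append(xref)
        self.current[key] = (xref, sequence)
        return entry
```

Loading the same sequence from two source databases yields one entry with two cross-references; reloading a source record with a changed sequence marks the old cross-reference obsolete and attaches a version 2 cross-reference to the entry for the new sequence.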
| THE UniProt KNOWLEDGEBASE |
|---|
The UniProt Knowledgebase merges Swiss-Prot, TrEMBL and PIR-PSD to provide a central database of protein sequences with annotations and functional information. All suitable PIR-PSD sequences missing from Swiss-Prot + TrEMBL were incorporated into UniProt and bi-directional cross-references were created to allow the easy tracking of PIR-PSD entries. The transfer into UniProt of references and experimentally verified data present in PIR but missing from Swiss-Prot + TrEMBL is ongoing.
The UniProt Knowledgebase has two parts: a section of fully, manually annotated records resulting from literature information extraction and curator-evaluated computational analysis, and a section with computationally analysed records awaiting full manual annotation. The two sections are referred to as UniProt/Swiss-Prot (158 337 records in UniProt release 2.6 from September 2004) and UniProt/TrEMBL (1 400 776 records in UniProt release 2.6 from September 2004), respectively. An example UniProt report can be found at http://www.uniprot.org/entry/P57727.
In the following paragraphs, we will explain the main principles of the UniProt Knowledgebase and enhancements introduced recently.
High-quality annotation
In addition to capturing the core data mandatory to each UniProt entry (consisting principally of the amino acid sequence, the protein name or description, taxonomic data and citation information), we attach other annotation information both manually and automatically.
Manual annotation is performed by biologists and is based on literature curation and sequence analysis. The annotation principles were described in detail previously (3,12). During 2004, tens of thousands of records were manually annotated or updated. We have also introduced a new comment (CC) line topic: TOXIC DOSE. This topic is used to store information on the poisoning potential (acute toxicity) of a toxin. Generally this topic holds information on the LD50 and PD50. LD stands for lethal dose: LD50 is the amount of a toxin, given all at once, that causes the death of 50% (one-half) of a group of test animals. PD stands for paralytic dose: PD50 is the amount of a toxin that causes the paralysis of 50% of a group of test animals.
Examples:
- CC -!- TOXIC DOSE: PD50 is 1.72 mg/kg by injection in blowfly larvae.
- CC -!- TOXIC DOSE: LD50 is 0.015 mg/kg by intravenous injection for sarafotoxin-A and sarafotoxin-B, and 0.3 mg/kg for sarafotoxin-C.
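Comment lines of this kind are easy to pick apart programmatically. The following sketch assumes only the layout visible in the examples above (a `CC -!- TOXIC DOSE:` prefix, a leading `LD50` or `PD50`, and doses expressed in mg/kg); the function name and the returned dictionary keys are hypothetical, and real entries may use other units or wrap over several lines.

```python
import re

# Prefix of a TOXIC DOSE comment line, as seen in the examples above.
TOXIC_DOSE = re.compile(r"^CC\s+-!-\s+TOXIC DOSE:\s*(?P<text>.+)$")
# A number followed by the unit mg/kg (the only unit in the examples).
DOSE_VALUE = re.compile(r"(?P<value>\d+(?:\.\d+)?)\s*mg/kg")


def parse_toxic_dose(line):
    """Return the dose kind and mg/kg values of a TOXIC DOSE line, or None."""
    match = TOXIC_DOSE.match(line.strip())
    if match is None:
        return None
    text = match.group("text")
    kind = re.match(r"(LD50|PD50)\b", text)
    doses = [float(m.group("value")) for m in DOSE_VALUE.finditer(text)]
    return {
        "kind": kind.group(1) if kind else None,
        "doses_mg_per_kg": doses,
        "text": text,
    }
```

Applied to the two examples, the sketch reports a PD50 with one value (1.72) and an LD50 with two values (0.015 and 0.3); lines that are not TOXIC DOSE comments yield `None`.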
Automatic classification and annotation
Much progress was made during 2004 in our attempt to provide automatic large-scale functional characterization and annotation, which is generated with limited human interaction.
InterPro classification
We use InterPro (13) to recognize domains and to classify all the protein sequences in UniProt into families and superfamilies. InterPro is an integrated resource of protein families, domains and sites that amalgamates the efforts of the member databases: Pfam (14), PROSITE (15), PRINTS (16), ProDom (17), SMART (18), PIRSF (19), Superfamily (20) and TIGRFAMs (21). Approximately 80% of all UniProt Knowledgebase records are classified according to their InterPro domains and families.
Automatic functional annotation of UniProt/TrEMBL
For automatic annotation, systems for standardized transfer of annotation from well-characterized proteins in the UniProt/Swiss-Prot to non-annotated UniProt/TrEMBL entries have been implemented. RuleBase (22) uses a semi-automatic approach, while the Spearmint approach is completely automated and is based on decision trees (23). InterPro is then used to assign UniProt entries into groups. The annotation shared by the functionally characterized UniProt/Swiss-Prot proteins of a group is then extracted and assigned to the non-annotated UniProt/TrEMBL entries of this group. These systems have been used to improve the annotation in 32% (RuleBase) and 55% (Spearmint) of UniProt/TrEMBL entries.
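The group-based transfer step can be sketched as follows. This is a deliberately simplified illustration and not the RuleBase or Spearmint method: it assumes that a group is defined by an identical set of InterPro identifiers and that only annotation shared by every curated member of the group is transferred. The function name and input layout are hypothetical.

```python
from collections import defaultdict


def transfer_annotation(curated, unannotated):
    """Propose annotation for unannotated entries from curated group members.

    curated:     dict accession -> (set of InterPro ids, set of annotations)
    unannotated: dict accession -> set of InterPro ids
    Returns dict accession -> set of annotations shared by the whole group.
    """
    # Group the curated entries by their InterPro signature set (assumption).
    groups = defaultdict(list)
    for interpro_ids, annotations in curated.values():
        groups[frozenset(interpro_ids)].append(set(annotations))

    proposals = {}
    for accession, interpro_ids in unannotated.items():
        members = groups.get(frozenset(interpro_ids))
        # Only annotation common to all curated members is transferred.
        proposals[accession] = set.intersection(*members) if members else set()
    return proposals
```

Restricting the transfer to the intersection is a conservative choice: an annotation carried by only some curated members of a group is not propagated.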
However, a part of the automatically added data will be erroneous, as are parts of the information coming from other sources. Therefore, we introduced a post-processing system called Xanthippe, which is based on a simple exclusion mechanism and a decision tree approach using the C4.5 data-mining algorithm. Xanthippe detects and flags a large part of the annotation errors and considerably increases the reliability of both automatically generated data and pre-existing annotation inherited from the underlying nucleotide sequence source data (24).
The PIRSF classification serves as the basis for a rule-based approach to automatically provide standardized and rich functional annotation for position-specific sequence features, protein names, Enzyme Commission (EC) name and number, keywords and Gene Ontology (GO) terms (25). Position-specific site rules are developed for annotating active site residues, binding site residues, modified residues or other functionally important amino acid residues. To exploit known structure information, site rules are defined starting with PIRSF families that contain at least one known three-dimensional (3D) structure with experimentally verified site information. The rules are defined using appropriate syntax and controlled vocabulary for site description and evidence attribution. As shown in Table 1, each rule consists of the rule ID, template sequence (a representative sequence with known 3D structure), rule condition, feature for propagation (denoting site feature to be propagated) and reference. The rules are family-specific and there may be more than one site rule per family. Site rule curation involves manually editing a multiple sequence alignment of representative family members (including the template PDB entry), visualizing site residues in the 3D structure, and building hidden Markov models for the conserved regions containing the functional site residues (referred to as site HMMs). The HMM thus built allows one to map functionally important residues from the template structure to other members of the PIRSF family that do not have a solved structure.
For site feature propagation, the entire rule condition is examined by PIRSF membership checking, site HMM matching and site residue matching. To avoid false positives, site features are only propagated automatically if all site residues match perfectly in the conserved region by aligning both the template and query sequences to the profile HMM using HmmAlign. Potential functional sites missing one or more residues or containing conservative substitutions are only annotated after expert review with evidence attribution. For accurate site propagation, it is sometimes necessary to match more residues in the rule condition than those to be propagated. For example, a total of eight catalytic and binding residues in sulfite reductase need to be matched in order to correctly propagate the sirohaem-ion binding Cys residue (PIRSR000259-3, Table 1).
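The residue-matching step of site propagation can be sketched in a few lines. The sketch assumes that the template and the query have already been aligned to the same site HMM (for example with HmmAlign) and are supplied as two gapped strings of equal length; the function name, the input layout and the toy sequences are hypothetical, and PIRSF membership checking and HMM matching are outside its scope.

```python
def propagate_sites(template_aln, query_aln, rule_sites):
    """Map rule sites from a template onto a query, or return None.

    template_aln, query_aln: aligned sequences of equal length, '-' for gaps.
    rule_sites: dict of template position (1-based, ungapped) -> residue.
    Returns dict of query position (1-based, ungapped) -> residue, but only
    if every site residue in the rule matches perfectly in the query.
    """
    if len(template_aln) != len(query_aln):
        raise ValueError("aligned sequences must have the same length")

    mapped = {}
    t_pos = q_pos = 0
    for t_res, q_res in zip(template_aln, query_aln):
        if t_res != "-":
            t_pos += 1
        if q_res != "-":
            q_pos += 1
        if t_res != "-" and t_pos in rule_sites:
            expected = rule_sites[t_pos]
            if t_res != expected:
                raise ValueError("rule does not agree with its template")
            if q_res != expected:
                return None  # gap or substitution: leave for expert review
            mapped[q_pos] = q_res

    # Every site in the rule must have been seen and matched.
    return mapped if len(mapped) == len(rule_sites) else None
```

As described above, the rule condition may list more residues than are finally propagated; a caller could pass the full condition to this check and then keep only the positions of the residues to be annotated.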
The highly reliable automatic annotation has already been incorporated into the UniProt/TrEMBL flat files, while additional automatic annotation is available from the extended UniProt view at http://www.ebi.uniprot.org/.
The HAMAP project, or High-quality Automated and Manual Annotation of microbial Proteomes, aims to integrate manual and automatic annotation methods in order to enhance the speed of the curation process while preserving the quality of the database annotation (26). Automatic annotation is only applied to entries that belong to manually defined orthologous families and to entries with no identifiable similarities (ORFans). Many checks are enforced in order to prevent the propagation of wrong annotation and to spot problematic cases, which are channelled to manual curation. The results of this annotation are integrated in UniProt/Swiss-Prot.
Standardized nomenclature and controlled vocabularies
Whenever available, we make use of the official nomenclature defined by international committees while still providing the published synonyms. For various other UniProt items we use controlled vocabularies, e.g. for tissues, plasmids and keywords, which are listed in UniProt documents. The UniProt keyword list was augmented by additional keywords. We improved the documentation of the keywords by adding, to the list of keywords, the definition of their usage in the UniProt Knowledgebase and additional information such as synonyms or relevant GO terms. The UniProt curators also contribute to the work of the GOA project (27) by assigning GO terms from each of the three ontologies, which describe the function of a protein, the processes it is involved in and where in the cell it is located. A major effort was started to continuously overhaul and standardize the annotation of post-translational modifications (PTMs). Furthermore, we introduced a new documentation file of the strains and their synonyms together with the mnemonic species identification code representing the biological source of the protein in the knowledgebase. These and other documents can be found at http://www.uniprot.org/support/documents.shtml.
Integration with other databases
UniProt provides cross-references to external data collections such as the underlying DNA sequence entries in the DDBJ/EMBL/GenBank nucleotide sequence databases, two-dimensional (2D) PAGE and 3D protein structure databases, various protein domain and family characterization databases, PTM databases, species-specific data collections, variant databases and disease databases. Many new cross-references were included over the last year. Accordingly, UniProt acts as a central hub for biomolecular information, now holding more than four million cross-references to more than 60 databases. A document listing all databases cross-referenced in UniProt (http://www.uniprot.org/support/docs/dbxref.shtml) is available and contains, for each database, a short description and the server URL.
In 2004, in collaboration with the Macromolecular Structure Database (MSD) group at EBI, UniProt improved its integration with structural databases by residue-level mapping of sequences from the PDB entries onto corresponding UniProt entries (28). This work led to an overhaul of the format of the UniProt cross-references to PDB to reflect the mappings. The UniProt-PDB mappings are available at ftp://ftp.ebi.ac.uk/pub/databases/msd/sifts/.
We also started to make use of Digital Object Identifiers (DOIs). The DOI system is used for identifying and exchanging intellectual property in the digital environment. We introduced the new optional identifier DOI in the RX line to store the DOI of a cited document.
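A reader of the flat file can pull the DOI out of an RX line with a few lines of code. The sketch below assumes that the RX line holds semicolon-terminated `Name=value` pairs after the `RX` line code; this layout, the function name and the identifier values in the usage note are illustrative assumptions, not taken from the text above.

```python
def parse_rx_line(line):
    """Split an RX line into a dict of identifier name -> value.

    Assumes the layout 'RX   Name=value; Name=value;' (an assumption).
    """
    if not line.startswith("RX"):
        raise ValueError("not an RX line")
    identifiers = {}
    for item in line[2:].split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after the final semicolon
        name, _, value = item.partition("=")
        identifiers[name.strip()] = value.strip()
    return identifiers
```

For a made-up line such as `RX   PubMed=12345678; DOI=10.1234/example.5678;`, the DOI is then available as `parse_rx_line(line).get("DOI")`, and the lookup returns `None` for citations that carry no DOI.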
Minimal redundancy
Many sequence databases contain, for a given protein sequence, separate entries that correspond to different literature reports. In the UniProt Knowledgebase we try as much as possible to merge all these data in order to minimize the redundancy of the database. Differences between sequencing reports due to splice variants, polymorphisms, disease-causing mutations, experimental sequence modifications or simply sequencing errors are indicated in the feature table of the corresponding UniProt entry.
The UniProt Knowledgebase is therefore by design non-redundant, with the goal of representing all known information regarding a particular protein. The definition of non-redundancy here differs from that employed in UniParc. In UniParc, all sequences that are 100% identical over their entire length are merged into a single entry, regardless of species. The UniProt Knowledgebase instead aims to describe in a single record all protein products derived from a certain gene (or genes, if the translation from different genes in a genome leads to indistinguishable proteins) from a certain species. It assigns an accession number not only to the whole record but also to each protein form derived by alternative splicing, proteolytic cleavage and post-translational modification; these isoform identifiers serve as accession numbers for the isoforms. The underlying reason for giving each of these isoforms a unique identifier is that each may have a different function or biological role, or may exist only during specific developmental stages or under certain environmental conditions, even when all these isoforms are derived from a single gene. Isoform identifiers have so far been introduced only for splice isoforms. Splice isoforms may differ considerably from one another, with potentially <50% sequence similarity between isoforms. The tool VARSPLIC (29), which is freely available, enables the recreation of all annotated splice variants from the feature table of a UniProt entry, or for the complete database. A FASTA-formatted file containing all splice variants annotated in UniProt can be downloaded for use with similarity search programs.
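The idea of recreating a splice isoform from feature-table annotation can be shown with a short sketch. It is not the VARSPLIC program: it assumes that each variation has already been parsed into a `(start, end, replacement)` tuple with 1-based inclusive coordinates on the canonical sequence, with an empty replacement standing for a missing segment. The function name and this input layout are hypothetical.

```python
def build_isoform(canonical, variations):
    """Apply sequence variations to a canonical sequence.

    variations: iterable of (start, end, replacement) tuples using 1-based,
    inclusive coordinates on the canonical sequence; they must not overlap.
    """
    sequence = canonical
    previous_start = len(canonical) + 1
    # Work from the C-terminus backwards so earlier coordinates stay valid.
    for start, end, replacement in sorted(variations, reverse=True):
        if not (1 <= start <= end < previous_start):
            raise ValueError("variation out of range or overlapping")
        sequence = sequence[:start - 1] + replacement + sequence[end:]
        previous_start = start
    return sequence
```

For the toy sequence `MKTAYIAKQR`, replacing positions 2 to 3 with `L` and deleting positions 8 to 10 gives `MLAYIA`. Applying the edits from the end of the sequence avoids having to shift coordinates after each change.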
Evidence attribution
The UniProt consortium emphasizes the use of an evidence attribution mechanism for protein annotation that will include, for all data, the data source, the types of evidence and methods for annotation. This is essential as the UniProt Knowledgebase will contain data automatically imported from the underlying nucleotide sequence databases, data imported from other databases, data from specific programs, the results of automatic annotation systems and, most importantly, expert manual curation. The implementation of evidence tags will allow the user to distinguish between these data sources and to easily identify particular classes of data of interest such as experimentally proven protein annotation. Evidence tags for the annotation present in UniProt/TrEMBL records are already available in the UniProt XML distribution.
| THE UniProt REFERENCE DATABASES |
|---|
Automatic procedures have been developed to create three UniRef databases, namely UniRef100, UniRef90 and UniRef50, from the UniProt Knowledgebase and UniParc as representative protein sequence databases with high information content. The databases provide complete coverage of sequence space while hiding redundant sequences from view. The non-redundancy facilitates sequence merging in the UniProt Knowledgebase (based on UniRef100) and allows faster sequence similarity searches (by using UniRef90 and UniRef50).
UniRef100 provides a comprehensive non-redundant sequence collection clustered by sequence identity. UniRef merges sequences automatically across different species and also adds some data from UniParc, such as translations from highly unstable gene predictions; while merging in the Knowledgebase is restricted to curator-assisted inclusion of reliable and stable sequence data for a single species. UniRef100 is based on all UniProt Knowledgebase records, as well as UniParc records that represent sequences deemed over-represented in the Knowledgebase, DDBJ/EMBL/GenBank Whole Genome Shotgun coding sequence translations, Ensembl protein translations from various organisms, as well as IPI data. The production of UniRef100 begins with the clustering of all records by sequence identity. Identical sequences and subfragments are presented as a single UniRef100 entry, containing the accession numbers of all merged entries, and the protein sequence. The UniRef100 release 2.6 from September 2004 contained 2 611 612 records derived from the corresponding UniProt knowledgebase and UniParc releases.
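The merging of identical sequences and subfragments can be sketched as below. This is a toy illustration with quadratic running time, not the UniRef100 production procedure; the function name and the choice of the longest sequence as cluster representative are assumptions.

```python
def cluster_identical(records):
    """Merge identical sequences and subfragments into single clusters.

    records: dict accession -> sequence.
    Returns a list of (representative sequence, sorted accessions).
    """
    clusters = []  # each item is [representative sequence, [accessions]]
    # Longest sequences first, so a fragment always meets its full-length
    # parent before it could start a cluster of its own.
    ordered = sorted(records.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for accession, sequence in ordered:
        for cluster in clusters:
            if sequence in cluster[0]:  # identical or contained fragment
                cluster[1].append(accession)
                break
        else:
            clusters.append([sequence, [accession]])
    return [(seq, sorted(accs)) for seq, accs in clusters]
```

Each resulting cluster keeps one sequence together with the accession numbers of all merged entries, mirroring the description of a UniRef100 record above.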
UniRef90 and UniRef50 are built from UniRef100 using the CD-HIT algorithm (30) to provide non-redundant sequence collections for the scientific user community to perform faster homology searches. All records from all source organisms with mutual sequence identity of >90 or >50%, respectively, are merged into a single record that links to the corresponding UniProt Knowledgebase records. UniRef90 and UniRef50 yield a size reduction of 40 and 65%, respectively. A sample UniRef90 report can be found at http://www.uniprot.org/entry/uniref90_P57727.
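The general shape of threshold-based clustering can be sketched as a greedy procedure: sequences are processed from longest to shortest, and each one either joins the first representative it resembles closely enough or becomes a new representative. The sketch is not CD-HIT itself. In particular, the `identity` function is a crude stand-in built on Python's `difflib` rather than a proper alignment, and all function names are hypothetical.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Rough identity: matched residues divided by the shorter length.

    A stand-in for an alignment-based identity, used for illustration only.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def greedy_cluster(sequences, threshold):
    """Greedy incremental clustering.

    sequences: dict name -> sequence; threshold: e.g. 0.9 or 0.5.
    Returns dict representative name -> list of member names.
    """
    representatives = []
    clusters = {}
    ordered = sorted(sequences.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, sequence in ordered:
        for rep in representatives:
            if identity(sequences[rep], sequence) > threshold:
                clusters[rep].append(name)
                break
        else:
            # No representative is similar enough: start a new cluster.
            representatives.append(name)
            clusters[name] = [name]
    return clusters
```

Running the same procedure with thresholds of 0.9 and 0.5 gives the two resolutions described above. Because only the representatives are compared with each new sequence, the number of comparisons stays well below an all-against-all search.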
| PRACTICAL INFORMATION |
|---|
Interactive access and linking to UniProt
The most efficient and user-friendly way to browse the UniProt databases is via the UniProt website (http://www.uniprot.org), which serves as a portal to all aspects of the UniProt project, and contains detailed documentation about the background and scope of UniProt. It provides database query and data-mining mechanisms, user support and communication, file download capabilities, and links to consortium resources. The UniProt Help Desk (help{at}uniprot.org) provides access to UniProt curators and database maintainers.
The standard way of linking to UniProt, displaying the UniProt basic view as HTML, is to append the entry name or accession number to http://www.uniprot.org/entry/.
Examples:
- http://www.uniprot.org/entry/cyc_human
- http://www.uniprot.org/entry/P99999
- http://www.uniprot.org/entry/UniRef100_P99999
- http://www.uniprot.org/entry/UniRef90_P99999
- http://www.uniprot.org/entry/UniRef50_P99999
- http://www.uniprot.org/entry/UPI00000002E4
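The linking scheme shown in the examples is simple enough to wrap in a helper. The sketch below only concatenates strings according to the pattern given above; the function names are hypothetical, and whether the resulting URLs still resolve is not something the code checks.

```python
BASE_URL = "http://www.uniprot.org/entry/"


def entry_url(identifier):
    """Build a link from an entry name, accession number or UniParc id."""
    identifier = identifier.strip()
    if not identifier or "/" in identifier or " " in identifier:
        raise ValueError("identifier must be a single non-empty token")
    return BASE_URL + identifier


def uniref_url(level, accession):
    """Build a link to a UniRef cluster at identity level 100, 90 or 50."""
    if level not in (100, 90, 50):
        raise ValueError("level must be 100, 90 or 50")
    return entry_url(f"UniRef{level}_{accession}")
```

For instance, `entry_url("cyc_human")` and `uniref_url(90, "P99999")` reproduce the first and fourth links in the list above.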
UniProt data availability and submission
UniProt, UniParc and UniRef entries, with supporting documentation, can be retrieved in various formats (Swiss-Prot/TrEMBL flat file, FASTA, XML) via anonymous FTP from ftp://ftp.uniprot.org/pub/. New UniProt, UniParc and UniRef releases are produced every two weeks.
UniProt accepts submissions of new sequences, entry updates and corrections, and annotated bibliographic information for protein entries. Directions for submission are available at http://www.uniprot.org/support/submissions.shtml.
| CONCLUSIONS |
|---|
Complete and up-to-date databases of biological knowledge are vital for information-dependent biological and biotechnological research. With the rapid accumulation of genome sequences for many organisms, attention is turning to the identification and functions of proteins encoded by these genomes. With the increasing volume and variety of protein sequences and functional information, UniProt serves as a central resource of protein sequence and function, providing a cornerstone for scientists active in modern biological research. The resource provides rich, consistent and non-redundant protein information by combining reliable automated annotation approaches with literature-based expert manual curation.
| ACKNOWLEDGEMENTS |
|---|
UniProt is mainly supported by the National Institutes of Health (NIH) grant 1 U01 HG02712-01. Minor support for the EBI's involvement in UniProt comes from the two European Union contracts BioBabel (QLRT-2000-00981) and TEMBLOR (QLRI-2001-00015) and from the NIH grant 1R01HGO2273-01. UniProt/Swiss-Prot activities at the SIB are supported by the Swiss Federal Government through the Federal Office of Education and Science. PIR activities are also supported by the National Science Foundation (NSF) grants DBI-0138188 and ITR-0205470.
| Notes |
|---|
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use permissions, please contact journals.permissions{at}oupjournals.org.
| REFERENCES |
|---|
- Boeckmann,B., Bairoch,A., Apweiler,R., Blatter,M., Estreicher,A., Gasteiger,E., Martin,M.J., Michoud,K., O'Donovan,C., Phan,I. et al. (2003) The Swiss-Prot protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res., 31, 365-370.
- Wu,C.H., Yeh,L.-S.L., Huang,H., Arminski,L., Castro-Alvear,J., Chen,Y., Hu,Z., Kourtesis,P., Ledley,R.S., Suzek,B.E. et al. (2003) The Protein Information Resource. Nucleic Acids Res., 31, 345-347.
- Apweiler,R., Bairoch,A., Wu,C.H., Barker,W.C., Boeckmann,B., Ferro,S., Gasteiger,E., Huang,H., Lopez,R., Magrane,M. et al. (2004) UniProt: the Universal Protein knowledgebase. Nucleic Acids Res., 32, D115-D119.
- Kulikova,T., Aldebert,P., Althorpe,N., Baker,W., Bates,K., Browne,P., van den Broek,A., Cochrane,G., Duggan,K., Eberhardt,R. et al. (2004) The EMBL Nucleotide Sequence Database. Nucleic Acids Res., 32, D27-D30.
- Westbrook,J., Feng,Z., Chen,L., Yang,H. and Berman,H. (2003) The Protein Data Bank and structural genomics. Nucleic Acids Res., 31, 489-491.
- Leinonen,R., Diez,F.G., Binns,D., Fleischmann,W., Lopez,R. and Apweiler,R. (2004) UniProt Archive. Bioinformatics, 20, 3236-3237.
- Hubbard,T., Barker,D., Birney,E., Cameron,G., Chen,Y., Clark,L., Cox,T., Cuff,J., Curwen,V., Down,T. et al. (2002) The Ensembl genome database project. Nucleic Acids Res., 30, 38-41.
- Kersey,P.J., Duarte,J., Williams,A., Karavidopoulou,Y., Birney,E. and Apweiler,R. (2004) The International Protein Index: an integrated database for proteomics experiments. Proteomics, 4, 1985-1988.
- Pruitt,K. and Maglott,D. (2001) RefSeq and LocusLink: NCBI gene-centered resources. Nucleic Acids Res., 29, 137-140.
- FlyBase Consortium (2003) The FlyBase database of the Drosophila genome projects and community literature. Nucleic Acids Res., 31, 172-175.
- Harris,T., Lee,R., Schwarz,E., Bradnam,K., Lawson,D., Chen,W., Blasier,D., Kenny,E., Cunningham,F., Kishore,R. et al. (2003) WormBase: a cross-species database for comparative genomics. Nucleic Acids Res., 31, 133-137.
- Apweiler,R., Bairoch,A. and Wu,C.H. (2004) Protein sequence databases. Curr. Opin. Chem. Biol., 8, 76-80.
- Mulder,N., Apweiler,R., Attwood,T., Bairoch,A., Barrell,D., Bateman,A., Binns,D., Biswas,M., Bradley,P., Bork,P. et al. (2003) The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res., 31, 315-318.
- Bateman,A., Birney,E., Cerruti,L., Durbin,R., Etwiller,L., Eddy,S.R., Griffiths-Jones,S., Howe,K.L., Marshall,M. and Sonnhammer,E.L.L. (2002) The Pfam Protein Families Database. Nucleic Acids Res., 30, 276-280.
- Hulo,N., Sigrist,C.J.A., Le Saux,V., Langendijk-Genevaux,P.S., Bordoli,L., Gattiker,A., De Castro,E., Bucher,P. and Bairoch,A. (2004) Recent improvements to the PROSITE database. Nucleic Acids Res., 32, D134-D137.
- Attwood,T.K., Bradley,P., Flower,D.R., Gaulton,A., Maudling,N., Mitchell,A.L., Moulton,G., Nordle,A., Paine,K., Taylor,P. et al. (2003) PRINTS and its automatic supplement, preprints. Nucleic Acids Res., 31, 400-402.
- Servant,F., Bru,C., Carrere,S., Courcelle,E., Couzy,J., Peyruc,D. and Kahn,D. (2002) Prodom: automated clustering of homologous domains. Brief. Bioinformatics, 3, 246-251.
- Letunic,I., Goodstadt,L., Dickens,N.J., Doerks,T., Schultz,J., Mott,R., Ciccarelli,F., Copley,R.R., Ponting,C.P. and Bork,P. (2002) Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res., 30, 242-244.
- Wu,C.H., Nikolskaya,A., Huang,H., Yeh,L.-S., Natale,D., Vinayaka,C.R., Hu,Z., Mazumder,R., Kumar,S., Kourtesis,P. et al. (2004) PIRSF family classification system at the Protein Information Resource. Nucleic Acids Res., 32, D112-D114.
- Gough,J., Karplus,K., Hughey,R. and Chothia,C. (2001) Assignment of homology to genome sequences using a library of hidden markov models that represent all proteins of known structure. J. Mol. Biol., 313, 903-919.
- Haft,D.H., Loftus,B.J., Richardson,D.L., Yang,F., Eisen,J.A., Paulsen,I.T. and White,O. (2001) TIGRFAMs: a protein family resource for the functional identification of proteins. Nucleic Acids Res., 29, 41-43.
- Fleischmann,W., Moeller,S., Gateau,A. and Apweiler,R. (1999) A novel method for automatic and reliable functional annotation. Bioinformatics, 15, 228-233.
- Kretschmann,E., Fleischmann,W. and Apweiler,R. (2001) Automatic rule generation for protein annotation with the C4.5 data mining algorithm applied on SWISS-PROT. Bioinformatics, 17, 920-926.
- Wieser,D., Kretschmann,E. and Apweiler,R. (2004) Filtering erroneous protein annotation. Bioinformatics, 20, i342-i347.
- Wu,C.H., Huang,H., Yeh,L.-S. and Barker,W.C. (2003) Protein family classification and functional annotation. Comput. Biol. Chem., 27, 37-47.
- Gattiker,A., Michoud,K., Rivoire,C., Auchincloss,A.H., Coudert,E., Lima,T., Kersey,P., Pagni,M., Sigrist,C.J.A., Lachaize,C. et al. (2003) Automatic annotation of microbial proteomes in Swiss-Prot. Comput. Biol. Chem., 27, 49-58.
- Camon,E., Magrane,M., Barrell,D., Lee,V., Dimmer,E., Maslen,J., Binns,D., Harte,N., Lopez,R. and Apweiler,R. (2004) The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology. Nucleic Acids Res., 32, D262-D266.
- Velankar,S., McNeil,P., Mittard-Runte,V., Suarez,A., Barrell,D., Apweiler,R. and Henrick,K. (2005) E-MSD: an integrated data resource for bioinformatics. Nucleic Acids Res., 33, 262-265.
- Kersey,P., Hermjakob,H. and Apweiler,R. (2000) VARSPLIC: alternatively-spliced protein sequences derived from Swiss-Prot and TrEMBL. Bioinformatics, 11, 1048-1049.
- Li,W., Jaroszewski,L. and Godzik,A. (2002) Tolerating some redundancy significantly speeds up clustering of large protein databases. Bioinformatics, 18, 77-82.
||||
![]() |
A. A. D'Aloisio, J. C. Schroeder, K. E. North, C. Poole, S. L. West, G. S. Travlos, and D. D. Baird IGF-I and IGFBP-3 Polymorphisms in Relation to Circulating Levels among African American and Caucasian Women Cancer Epidemiol. Biomarkers Prev., March 1, 2009; 18(3): 954 - 966. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Torii, Z. Hu, C. H. Wu, and H. Liu BioTagger-GM: A Gene/Protein Name Recognition System J. Am. Med. Inform. Assoc., March 1, 2009; 16(2): 247 - 255. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Yong, B. Tolner, S. Nagl, R.B. Pedley, K. Chester, A.J. Green, A. Mayer, S. Sharma, and R. Begent Data standards for minimum information collection for antibody therapy experiments Protein Eng. Des. Sel., March 1, 2009; 22(3): 221 - 224. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Chelala, A. Khan, and N. R Lemoine SNPnexus: a web database for functional annotation of newly discovered and public domain single nucleotide polymorphisms Bioinformatics, March 1, 2009; 25(5): 655 - 661. [Abstract] [Full Text] [PDF] |
||||
![]() |
E. Chautard, L. Ballut, N. Thierry-Mieg, and S. Ricard-Blum MatrixDB, a database focused on extracellular protein-protein and protein-carbohydrate interactions Bioinformatics, March 1, 2009; 25(5): 690 - 691. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Toll-Riera, N. Bosch, N. Bellora, R. Castelo, L. Armengol, X. Estivill, and M. Mar Alba Origin of Primate Orphan Genes: A Comparative Genomics Approach Mol. Biol. Evol., March 1, 2009; 26(3): 603 - 612. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Dam, B. S. Laursen, J. H. Ornfelt, B. Jochimsen, H. H. Staerfeldt, C. Friis, K. Nielsen, N. Goffard, S. Besenbacher, L. Krusell, et al. The Proteome of Seed Development in the Model Legume Lotus japonicus Plant Physiology, March 1, 2009; 149(3): 1325 - 1340. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Baumbach, A. Tauch, and S. Rahmann Towards the integrated analysis, visualization and reconstruction of microbial gene regulatory networks Brief Bioinform, January 1, 2009; 10(1): 75 - 83. [Abstract] [Full Text] [PDF] |
||||
![]() |
F. Kiefer, K. Arnold, M. Kunzli, L. Bordoli, and T. Schwede The SWISS-MODEL Repository and associated resources Nucleic Acids Res., January 1, 2009; 37(suppl_1): D387 - D392. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Kandasamy, S. Keerthikumar, R. Goel, S. Mathivanan, N. Patankar, B. Shafreen, S. Renuse, H. Pawar, Y. L. Ramachandra, P. K. Acharya, et al. Human Proteinpedia: a unified discovery resource for proteomics research Nucleic Acids Res., January 1, 2009; 37(suppl_1): D773 - D781. [Abstract] [Full Text] [PDF] |
||||
![]() |
U. Pieper, N. Eswar, B. M. Webb, D. Eramian, L. Kelly, D. T. Barkan, H. Carter, P. Mankoo, R. Karchin, M. A. Marti-Renom, et al. MODBASE, a database of annotated comparative protein structure models and associated resources Nucleic Acids Res., January 1, 2009; 37(suppl_1): D347 - D354. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Driscoll, M. D. Dyer, T. M. Murali, and B. W. Sobral PIG--the pathogen interaction gateway Nucleic Acids Res., January 1, 2009; 37(suppl_1): D647 - D650. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. J. Sherman, T. Martin, M. Nikolski, C. Cayla, J.-L. Souciet, P. Durrens, and for the Genolevures Consortium Genolevures: protein families and synteny among complete hemiascomycetous yeast proteomes and genomes Nucleic Acids Res., January 1, 2009; 37(suppl_1): D550 - D554. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. Shemesh, A. Toporik, Z. Levine, I. Hecht, G. Rotman, A. Wool, D. Dahary, E. Gofer, Y. Kliger, M. A. Soffer, et al. Discovery and Validation of Novel Peptide Agonists for G-protein-coupled Receptors J. Biol. Chem., December 12, 2008; 283(50): 34643 - 34649. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. Sompallae, S. Gastaldello, S. Hildebrand, N. Zinin, G. Hassink, K. Lindsten, J. Haas, B. Persson, and M. G. Masucci Epstein-Barr Virus Encodes Three Bona Fide Ubiquitin-Specific Proteases J. Virol., November 1, 2008; 82(21): 10477 - 10486. [Abstract] [Full Text] [PDF] |
||||
![]() |
E. Akiva, Z. Itzhaki, and H. Margalit Built-in loops allow versatility in domain-domain interactions: Lessons from self-interacting domains PNAS, September 9, 2008; 105(36): 13292 - 13297. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. R. Grosso, A. Q. Gomes, N. L. Barbosa-Morais, S. Caldeira, N. P. Thorne, G. Grech, M. von Lindern, and M. Carmo-Fonseca Tissue-specific splicing factor gene expression signatures Nucleic Acids Res., September 1, 2008; 36(15): 4823 - 4832. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Bromberg and B. Rost Comprehensive in silico mutagenesis highlights functionally important residues in proteins Bioinformatics, August 15, 2008; 24(16): i207 - i212. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Liu, A. Nauta, C. Francke, and R. J. Siezen Comparative Genomics of Enzymes in Flavor-Forming Pathways from Amino Acids in Lactic Acid Bacteria Appl. Envir. Microbiol., August 1, 2008; 74(15): 4590 - 4600. [Full Text] [PDF] |
||||
![]() |
M. Michaut, S. Kerrien, L. Montecchi-Palazzi, F. Chauvat, C. Cassier-Chauvat, J.-C. Aude, P. Legrain, and H. Hermjakob InteroPORC: automated inference of highly conserved protein interaction networks Bioinformatics, July 15, 2008; 24(14): 1625 - 1631. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. A. Capra and M. Singh Characterization and prediction of residues determining protein functional specificity Bioinformatics, July 1, 2008; 24(13): 1473 - 1480. [Abstract] [Full Text] [PDF] |
||||
![]() |
F. Lemoine, B. Labedan, and C. Froidevaux GenoQuery: a new querying module for functional annotation in a genomic warehouse Bioinformatics, July 1, 2008; 24(13): i322 - i329. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y.-H. Chen, C.-K. Liu, S.-C. Chang, Y.-J. Lin, M.-F. Tsai, Y.-T. Chen, and A. Yao GenoWatch: a disease gene mining browser for association study Nucleic Acids Res., July 1, 2008; 36(suppl_2): W336 - W340. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Lee, G.-S. Yi, and J. C. Park E3Miner: a text mining tool for ubiquitin-protein ligases Nucleic Acids Res., July 1, 2008; 36(suppl_2): W416 - W422. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. E. Tusnady, L. Kalmar, H. Hegyi, P. Tompa, and I. Simon TOPDOM: database of domains and motifs with conservative location in transmembrane proteins Bioinformatics, June 15, 2008; 24(12): 1469 - 1470. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Hackenberg and R. Matthiesen Annotation-Modules: a tool for finding significant combinations of multisource annotations for gene lists Bioinformatics, June 1, 2008; 24(11): 1386 - 1393. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. C. Cushman, R. L. Tillett, J. A. Wood, J. M. Branco, and K. A. Schlauch Large-scale mRNA expression profiling in the common ice plant, Mesembryanthemum crystallinum, performing C3 photosynthesis and Crassulacean acid metabolism (CAM) J. Exp. Bot., May 1, 2008; 59(7): 1875 - 1894. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. R. Southey, J. V. Sweedler, and S. L. Rodriguez-Zas Prediction of neuropeptide cleavage sites in insects Bioinformatics, March 15, 2008; 24(6): 815 - 825. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. Jerg and U. Gerischer Relevance of nucleotides of the PcaU binding site from Acinetobacter baylyi Microbiology, March 1, 2008; 154(3): 756 - 766. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Xie, G. Onsongo, J. Popko, E. P. de Jong, J. Cao, J. V. Carlis, R. J. Griffin, N. L. Rhodus, and T. J. Griffin Proteomics Analysis of Cells in Whole Saliva from Oral Cancer Patients via Value-added Three-dimensional Peptide Fractionation and Tandem Mass Spectrometry Mol. Cell. Proteomics, March 1, 2008; 7(3): 486 - 498. [Abstract] [Full Text] [PDF] |
||||
![]() |
I. V. Tetko, I. V. Rodchenkov, M. C. Walter, T. Rattei, and H.-W. Mewes Beyond the 'best' match: machine learning annotation of protein sequences by integration of different sources of information Bioinformatics, March 1, 2008; 24(5): 621 - 628. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. N.I. Pang, K. Lin, M. A. Wouters, J. Heringa, and R. A. George Identifying foldable regions in protein sequence from the hydrophobic signal Nucleic Acids Res., February 2, 2008; 36(2): 578 - 588. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Cote-Martin, C. Moody, A. Fradet-Turcotte, C. M. D'Abramo, M. Lehoux, S. Joubert, G. G. Poirier, B. Coulombe, L. A. Laimins, and J. Archambault Human Papillomavirus E1 Helicase Interacts with the WD Repeat Protein p80 To Promote Maintenance of the Viral Genome in Keratinocytes J. Virol., February 1, 2008; 82(3): 1271 - 1283. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. E. Tusnady, L. Kalmar, and I. Simon TOPDB: topology data bank of transmembrane proteins Nucleic Acids Res., January 11, 2008; 36(suppl_1): D234 - D239. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Sprenger, J. Lynn Fink, S. Karunaratne, K. Hanson, N. A. Hamilton, and R. D. Teasdale LOCATE: a mammalian protein subcellular localization database Nucleic Acids Res., January 11, 2008; 36(suppl_1): D230 - D233. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. J. Ulintz, B. Bodenmiller, P. C. Andrews, R. Aebersold, and A. I. Nesvizhskii Investigating MS2/MS3 Matching Statistics: A Model For Coupling Consecutive Stage Mass Spectrometry Data For Increased Peptide Identification Confidence Mol. Cell. Proteomics, January 1, 2008; 7(1): 71 - 87. [Abstract] [Full Text] [PDF] |
||||
![]() |
V. Hollich and E. L.L. Sonnhammer PfamAlyzer: domain-centric homology search Bioinformatics, December 15, 2007; 23(24): 3382 - 3383. [Abstract] [Full Text] [PDF] |
||||
![]() |
F. S. Domingues, J. Rahnenfuhrer, and T. Lengauer Conformational analysis of alternative protein structures Bioinformatics, December 1, 2007; 23(23): 3131 - 3138. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. Spudich, X. M. Fernandez-Suarez, and E. Birney Genome browsing with Ensembl: a practical overview Brief Funct Genomic Proteomic, October 29, 2007; (2007) elm025v1. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Suderman and M. Hallett Tools for visually exploring biological networks Bioinformatics, October 15, 2007; 23(20): 2651 - 2659. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Zucker, A. Zivelin, M. Landau, O. Salomon, G. Kenet, F. Bauduer, M. Samama, J. Conard, M.-H. Denninger, A.-S. Hani, et al. Characterization of seven novel mutations causing factor XI deficiency Haematologica, October 1, 2007; 92(10): 1375 - 1380. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. A. Cox, S. Birnbaum, M. C. Mahaney, D. L. Rainwater, J. T. Williams, and J. L. VandeBerg Identification of Promoter Variants in Baboon Endothelial Lipase That Regulate High-Density Lipoprotein Cholesterol Levels Circulation, September 4, 2007; 116(10): 1185 - 1195. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Fisher, C. Hedeler, K. Wolstencroft, H. Hulme, H. Noyes, S. Kemp, R. Stevens, and A. Brass A systematic strategy for large-scale analysis of genotype phenotype correlations: identification of candidate genes involved in African trypanosomiasis Nucleic Acids Res., August 20, 2007; (2007) gkm623v1. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. G. Glanville, D. Kirshner, N. Krishnamurthy, and K. Sjolander Berkeley Phylogenomics Group web servers: resources for structural phylogenomic analysis Nucleic Acids Res., July 13, 2007; 35(suppl_2): W27 - W32. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Horton, K.-J. Park, T. Obayashi, N. Fujita, H. Harada, C.J. Adams-Collier, and K. Nakai WoLF PSORT: protein localization predictor Nucleic Acids Res., July 13, 2007; 35(suppl_2): W585 - W587. [Abstract] [Full Text] [PDF] |
||||
![]() |
N. E. Davey, R. J. Edwards, and D. C. Shields The SLiMDisc server: short, linear motif discovery in proteins Nucleic Acids Res., July 13, 2007; 35(suppl_2): W455 - W459. [Abstract] [Full Text] [PDF] |
||||
![]() |
N. Goffard and G. Weiller PathExpress: a web-based tool to identify relevant pathways in gene expression data Nucleic Acids Res., July 13, 2007; 35(suppl_2): W176 - W181. [Abstract] [Full Text] [PDF] |
||||
![]() |
J.-W. Fan and C. Friedman Semantic Classification of Biomedical Concepts Using Distributional Similarity J. Am. Med. Inform. Assoc., July 1, 2007; 14(4): 467 - 477. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. D. Dyer, T. M. Murali, and B. W. Sobral Computational prediction of host-pathogen protein protein interactions Bioinformatics, July 1, 2007; 23(13): i159 - i166. [Abstract] [Full Text] [PDF] |
||||
![]() |
N. G. Faux, G. A. Huttley, K. Mahmood, G. I. Webb, M. Garcia de la Banda, and J. C. Whisstock RCPdb: An evolutionary classification and codon usage database for repeat-containing proteins Genome Res., July 1, 2007; 17(7): 1118 - 1127. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Bromberg and B. Rost SNAP: predict effect of non-synonymous polymorphisms on function Nucleic Acids Res., June 28, 2007; 35(11): 3823 - 3835. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. Zheng, A. Frankish, R. Baertsch, P. Kapranov, A. Reymond, S. W. Choo, Y. Lu, F. Denoeud, S. E. Antonarakis, M. Snyder, et al. Pseudogenes in the ENCODE regions: Consensus annotation, analysis of transcription, and evolution Genome Res., June 1, 2007; 17(6): 839 - 851. [Abstract] [Full Text] [PDF] |
||||
![]() |
W. L. Connors, J. Jokinen, D. J. White, J. S. Puranen, P. Kankaanpaa, P. Upla, M. Tulla, M. S. Johnson, and J. Heino Two Synergistic Activation Mechanisms of {alpha}2beta1 Integrin-mediated Collagen Binding J. Biol. Chem., May 11, 2007; 282(19): 14675 - 14683. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. J. Gaulton, K. L. Mohlke, and T. J. Vision A computational system to select candidate genes for complex human traits Bioinformatics, May 1, 2007; 23(9): 1132 - 1140. [Abstract] [Full Text] [PDF] |
||||
![]() |
I. E. Gentle, A. J. Perry, F. H. Alcock, V. A. Likic, P. Dolezal, E. T. Ng, A. W. Purcell, M. McConnville, T. Naderer, A.-L. Chanez, et al. Conserved Motifs Reveal Details of Ancestry and Structure in the Small TIM Chaperones of the Mitochondrial Intermembrane Space Mol. Biol. Evol., May 1, 2007; 24(5): 1149 - 1160. [Abstract] [Full Text] [PDF] |
||||
![]() |
N. P. Manes, J. K. Gustin, J. Rue, H. M. Mottaz, S. O. Purvine, A. D. Norbeck, M. E. Monroe, J. S. D. Zimmer, T. O. Metz, J. N. Adkins, et al. Targeted Protein Degradation by Salmonella under Phagosome-mimicking Culture Conditions Investigated Using Comparative Peptidomics Mol. Cell. Proteomics, April 1, 2007; 6(4): 717 - 727. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. L. Strope, S. D. Scott, and E. N. Moriyama indel-Seq-Gen: A New Protein Family Simulator Incorporating Domains, Motifs, and Indels Mol. Biol. Evol., March 1, 2007; 24(3): 640 - 649. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. Thilakaraj, K. Raghunathan, S. Anishetty, and G. Pennathur In silico identification of putative metal binding motifs Bioinformatics, February 1, 2007; 23(3): 267 - 271. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. S. Kaminker, Y. Zhang, A. Waugh, P. M. Haverty, B. Peters, D. Sebisanovic, J. Stinson, W. F. Forrest, J. F. Bazan, S. Seshagiri, et al. Distinguishing Cancer-Associated Missense Mutations from Common Polymorphisms Cancer Res., January 15, 2007; 67(2): 465 - 473. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. S. Wishart, D. Tzur, C. Knox, R. Eisner, A. C. Guo, N. Young, D. Cheng, K. Jewell, D. Arndt, S. Sawhney, et al. HMDB: the Human Metabolome Database Nucleic Acids Res., January 12, 2007; 35(suppl_1): D521 - D526. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Andreeva, A. Prlic, T. J. P. Hubbard, and A. G. Murzin SISYPHUS--structural alignments for proteins with non-trivial relationships Nucleic Acids Res., January 12, 2007; 35(suppl_1): D253 - D259. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. L. Holliday, D. E. Almonacid, G. J. Bartlett, N. M. O'Boyle, J. W. Torrance, P. Murray-Rust, J. B. O. Mitchell, and J. M. Thornton MACiE (Mechanism, Annotation and Classification in Enzymes): novel tools for searching catalytic mechanisms Nucleic Acids Res., January 12, 2007; 35(suppl_1): D515 - D520. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. L. Childs, J. P. Hamilton, W. Zhu, E. Ly, F. Cheung, H. Wu, P. D. Rabinowicz, C. D. Town, C. R. Buell, and A. P. Chan The TIGR Plant Transcript Assemblies database Nucleic Acids Res., January 12, 2007; 35(suppl_1): D846 - D851. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. M. Gromiha, Y. Yabuki, S. Kundu, S. Suharnan, and M. Suwa TMBETA-GENOME: database for annotated {beta}-barrel membrane proteins in genomic sequences Nucleic Acids Res., January 12, 2007; 35(suppl_1): D314 - D316. [Abstract] [Full Text] [PDF] |
||||
![]() |
N. D'Agostino, M. Aversano, L. Frusciante, and M. L. Chiusano TomatEST database: in silico exploitation of EST data to explore expression patterns in tomato species Nucleic Acids Res., January 12, 2007; 35(suppl_1): D901 - D905. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. Lopez, A. Valencia, and M. Tress FireDB--a database of functionally important residues from proteins of known structure Nucleic Acids Res., January 12, 2007; 35(suppl_1): D219 - D223. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. H. Greene, T. E. Lewis, S. Addou, A. Cuff, T. Dallman, M. Dibley, O. Redfern, F. Pearl, R. Nambudiry, A. Reid, et al. The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution Nucleic Acids Res., January 12, 2007; 35(suppl_1): D291 - D297. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. L. Riley, T. Schmidt, I. I. Artamonova, C. Wagner, A. Volz, K. Heumann, H.-W. Mewes, and D. Frishman PEDANT genome database: 10 years online Nucleic Acids Res., January 12, 2007; 35(suppl_1): D354 - D357. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Kidron, S. Repo, M. S. Johnson, and T. A. Salminen Functional Classification of Amino Acid Decarboxylases from the Alanine Racemase Structural Family by Phylogenetic Studies Mol. Biol. Evol., January 1, 2007; 24(1): 79 - 89. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||




















