The Universal Protein Resource (UniProt): an expanding universe of protein information
Department of Biochemistry and Molecular Biology, Georgetown University Medical Center, 3900 Reservoir Road, NW, Washington, DC 20057-1414, USA
¹The EMBL Outstation, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
²Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211 Geneva 4, Switzerland
³National Biomedical Research Foundation, 3900 Reservoir Road, NW, Washington, DC 20057-1414, USA
*To whom correspondence should be addressed. Tel: +44 1223 494435; Fax: +44 1223 494468; Email: apweiler{at}ebi.ac.uk
Received September 17, 2005. Revised October 31, 2005. Accepted October 31, 2005.
The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90) or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. UniProt databases continue to grow in size and in availability of information. Recent and upcoming changes to database contents, formats, controlled vocabularies and services are described. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. UniProt databases can be accessed online at http://www.uniprot.org or downloaded at ftp://ftp.uniprot.org/pub/databases/.
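The idea of compressing sequence space by merging sequences at fixed identity thresholds can be illustrated with a toy sketch. It is not the UniRef production procedure: the function names `identity` and `greedy_cluster`, the ungapped position-by-position identity measure and the example sequences are all assumptions made for illustration, whereas real clustering relies on alignment-based identity.

```python
def identity(a: str, b: str) -> float:
    """Toy identity: matching positions (left-aligned, no gaps) / longer length.

    Assumption for illustration only; real pipelines use alignments.
    """
    longer = max(len(a), len(b))
    if longer == 0:
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / longer


def greedy_cluster(seqs: dict, threshold: float) -> list:
    """Greedily group sequence IDs; the first ID in each cluster is its representative."""
    clusters = []
    # Visit longest sequences first (ties broken by ID) so that each
    # representative is the longest member of its cluster.
    for sid in sorted(seqs, key=lambda s: (-len(seqs[s]), s)):
        for cluster in clusters:
            if identity(seqs[cluster[0]], seqs[sid]) >= threshold:
                cluster.append(sid)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters.append([sid])
    return clusters


# Hypothetical sequences, invented for this example.
seqs = {
    "A": "MKTAYIAKQR",
    "B": "MKTAYIAKQR",  # identical to A
    "C": "MKTAYIAKQL",  # 9 of 10 positions match A
    "D": "MKTAYLLLLL",  # 5 of 10 positions match A
    "E": "GGGGGGGGGG",  # unrelated
}

for threshold in (1.0, 0.9, 0.5):
    print(threshold, greedy_cluster(seqs, threshold))
```

Lowering the threshold from 1.0 to 0.9 to 0.5 reduces the five toy sequences to four, three and two clusters respectively, which is the sense in which a search against cluster representatives covers the same sequence space with fewer comparisons.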