Nucleic Acids Research, 2002, Vol. 30, No. 1 245-248
© 2002 Oxford University Press
The Protein Data Bank: unifying the archive
Rutgers, The State University of New Jersey, Department of Chemistry, 610 Taylor Road, Piscataway, NJ 08854-8087, USA, 1National Institute of Standards and Technology, Route 270, Quince Orchard Road, Gaithersburg, MD 20899, USA, 2San Diego Supercomputer Center, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0505, USA, 3Department of Pharmacology, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0500, USA and 4The Burnham Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
Received September 25, 2001; Accepted October 1, 2001.
| ABSTRACT |
|---|
The Protein Data Bank (PDB; http://www.pdb.org/) is the single worldwide archive of structural data of biological macromolecules. This paper describes the progress that has been made in validating all data in the PDB archive and in releasing a uniform archive for the community. We have now produced a collection of mmCIF data files for the PDB archive (ftp://beta.rcsb.org/pub/pdb/uniformity/data/mmCIF/). A utility application that converts the mmCIF data files to the PDB format (called CIFTr) has also been released to provide support for existing software.
| INTRODUCTION |
|---|
The PDB is the single archive of biological macromolecular structures (1,2). From July 1, 2000 to June 30, 2001, a total of 3148 structures were deposited with the PDB. Full data processing of entries by the Research Collaboratory for Structural Bioinformatics (RCSB), including author revisions, averages less than 2 weeks. The number and complexity of the entries continue to increase. In 1999, 2670 structures were deposited containing 1 million residues; in 2000, 2995 structures were deposited containing 1.3 million residues. This is a 30% year-on-year increase in deposited residues.
The archival data are accessed and distributed through the primary Web site at UCSD and through mirrors located throughout the world (Table 1). The PDB receives an average of 115 000 hits per day on the primary Web site alone. As of September 19, 2001, there are more than 16 000 structures in the PDB. The demographics of the current holdings are shown at http://www.rcsb.org/pdb/holdings.html.
Although we have continued to improve our query capabilities, the lack of uniform data, a result of the evolution of the format and content of the PDB over a 30-year period, necessarily limits our ability to provide reliable searches and the community's ability to perform quantitative science. In order to improve the querying capabilities of the PDB, we first addressed the uniformity of key data records for all entries in the PDB (3). Several records were targeted for this type of remediation: macromolecular names and synonyms, source organism, R-factor, resolution, enzyme names and classification, and primary citation. The results of record-wise uniformity processing are stored in the PDB relational database. This information is available as database output, in PDB reports and in Structure Explorer pages.
The practical effect of this type of data processing is that it makes queries on these records more reliable. For example, it is now possible to perform complex queries on enzymes and enzyme classes in a way that was not possible with the archival data. However, while this type of uniformity processing improves PDB queries, it does not produce improved files.
Changing an existing archive such as the PDB to comply with evolving data and nomenclature standards and imposing consistency constraints in data representation presents a variety of problems. Such changes, no matter how well intended, may corrupt references to a wide variety of published and Web-accessible work. However, from the perspective of a new PDB user who is not connected with the archive's rich history, the state of the PDB with respect to uniformity is difficult to reconcile. To participate in current and future scientific challenges, the PDB must advance to a level of data quality that facilitates systematic archive-wide analyses and integration with other biological and structural databases. The path to attain this level of data quality and maintain historical continuity is a difficult one with many trade-offs. We describe here how we have addressed this difficult issue and what we have done to create a new set of uniform PDB entries.
| DATA PROCESSING OF NEW ENTRIES |
|---|
In order to provide the community with high quality data, the RCSB has developed a number of tools that support the deposition and processing of X-ray and NMR structures. For depositing structures, the AutoDep Input Tool (ADIT; http://deposit.pdb.org/adit/), an integrated Web-based user interface, takes data from an uploaded file and presents an editor for modifying and adding to an entry. Data processing involves checking various aspects of the structure and the data collected through ADIT. Deposited information is converted to mmCIF representation and is subsequently processed by PDB validation programs. Over the years, many programs and procedures have been developed to diagnose the errors in PDB files (4–7). These programs have allowed authors to detect errors and correct them prior to deposition in the PDB. The PDB has incorporated many of these methods and has developed a series of procedures to review and validate structures.
A skilled annotator reviews the output of these validation checks. A distilled summary report of the validation diagnostics is forwarded to depositors. Since the author is the most knowledgeable about his/her own structure, the PDB collaborates with the author to help ensure that the structure that is ultimately released to the public is the best possible representation of the results of the experiment.
The PDB validation report summarizes results of the following checks: stereochemistry, close contacts in asymmetric unit and unit cell, occupancy, sequence in PDB SEQRES records and coordinates, distant waters, experimental data [SFCHECK (8)], comparison with standard values (9–11).
This validation software allows the user to check the format of coordinate and structure factor files and to perform a variety of validation tests on the structure prior to deposition in the PDB. These checks can be done independently by the user via the Validation Server on the Web at http://deposit.pdb.org/validate/ or by downloading the software from http://deposit.pdb.org/software/. The format precheck and validation steps are also optional steps of the ADIT deposition process.
| REPROCESSING OF LEGACY FILES |
|---|
In addressing uniformity issues with the legacy data, we have focused on formatting, nomenclature and sequence–structure consistency. The decision to concentrate on these aspects is the result of many discussions with the community of PDB users. These modifications have been made very conservatively and do not change the coordinates of the structural model. To do this we have applied the software developed for primary data processing to the validation and standardization of the 8368 data files released into the archive prior to October 1998. The gain in efficiency in data processing has been largely transferable to the task of reprocessing legacy files. The combination of improved software and the experience gained from 3 years of primary processing has made it possible to attempt a more automated remediation of the legacy data in a batch mode. The classes of errors that have now been remediated are described in the following sections.
Sequence representation
The complete polymer sequence for the macromolecule under study is encoded in PDB SEQRES records as a list of three-letter residue codes. These records are intended to describe the full polymer sequence for the macromolecule or domain for which coordinates are deposited.
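As an illustration of how SEQRES records encode a sequence, the following minimal Python sketch collects the three-letter codes per chain. It assumes the fixed-column PDB layout (chain identifier in column 12, residue names in columns 20 to 70); the function name `parse_seqres` and the example records are hypothetical and are not part of any PDB software.

```python
from collections import defaultdict


def parse_seqres(lines):
    """Collect three-letter residue codes per chain from SEQRES records."""
    chains = defaultdict(list)
    for line in lines:
        # Skip other record types and lines too short to hold a residue name.
        if not line.startswith("SEQRES") or len(line) < 20:
            continue
        chain_id = line[11]  # column 12: chain identifier
        # Columns 20-70 hold up to 13 residue names separated by spaces.
        chains[chain_id].extend(line[19:70].split())
    return dict(chains)


# Hypothetical records built with the fixed-column layout described above.
records = [
    "SEQRES {:>3} {} {:>4}  {}".format(1, "A", 15, " ".join(["GLY"] * 13)),
    "SEQRES {:>3} {} {:>4}  {}".format(2, "A", 15, "ALA SER"),
    "SEQRES {:>3} {} {:>4}  {}".format(1, "B", 2, "MET LYS"),
]
sequences = parse_seqres(records)
print({chain: len(codes) for chain, codes in sequences.items()})
```

Continuation records for the same chain are simply appended in file order, so the result is the full sequence as declared by the depositor, whether or not coordinates exist for every residue.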
In comparing the legacy sequence data with data from sequence databases (12,13), cases were found in which the legacy sequence was incorrect. In most cases, these sequence errors reflect gaps in the model sequence or incompletely modeled residues where residues or side-chains were not experimentally observed. In all of these cases, the sequences were updated with the correct or missing residues. In some instances, two PDB chains were used to represent a single polymer with a residue gap. These sequences were consolidated into single PDB chains.
Sequence/coordinate mismatches
Sequence information can also be derived from PDB coordinate records. Since coordinate data may not be deposited for all of the residues in a structure, the PDB SEQRES records are provided to define the full chemical sequence. Even though the coordinate records may not provide complete sequence information, the sequence information in the SEQRES and coordinate records should be consistent. We found 90 cases in the legacy data in which the SEQRES and coordinate records did not correspond. The majority of these inconsistencies result from labeling residues with missing side chains as alanines in the coordinate records. Only four of the 90 cases could not be reconciled on the basis of a missing side chain.
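The kind of comparison described above can be sketched as follows. This is only an illustration: it assumes that each observed residue has already been assigned to its position in the SEQRES sequence (in practice this requires an alignment), and the function name `classify_mismatches` and the category labels are hypothetical.

```python
def classify_mismatches(seqres, observed):
    """Compare SEQRES residue names with names used in coordinate records.

    seqres   -- list of three-letter codes for the full polymer sequence
    observed -- dict mapping a 0-based SEQRES position to the residue name
                found in the coordinate records; unobserved positions are
                absent from the dict and are not treated as errors
    """
    report = []
    for pos, coord_name in sorted(observed.items()):
        expected = seqres[pos]
        if coord_name == expected:
            continue
        if coord_name == "ALA":
            # Typical legacy case: the side chain was not modeled and the
            # residue was labeled as alanine in the coordinates.
            kind = "possible truncated side chain"
        else:
            kind = "unreconciled"
        report.append((pos, expected, coord_name, kind))
    return report


# Hypothetical four-residue chain with residue 3 unobserved.
seqres = ["MET", "LYS", "GLY", "SER"]
observed = {0: "MET", 1: "ALA", 3: "THR"}
for row in classify_mismatches(seqres, observed):
    print(row)
```

A mismatch flagged as a possible truncated side chain would still need its atom list checked before the coordinate label is changed to the SEQRES name.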
Atom and ligand nomenclature
The most common problems found in the legacy data are related to the labeling of atoms and ligands. Atom nomenclature problems were found in 3311 (40%) of the legacy files. The labeling of terminal atoms was found to be the most common nomenclature error. Atoms adjacent to a gap of unobserved residues in continuous sequence were most commonly mislabeled as terminal atoms. All errors of this type were automatically corrected.
Labeling of ligand atoms and residues was the second most common nomenclature problem. Ligand atom names were standardized in software to the nomenclature used in the PDB ligand dictionary. This was accomplished by topology matching against the chemical descriptions in the dictionary. New ligand descriptions were created and added to the dictionary where necessary.
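Topology matching of this kind can be illustrated with a small backtracking search that maps the atoms of a deposited ligand onto a dictionary description by element type and bonding pattern. This is a generic graph-matching sketch under simplifying assumptions (bond orders and stereochemistry are ignored), not the procedure used by the PDB software; all names and the example ligand are hypothetical.

```python
def match_atom_names(model_elems, model_bonds, ref_elems, ref_bonds):
    """Map model atom names to reference names by graph matching.

    *_elems -- dict: atom name -> element symbol
    *_bonds -- iterable of (name, name) pairs
    Returns a dict model name -> reference name, or None if no match exists.
    """
    def adjacency(elems, bonds):
        adj = {name: set() for name in elems}
        for a, b in bonds:
            adj[a].add(b)
            adj[b].add(a)
        return adj

    if len(model_elems) != len(ref_elems):
        return None
    m_adj = adjacency(model_elems, model_bonds)
    r_adj = adjacency(ref_elems, ref_bonds)
    order = sorted(model_elems)
    mapping, used = {}, set()

    def extend(i):
        if i == len(order):
            return True
        atom = order[i]
        for cand in sorted(ref_elems):
            if cand in used or ref_elems[cand] != model_elems[atom]:
                continue
            if len(r_adj[cand]) != len(m_adj[atom]):
                continue
            # Bonds (and non-bonds) to already mapped atoms must agree.
            if all((other in m_adj[atom]) == (mapping[other] in r_adj[cand])
                   for other in mapping):
                mapping[atom] = cand
                used.add(cand)
                if extend(i + 1):
                    return True
                del mapping[atom]
                used.discard(cand)
        return False

    return dict(mapping) if extend(0) else None


# Hypothetical three-atom fragment with author-chosen names X1..X3.
model = ({"X1": "C", "X2": "O", "X3": "N"}, [("X1", "X2"), ("X1", "X3")])
ref = ({"C1": "C", "O1": "O", "N1": "N"}, [("C1", "O1"), ("C1", "N1")])
print(match_atom_names(model[0], model[1], ref[0], ref[1]))
```

When a ligand contains topologically equivalent atoms, the search returns the first consistent assignment it finds, so a real implementation would need an additional rule to choose among equivalent labelings.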
Another common nomenclature problem arises from the duplication of atom labels. Redundant atom labels were found in 636 legacy data files. This was most commonly the result of the mislabeling of alternate conformations. In a small number of cases, identical coordinate records were duplicated. All instances of duplicated atom records were resolved.
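A check for redundant atom labels can be sketched in a few lines of Python. The sketch assumes the fixed-column PDB coordinate record layout (atom name in columns 13 to 16, alternate location indicator in column 17, residue name in columns 18 to 20, chain in column 22, residue number in columns 23 to 26, insertion code in column 27) and a single model per file; the function name and example records are hypothetical.

```python
from collections import Counter


def find_duplicate_atoms(lines):
    """Return atom labels that occur more than once in ATOM/HETATM records."""
    keys = []
    for line in lines:
        if not line.startswith(("ATOM  ", "HETATM")) or len(line) < 27:
            continue
        keys.append((
            line[21],     # chain identifier (column 22)
            line[22:26],  # residue sequence number (columns 23-26)
            line[26],     # insertion code (column 27)
            line[17:20],  # residue name (columns 18-20)
            line[12:16],  # atom name (columns 13-16)
            line[16],     # alternate location indicator (column 17)
        ))
    counts = Counter(keys)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical records: serial, atom name, altLoc, residue, chain, number.
fmt = "ATOM  {:>5} {:<4}{}{:>3} {}{:>4} "
atoms = [
    fmt.format(1, " CA ", " ", "ALA", "A", 1),
    fmt.format(2, " CA ", " ", "ALA", "A", 1),  # same label as the line above
    fmt.format(3, " CB ", "A", "SER", "A", 2),
    fmt.format(4, " CB ", "B", "SER", "A", 2),  # distinct altLoc, not a duplicate
]
duplicates = find_duplicate_atoms(atoms)
print(duplicates)
```

Because the alternate location indicator is part of the key, correctly labeled alternate conformations are not reported, whereas conformations that share a blank or repeated indicator are.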
Stereochemical labeling
Perhaps the most serious class of errors in atom nomenclature is that related to stereochemistry. Errors in chirality were found in 549 legacy files. Only 255 of these cases could be resolved as errors in atom labeling; the remainder represents exceptions to current stereochemical conventions.
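One common way to test the handedness of a chiral center is the signed volume spanned by three of its substituents. The sketch below is a generic illustration rather than the PDB validation code; which sign corresponds to which stereochemical label depends on the order in which the substituents are supplied, so the expected sign is treated here as an input. The function names and coordinates are hypothetical.

```python
def chiral_volume(center, a, b, c):
    """Signed volume (a - center) . ((b - center) x (c - center))."""
    ax, ay, az = (a[i] - center[i] for i in range(3))
    bx, by, bz = (b[i] - center[i] for i in range(3))
    cx, cy, cz = (c[i] - center[i] for i in range(3))
    return (ax * (by * cz - bz * cy)
            + ay * (bz * cx - bx * cz)
            + az * (bx * cy - by * cx))


def handedness_matches(center, a, b, c, expected_sign):
    """True if the sign of the chiral volume equals expected_sign (+1 or -1)."""
    return chiral_volume(center, a, b, c) * expected_sign > 0


# Hypothetical coordinates: three substituents along the coordinate axes.
origin = (0.0, 0.0, 0.0)
s1, s2, s3 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
print(chiral_volume(origin, s1, s2, s3))  # positive for this ordering
print(chiral_volume(origin, s1, s3, s2))  # sign flips when two labels swap
```

Exchanging the labels of two substituents reverses the sign without moving any atom, which is why a chirality error can sometimes be resolved by relabeling alone, whereas a wrong sign that survives relabeling points to the geometry itself.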
| REVISITING RECENTLY PROCESSED ENTRIES |
|---|
We also reviewed all the data that had been processed by the RCSB since October 1998. The re-validation of the 3150 files that we processed prior to January 2000 showed five entries with conflicts in sequence information between SEQRES and coordinate records, 162 errors in atom and ligand nomenclature, 19 duplicated atom labels, three errors in stereochemical labeling and 30 terminal atom labeling errors. The largest number of errors was related to ligand atom nomenclature. These resulted from changes in our ligand dictionary, which underwent significant correction and development during 1999. Any remaining errors were undetected by our software or were omissions in our annotation procedures during this period.
The results for files processed after January 2000 show further improvement. In this group of 3569 files, we found 31 errors in atom and ligand nomenclature, one duplicated atom label, three errors in stereochemical labeling and two terminal atom labeling errors. No sequence inconsistencies were detected. All of these errors were corrected and these corrections are described within each entry in the PDB revision records.
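The improvement between the two periods can be made concrete with a short calculation. The counts are those given above; the calculation assumes that the error categories do not overlap, which the text does not state explicitly, and it counts errors rather than affected files.

```python
# Error counts in the order: sequence conflicts, atom and ligand nomenclature,
# duplicated atom labels, stereochemical labeling, terminal atom labeling.
cohorts = {
    "October 1998 to January 2000": {"files": 3150, "errors": [5, 162, 19, 3, 30]},
    "after January 2000": {"files": 3569, "errors": [0, 31, 1, 3, 2]},
}

# Errors per file, assuming the categories are disjoint.
rates = {
    name: sum(c["errors"]) / c["files"] for name, c in cohorts.items()
}

for name, rate in rates.items():
    print(f"{name}: {100 * rate:.2f} errors per 100 files")
```

Under this assumption the rate falls from roughly 7 errors per 100 files to roughly 1 error per 100 files.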
| INTEGRATION AND DELIVERY OF UNIFIED DATA |
|---|
The final step in this process was the integration of the results of record by record processing with the batch processing of all the entries in the archive. The results of the uniformity and data integration project are being delivered as a collection of mmCIF data files. The data items within these files are described in the PDB exchange data dictionary. This dictionary, which includes the data items in the standard mmCIF dictionary along with PDB extensions, is available at the PDB mmCIF Resource site (http://deposit.pdb.org/mmcif/).
The mmCIF data files are available on the PDB beta-ftp site for all legacy data and for current files deposited and processed with the PDB (ftp://beta.rcsb.org/pub/pdb/uniformity/data/mmCIF/). The beta release of these data files is to allow users to evaluate and comment. PDB will continue to correct and improve the uniformity of these data in response to user input.
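For readers unfamiliar with mmCIF, the following sketch reads the simplest kind of data item, a name and a single value on one line. It is deliberately incomplete: `loop_` tables and multi-line text values are skipped, and quoting is handled with Python's `shlex` module as an approximation of the mmCIF rules. The function name and the example data block are hypothetical.

```python
import shlex


def read_simple_items(text):
    """Return single-line '_category.item value' pairs from mmCIF text."""
    items = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("_"):
            continue  # data_ headers, loop_ keywords and loop rows
        try:
            tokens = shlex.split(line)
        except ValueError:
            continue  # unbalanced quote; outside the scope of this sketch
        # Item names inside a loop_ header stand alone and are ignored.
        if len(tokens) == 2:
            items[tokens[0]] = tokens[1]
    return items


example_text = """\
data_EXAMPLE
_entry.id   EXAMPLE
_struct.title 'Hypothetical example entry'
loop_
_atom_site.id
_atom_site.type_symbol
1 N
2 C
"""
print(read_simple_items(example_text))
```

Each item name pairs a category with an attribute, which is what allows the exchange dictionary to define the type and meaning of every value in a file.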
| SUPPORT FOR THE PDB AND OTHER FORMATS |
|---|
In recognition that many software applications require the PDB format, we have provided a software tool (CIFTr) that translates mmCIF files into PDB format. The tool provides options that permit users to select their particular nomenclature preference. For instance, it is possible to select between the nomenclature used when the file was originally released and the nomenclature resulting from uniformity processing. In the future, CIFTr will provide translation to other file formats such as XML.
CIFTr is available for download from http://deposit.pdb.org/software/ for SGI, Linux, Alpha and SUN platforms.
| THE FUTURE |
|---|
With the mmCIF files from the data uniformity project as a base, we will now examine the data items that were not included in our initial uniformity project. In particular we will examine the details of experimental data collection and refinement that are currently embedded in unstructured REMARK records in the older PDB files. As much as possible we will attempt to extract information from the text of these remarks and populate the corresponding mmCIF data items.
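Extraction of this kind can be illustrated with the resolution statement in REMARK 2, which in its standard form reads `RESOLUTION. 2.10 ANGSTROMS.` The sketch below assumes that standard wording; legacy remarks that use free text would need additional patterns, which is the difficulty described above. The function name is hypothetical, and the mapping of the value to an mmCIF item such as `_refine.ls_d_res_high` is an assumption made for illustration.

```python
import re

# Standard REMARK 2 wording; free-text variants are not covered.
RESOLUTION = re.compile(
    r"^REMARK\s+2\s+RESOLUTION\.\s+(\d+(?:\.\d+)?)\s+ANGSTROMS"
)


def extract_resolution(lines):
    """Return the resolution in angstroms from REMARK 2, or None."""
    for line in lines:
        match = RESOLUTION.match(line)
        if match:
            return float(match.group(1))
    return None


header = [
    "REMARK   2",
    "REMARK   2 RESOLUTION.    2.10 ANGSTROMS.",
]
print(extract_resolution(header))
```

Returning `None` when no pattern matches keeps unparsed entries distinguishable from parsed ones, so they can be routed to manual annotation rather than silently assigned a value.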
We are now redesigning the underlying PDB core relational database to take advantage of the new uniform and self-consistent data files. Owing to the greater internal consistency within the mmCIF datasets, the new database implementation will provide the ability to construct queries that span the range of structural detail from biological assembly to individual atoms.
Questions and comments about the PDB should be sent to info@rcsb.org.
| ACKNOWLEDGEMENTS |
|---|
The PDB team consists of the authors listed and Peter Arzberger, Bryan Banister, Tammy Battistuz, Kyle Burkhardt, Li Chen, Victoria Colflesh, Nita Deshpande, Phoebe Fagan, Ward Fleri, Michael Gribskov, Diane Hancock, Lisa Iype, Brad Kroeger, Jessica Marvin, David Padilla, Gnanesh Patel, Bohdan Schneider, Thomas Solomon, Lynn Ten Eyck, Michael Tung, Rosalina Valera and Christine Zardecki. We acknowledge the work of Dietmar Schomburg (Institute of Biochemistry at the University of Cologne), project leader of the BRENDA database, for enzyme classification remediation, the comments and corrections that have been contributed by colleagues at the NCBI, the EBI, the CCDC and the many users of the PDB. We also acknowledge the support of Angela Loh and Compaq Computer Corporation for the generous gift of hardware that has helped to support the computational requirements of this project. This work is supported by grants from the National Science Foundation, the Office of Biological and Environmental Research at the Department of Energy, and two units of the National Institutes of Health: the National Institute of General Medical Sciences and the National Library of Medicine.
| FOOTNOTES |
|---|
* To whom correspondence should be addressed. Tel: +1 732 445 4667; Fax: +1 732 445 4320; Email: berman@rcsb.rutgers.edu
| REFERENCES |
|---|
1 Berman,H.M., Westbrook,J., Feng,Z., Gilliland,G., Bhat,T.N., Weissig,H., Shindyalov,I.N. and Bourne,P.E. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.
2 Bernstein,F.C., Koetzle,T.F., Williams,G.J., Meyer,E.E., Brice,M.D., Rodgers,J.R., Kennard,O., Shimanouchi,T. and Tasumi,M. (1977) Protein Data Bank: a computer-based archival file for macromolecular structures. J. Mol. Biol., 112, 535–542.
3 Bhat,T.N., Bourne,P., Feng,Z., Gilliland,G., Jain,S., Ravichandran,V., Schneider,B., Schneider,K., Thanki,N., Weissig,H., Westbrook,J. and Berman,H.M. (2001) The PDB data uniformity project. Nucleic Acids Res., 29, 214–218.
4 Laskowski,R.A., MacArthur,M.W., Moss,D.S. and Thornton,J.M. (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Cryst., 26, 283–291.
5 Laskowski,R.A., Rullmann,J.A., MacArthur,M.W., Kaptein,R. and Thornton,J.M. (1996) AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. J. Biomol. NMR, 8, 477–486.
6 Hooft,R.W., Sander,C. and Vriend,G. (1996) Verification of protein structures: side-chain planarity. J. Appl. Crystallogr., 29, 714–716.
7 Hooft,R.W., Vriend,G., Sander,C. and Abola,E.E. (1996) Errors in protein structures. Nature, 381, 272.
8 Vaguine,A.A., Richelle,J. and Wodak,S.J. (1999) SFCHECK: a unified set of procedures for evaluating the quality of macromolecular structure-factor data and their agreement with the atomic model. Acta Crystallogr., D55, 191–205.
9 Engh,R.A. and Huber,R. (1991) Accurate bond and angle parameters for X-ray protein structure refinement. Acta Crystallogr., A47, 392–400.
10 Gelbin,A., Schneider,B., Clowney,L., Hsieh,S.-H., Olson,W.K. and Berman,H.M. (1996) Geometric parameters in nucleic acids: sugar and phosphate constituents. J. Am. Chem. Soc., 118, 519–528.
11 Clowney,L., Jain,S.C., Srinivasan,A.R., Westbrook,J., Olson,W.K. and Berman,H.M. (1996) Geometric parameters in nucleic acids: nitrogenous bases. J. Am. Chem. Soc., 118, 509–518.
12 Bairoch,A. and Boeckmann,B. (1994) The SWISS-PROT protein sequence databank: current status. Nucleic Acids Res., 22, 3578–3580.
13 Benson,D.A., Karsch-Mizrachi,I., Lipman,D.J., Ostell,J., Rapp,B.A. and Wheeler,D.L. (2000) GenBank. Nucleic Acids Res., 28, 15–18. Updated article in this issue: Nucleic Acids Res. (2002), 30, 17–20.