
Nucleic Acids Research 2005 33(Database Issue):D247-D251; doi:10.1093/nar/gki024

Nucleic Acids Research, 2005, Vol. 33, Database issue D247-D251
© 2005, the authors
Nucleic Acids Research, Vol. 33, Database issue © Oxford University Press 2005; all rights reserved

The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis

Frances Pearl, Annabel Todd, Ian Sillitoe, Mark Dibley, Oliver Redfern, Tony Lewis, Christopher Bennett, Russell Marsden, Alistair Grant, David Lee*, Adrian Akpor, Michael Maibaum, Andrew Harrison, Timothy Dallman, Gabrielle Reeves, Ilhem Diboun, Sarah Addou, Stefano Lise, Caroline Johnston, Antonio Sillero, Janet Thornton1 and Christine Orengo

Biochemistry and Molecular Biology Department, University College London, University of London, Gower Street, London WC1E 6BT, UK and 1 EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK

* To whom correspondence should be addressed. Tel: +44 20 7679 3890; Fax: +44 20 7679 7193; Email: dlee@biochem.ucl.ac.uk
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors

Received September 15, 2004; Revised and Accepted September 21, 2004


    ABSTRACT
 
The CATH database of protein domain structures (http://www.biochem.ucl.ac.uk/bsm/cath/) currently contains 43 229 domains classified into 1467 superfamilies and 5107 sequence families. Each structural family is expanded with sequence relatives from GenBank and completed genomes, using a variety of efficient sequence search protocols and reliable thresholds. This extended CATH protein family database contains 616 470 domain sequences classified into 23 876 sequence families. This results in a significant expansion of the CATH HMM model library to include models built from the CATH sequence relatives, giving a 10% increase in coverage for detecting remote homologues. An improved Dictionary of Homologous Superfamilies (DHS) (http://www.biochem.ucl.ac.uk/bsm/dhs/), containing specific sequence, structural and functional information for each superfamily in CATH, considerably assists manual validation of homologues. Information on sequence relatives in CATH superfamilies, GenBank and completed genomes is presented in the CATH-associated DHS and Gene3D resources. Domain partnership information can be obtained from Gene3D (http://www.biochem.ucl.ac.uk/bsm/cath/Gene3D/). A new CATH server has been implemented (http://www.biochem.ucl.ac.uk/cgi-bin/cath/CathServer.pl), providing automatic classification of newly determined sequences and structures using a suite of rapid sequence and structure comparison methods. The statistical significance of matches is assessed and links are provided to the putative superfamily or fold group to which the query sequence or structure is assigned.


    DESCRIPTION OF THE CATH HIERARCHY AND CURRENT POPULATION STATISTICS
 
The CATH database is a hierarchical classification of domains into sequence- and structure-based families and fold groups. Table 1 shows the population of the latest release of CATH (Version 2.5.1, released January 2004). In the lowest level of the hierarchy, sequences are clustered according to significant sequence similarity (35% identity and above, the S-Level). At higher levels, domains are grouped according to whether they share significant sequence, structural and/or functional similarity (homologous superfamilies, the H-Level) or just structural similarity (fold or topology group, the T-Level). Fold groups sharing similar architectures, i.e. similarities in the arrangements of their secondary structures regardless of connectivity, are then merged into common architectures (the A-Level). At the top of the hierarchy, domains are clustered depending on their class, i.e. the percentage of α-helices or β-strands (the C-Level).
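
The nesting of the C-, A-, T- and H-Levels described above can be sketched in a few lines of Python. The dotted "C.A.T.H" string form, the CathNode class and the two identifiers in the usage note are assumptions made for illustration; the article itself does not specify a node notation, and the S-Level is left out for brevity.

```python
from dataclasses import dataclass

# Level names follow the paragraph above, ordered from the top of the hierarchy down.
# The dotted string form parsed below is an assumption for illustration.
LEVELS = ("class", "architecture", "topology", "homologous_superfamily")


@dataclass(frozen=True)
class CathNode:
    """A position in the hierarchy, given as one integer per level."""
    numbers: tuple

    @classmethod
    def parse(cls, code):
        parts = code.split(".")
        if not 1 <= len(parts) <= len(LEVELS):
            raise ValueError(f"expected 1 to 4 dot-separated integers: {code!r}")
        return cls(tuple(int(part) for part in parts))

    @property
    def level(self):
        """Name of the deepest level this node specifies."""
        return LEVELS[len(self.numbers) - 1]

    def parent(self):
        """The node one level up, or None at the class level."""
        return CathNode(self.numbers[:-1]) if len(self.numbers) > 1 else None

    def shares(self, other, level):
        """True if both nodes lie in the same group at the named level."""
        depth = LEVELS.index(level) + 1
        return (
            len(self.numbers) >= depth
            and len(other.numbers) >= depth
            and self.numbers[:depth] == other.numbers[:depth]
        )
```

For example, CathNode.parse("3.40.50.300") and CathNode.parse("3.40.50.720") share a topology (fold group) under this scheme but not a homologous superfamily.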


Table 1. Populations of the different levels in the CATH hierarchy

 

    IMPROVED CLASSIFICATION PROTOCOLS
 
Below we describe some new CATH associated resources and protocols that increase the speed and reliability of classifying newly determined protein structures in the CATH database.

Validation of homologues using the CATH Dictionary of Homologous Superfamilies (DHS)
The CATH-associated Dictionary of Homologous Superfamilies (DHS) (http://www.biochem.ucl.ac.uk/bsm/dhs/) was established in 1997 (1) and contains a variety of sequence, structural and functional information for each superfamily in CATH. It was updated recently for CATH version 2.5.1, which contains 1467 homologous superfamilies, 334 of which are populated with three or more remote homologues (<35% sequence identity). The DHS contains information on the pairwise sequence and structural similarities for all pairs of relatives in each superfamily. Sequence similarity is recorded by sequence identity and E-value. Structural similarity is recorded by pairwise SSAP score (2) and also by E-values determined against a distribution of scores obtained by comparing all non-redundant structures with each other.
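
One simple way to turn a raw similarity score into an E-value against a background distribution is sketched below. The text does not state which estimator the DHS uses, so the empirical tail-fraction estimate, the pseudocount and the function name empirical_evalue are all assumptions for illustration.

```python
from bisect import bisect_left


def empirical_evalue(score, background_scores, n_comparisons):
    """Estimate an E-value for `score` from an empirical background.

    background_scores: scores from comparing unrelated (non-redundant) structures.
    n_comparisons: number of comparisons made in the search being assessed.
    """
    ranked = sorted(background_scores)
    # Number of background scores at least as high as the observed score.
    n_at_least = len(ranked) - bisect_left(ranked, score)
    # Add-one pseudocount (an assumption) so the tail probability is never zero.
    tail_probability = (n_at_least + 1) / (len(ranked) + 1)
    # Expected number of background scores this high over the whole search.
    return tail_probability * n_comparisons
```

A fitted distribution would usually replace the raw tail fraction when the observed score lies beyond the sampled background, where the empirical estimate has little resolution.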

Multiple structure alignments are derived for structurally coherent subgroups of relatives, having a pairwise SSAP score of >85 against all relatives in the subgroup. These are generated using the CORA algorithm (3) and displayed using CORAplot (3). The current DHS contains 671 structural alignments from 416 superfamilies. Highly conserved sequence positions, which may be associated with functionally important sites, are highlighted.
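
The subgroup criterion above (every pair within a subgroup scoring above 85) can be illustrated with a greedy grouping. This is a minimal sketch: the article does not say how subgroups are actually assembled, and the function coherent_subgroups and its ssap_score callable are hypothetical names.

```python
def coherent_subgroups(domains, ssap_score, threshold=85.0):
    """Greedily partition domains so that all pairs in a group score above threshold.

    ssap_score: callable taking two domains and returning their pairwise score.
    The result depends on input order; this is a sketch, not an optimal clustering.
    """
    groups = []
    for domain in domains:
        for group in groups:
            # Join the first group in which the domain matches every member.
            if all(ssap_score(domain, member) > threshold for member in group):
                group.append(domain)
                break
        else:
            # No existing group is fully compatible, so start a new one.
            groups.append([domain])
    return groups
```

Requiring a match against every member, rather than any member, is what keeps each subgroup structurally coherent enough for a single multiple alignment.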

Two new methods have been devised to illustrate the degree of structural divergence across the superfamily. Both exploit a multiple structure alignment to identify equivalent secondary structures across the superfamily and inserted secondary structures. Plots give information on highly conserved secondary structures that are diagnostic for the particular superfamily and on the degree of structural embellishment occurring in diverse relatives. Putative homologues to a particular CATH superfamily can be aligned against structural relatives in order to determine whether their structural characteristics fall within the range of structural diversity observed across the superfamily. Information on the population of the superfamily is also provided so that users can gauge how well the superfamily has been sampled to date.

Functional annotations are also provided for each superfamily in the DHS by recruiting relevant functional data from the Protein Data Bank (PDB) (4), GenBank (5), ENZYME (6), KEGG (7) and Gene Ontology (8) databases. The more than 10-fold expansion in the extended CATH database (from 43 229 CATH structural domain sequences to 616 470 by including related GenBank sequences and genome sequences) has significantly increased the amount of functional data available for a particular superfamily.

Expansion in the functional information together with more informative descriptions of structural variability in each CATH superfamily considerably assists in validating new homologues classified in CATH. Furthermore, links to the DHS are provided for structural matches identified using the CATH server.

Improved detection of remote homologues using an extended CATH-HMM model library
Profile-based methods for sequence comparison were developed in the early 1980s and allowed recognition of more distant homologues than pairwise approaches (9). Benchmarking of several publicly available methods, including those using position-specific scoring matrices and hidden Markov models (HMMs), has been undertaken by several groups (10,11). These approaches used datasets of distant homologues selected from structural classifications, such as SCOP and CATH, to determine the sensitivity of various profile-based methods, e.g. HMMs (12) and PSI-BLAST (13).

We recently used a dataset of remote structural homologues from the CATH database (<35% sequence identity), which had been validated by structure comparison and manual inspection, to assess the performance of several HMM-based strategies (Strategies for Improved Fold and Superfamily Recognition in Genome Annotation; I. Sillitoe, personal communication). HMMs were built using the SAM-T technology developed by Karplus et al. (14). A total of 23 876 HMM models were built for representative sequences from each sequence family in the extended CATH database (containing 616 470 domain sequences). The extended model library gives a 10% increase in coverage for remote homologue detection compared with the standard CATH HMM model library, with a low error rate (0.1%) (I. Sillitoe, personal communication).
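
Coverage and error rate at a given E-value threshold can be computed from a benchmark of validated pairs roughly as follows. The definitions used here (coverage as the fraction of true homologue pairs recovered, error rate as the fraction of accepted matches that are false) are assumptions, since the benchmark protocol is cited only as a personal communication; coverage_and_error is a hypothetical helper.

```python
def coverage_and_error(hits, n_true_pairs, evalue_threshold):
    """Summarise benchmark performance at one E-value cut-off.

    hits: iterable of (evalue, is_true_homologue) for every reported match.
    n_true_pairs: total number of validated homologue pairs in the benchmark.
    """
    # Keep only matches at least as significant as the threshold.
    accepted = [is_true for evalue, is_true in hits if evalue <= evalue_threshold]
    true_positives = sum(accepted)
    false_positives = len(accepted) - true_positives
    coverage = true_positives / n_true_pairs if n_true_pairs else 0.0
    error_rate = false_positives / len(accepted) if accepted else 0.0
    return coverage, error_rate
```

Sweeping the threshold and keeping the loosest value whose error rate stays below a target is one way a predetermined cut-off could be chosen from such a benchmark.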

It can be seen from Figure 1 that on average, nearly 87% of homologues classified in CATH over the last two years could be recognized using sequence comparison methods, both pairwise sequence alignment and scans against the more sensitive extended CATH-HMM model library.



Figure 1. The proportion (%) of structures from the PDB that have been classified in CATH over the last two years using different sequence comparison or structure comparison methods. Blue segment: PDB sequences with 95% sequence identity or more to existing CATH domains, recognized using SSEARCH. Magenta segment: PDB sequences with 30% sequence identity or more to existing CATH domains, recognized using SSEARCH. Yellow segment: PDB entries that can be assigned to existing CATH superfamilies by scanning the HMM library. Green segment: PDB entries that can be assigned to CATH superfamilies by structure comparisons against CATH representatives using SSAP. Purple segment: PDB entries that can be assigned to CATH fold groups by structure comparisons against CATH representatives using SSAP. Orange segment: PDB entries that do not match any CATH structure and represent novel folds.

 
Expansion of CATH with sequence relatives from completed genomes and domain partnership information
We have recently devised protocols for identifying sequence relatives to CATH superfamilies in completed genomes (15). To date, nearly one million sequences from 150 completed genomes have been scanned against the CATH-HMM model library (15). Between 40 and 60% of sequences or partial sequences from each genome could be assigned to a CATH superfamily. Genome sequences were also scanned against libraries of HMM models from the Pfam database (release 10) (16) in order to extend the domain annotation of each genome sequence and provide more comprehensive information on domain partnerships.

Sequence relatives to CATH superfamilies identified in this way are displayed in the CATH-related DHS and Gene3D resources. Gene3D displays the domain composition of each gene annotated by CATH and Pfam domains. CATH family data in the Gene3D resource have revealed some intriguing insights into the expansion of superfamilies involved in metabolism and regulation in bacterial genomes (17).
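
Given per-gene domain annotations of the kind described above, domain partnerships can be tallied by counting how often two families occur in the same gene. This is an illustrative sketch only: the input layout, the function domain_partnerships and the family identifiers used in any example are hypothetical and do not reflect the Gene3D schema.

```python
from collections import Counter
from itertools import combinations


def domain_partnerships(gene_domains):
    """Count genes in which each unordered pair of domain families co-occurs.

    gene_domains: mapping of gene identifier to the list of domain family
    identifiers (CATH or Pfam) assigned to that gene.
    """
    pairs = Counter()
    for families in gene_domains.values():
        # Each distinct pair is counted once per gene, regardless of repeats.
        for first, second in combinations(sorted(set(families)), 2):
            pairs[(first, second)] += 1
    return pairs
```

Sorting the families before pairing gives each partnership a single canonical key, so (A, B) and (B, A) are never counted separately.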

Figure 2 shows that the power-law like trends first detected in the structural classifications are mirrored when sequence relatives from the genomes are also included. Considering the structural data alone, it can be seen from Figure 2a that fewer than 10 of the most highly populated folds in the CATH database account for nearly 25% of all superfamilies in the PDB. These folds were previously described as superfolds as they are adopted by many diverse homologous superfamilies (18). When genome sequences are included it can be seen from Figure 2b that the same fold groups dominate the genomes, as they are adopted by nearly 45% of all close sequence families (relatives have 35% or more sequence identity), of known structure, in the genomes.
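
The proportions quoted above are shares of the total held by the most populated fold groups. A minimal sketch of that calculation follows; the function top_fold_share is a hypothetical helper, and any counts passed to it in an example are placeholders rather than CATH data.

```python
def top_fold_share(families_per_fold, n_top):
    """Fraction of all families that belong to the n_top most populated folds.

    families_per_fold: mapping of fold group identifier to the number of
    superfamilies (or close sequence families) adopting that fold.
    """
    counts = sorted(families_per_fold.values(), reverse=True)
    total = sum(counts)
    return sum(counts[:n_top]) / total if total else 0.0
```

Applied to superfamily counts per fold this corresponds to the kind of figure reported for Figure 2a, and applied to close sequence family counts per fold to that for Figure 2b.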



Figure 2. CATHerine wheels (a) illustrating the distribution of domain structures from the PDB among the different levels in the CATH hierarchy. The three classes are illustrated in colour, mainly α pink, mainly β yellow and α–β green. The inner wheel corresponds to different architectures in the classification and the outer wheel to different fold groups. Each fold group has been subdivided according to the numbers and populations of different homologous superfamilies adopting that fold. (b) Illustrating the distribution of CATH domains among the sequences from 150 completed genomes, in Gene3D. In this case, the fold groups labelled in the outer circle have been divided according to the number and size of close sequence families within each fold group.

 

    THE CATH SERVER
 
A new protocol has been developed for searching CATH with a newly determined protein structure. Structures submitted to the server (http://www.biochem.ucl.ac.uk/cgi-bin/cath/CathServer.pl) are first processed by the DDMake suite of programs that generate derived data from the PDB coordinate files (e.g. secondary structure data, residue accessibilities and φ/ψ data, sequence data in the FASTA format, etc.). The query sequence is scanned against the CATH-HMM model library to identify more remote homologues. Threshold E-values used to recognize homologues are predetermined by benchmarking with validated structural homologues from CATH (I. Sillitoe, personal communication).

If the sequence returns a significant match to any relative in one or more CATH superfamilies, representatives from all close sequence families within those superfamilies are structurally compared with the query structure using the SSAP structure alignment program (2). The top 10 structural matches, sorted in order of SSAP score, are then displayed together with information on the degree of sequence and structural similarity and with links to the CATH page and the DHS page for each CATH superfamily identified. Rasmol images are also provided for the top 10 matches.

Any query structure unmatched by the CATH-HMM library is scanned against a library of representative structures from each close sequence family in CATH using the rapid structure comparison algorithm, CATHEDRAL (19). CATHEDRAL uses a robust statistical framework based on the extreme value distributions observed for random similarities to assess significance. If the query structure significantly matches one or more CATH superfamilies, SSAP comparisons are performed for all sequence representatives in those superfamilies and the top 10 matches are displayed, as before.
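
The control flow of the server protocol described in this section can be summarised in a short sketch. The scanning and alignment steps are passed in as callables because the real programs are not reproduced here; classify_structure and all of its parameter names are hypothetical and do not correspond to the server's actual interface.

```python
def classify_structure(query, hmm_scan, cathedral_scan, representatives, ssap, top_n=10):
    """Sketch of the two-stage search: sequence scan first, structure scan as fallback.

    hmm_scan, cathedral_scan: callables returning the superfamilies that the
        query matches significantly (an empty list if none).
    representatives: callable returning the close sequence family
        representatives of a superfamily.
    ssap: callable returning a structural similarity score for (query, representative).
    """
    method = "CATH-HMM"
    superfamilies = hmm_scan(query)
    if not superfamilies:
        # No significant sequence match: fall back to rapid structure comparison.
        method = "CATHEDRAL"
        superfamilies = cathedral_scan(query)
    if not superfamilies:
        return "unmatched", []
    # Align the query against every representative of each matched superfamily.
    matches = [
        (ssap(query, rep), superfamily, rep)
        for superfamily in superfamilies
        for rep in representatives(superfamily)
    ]
    # Report the best-scoring structural matches first.
    matches.sort(key=lambda match: match[0], reverse=True)
    return method, matches[:top_n]
```

Running the cheaper sequence scan before the structure scan means the pairwise structural alignments are only performed against a shortlist of superfamilies.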


    ACKNOWLEDGEMENTS
 
F.P., I.S., M.D., A.G., T.L., A.A. and C.O. all acknowledge the Medical Research Council for their funding. A.T., D.L. and R.M. are currently supported by funding from the National Institutes of Health. G.R., O.R. and T.D. acknowledge support from the Biotechnology and Biological Sciences Research Council, and C.B. acknowledges support from the Wellcome Trust for the research described in this manuscript.


    Notes
 
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use permissions, please contact journals.permissions@oupjournals.org.


    REFERENCES
 

  1. Bray,J.E., Todd,A.E., Pearl,F.M., Thornton,J.M. and Orengo,C.A. (2000) The CATH Dictionary of Homologous Superfamilies (DHS): a consensus approach for identifying distant structural homologues. Protein Eng., 13, 153–165.

  2. Taylor,W. and Orengo,C. (1989) Protein structure alignment. J. Mol. Biol., 208, 1–22.

  3. Orengo,C. (1999) CORA—topological fingerprints for protein structural families. Protein Sci., 8, 699–715.

  4. Berman,H.M., Westbrook,J., Feng,Z., Gilliland,G., Bhat,T.N., Weissig,H., Shindyalov,I.N. and Bourne,P.E. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.

  5. Benson,D.A., Karsch-Mizrachi,I., Lipman,D.J., Ostell,J. and Wheeler,D.L. (2004) GenBank: update. Nucleic Acids Res., 32, 23–26.

  6. Bairoch,A. (2000) The ENZYME database in 2000. Nucleic Acids Res., 28, 304–305.

  7. Kanehisa,M. and Goto,S. (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res., 28, 27–30.

  8. Harris,M.A., Clark,J., Ireland,A., Lomax,J., Ashburner,M., Foulger,R., Eilbeck,K., Lewis,S., Marshall,B., Mungall,C. et al. (2004) Gene Ontology Consortium. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res., 32, D258–D261.

  9. Park,J., Karplus,K., Barrett,C., Hughey,R., Haussler,D., Hubbard,T. and Chothia,C. (1998) Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J. Mol. Biol., 284, 1201–1210.

  10. Gough,J., Karplus,K., Hughey,R. and Chothia,C. (2001) Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure. J. Mol. Biol., 313, 903–919.

  11. Pearl,F.M., Lee,D., Bray,J.E., Buchan,D.W., Shepherd,A.J. and Orengo,C.A. (2002) The CATH extended protein-family database: providing structural annotations for genome sequences. Protein Sci., 11, 233–244.

  12. Eddy,S.R. (1996) Hidden Markov models. Curr. Opin. Struct. Biol., 6, 361–365.

  13. Altschul,S.F., Madden,T.L., Schäffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.

  14. Karplus,K., Barrett,C. and Hughey,R. (1998) Hidden Markov models for detecting remote protein homologies. Bioinformatics, 14, 846–856.

  15. Lee,D., Grant,A., Marsden,R. and Orengo,C. (2004) Identification and distribution of protein families in 120 completed genomes using Gene3D. Proteins, in press.

  16. Bateman,A., Coin,L., Durbin,R., Finn,R.D., Hollich,V., Griffiths-Jones,S., Khanna,A., Marshall,M., Moxon,S., Sonnhammer,E.L.L. et al. (2004) The Pfam protein families database. Nucleic Acids Res., 32, D138–D141.

  17. Ranea,J.A., Buchan,D.W., Thornton,J.M. and Orengo,C.A. (2004) Evolution of protein families and bacterial genome size. J. Mol. Biol., 336, 871–887.

  18. Orengo,C.A., Jones,D.T. and Thornton,J.M. (1994) Protein superfamilies and domain superfolds. Nature, 372, 631–634.

  19. Harrison,A., Pearl,F., Sillitoe,I., Slidel,T., Mott,R., Thornton,J. and Orengo,C. (2003) Recognizing the fold of a protein structure. Bioinformatics, 19, 1748–1759.




