Nucleic Acids Research, 2003, Vol. 31, No. 1 452-455
© 2003 Oxford University Press
The CATH database: an extended protein family resource for structural and functional genomics
Biochemistry and Molecular Biology Department, University College London, University of London, Gower Street, London WC1E 6BT, UK
1 Department of Computer Science, Birkbeck College, University of London, Malet Street, London WC1E 7HX, UK
2 EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
*To whom correspondence should be addressed. Email: c.orengo{at}ucl.ac.uk
ABSTRACT
The CATH database of protein domain structures (http://www.biochem.ucl.ac.uk/bsm/cath_new) currently contains 34 287 domain structures classified into 1383 superfamilies and 3285 sequence families. Each structural family is expanded with domain sequence relatives recruited from GenBank using a variety of efficient sequence search protocols and reliable thresholds. This extended resource, known as the CATH-protein family database (CATH-PFDB), contains a total of 310 000 domain sequences classified into 26 812 sequence families. New sequence search protocols, based on these intermediate sequence libraries, have been designed to allow more regular updating of the classification.
Further developments include the adaptation of a recently developed method for rapid structure comparison, based on secondary structure matching, for domain boundary assignment. The philosophy behind this method, CATHEDRAL, is the recognition of recurrent folds already classified in CATH. Benchmarking of CATHEDRAL, using manually validated domain assignments, demonstrated that 43% of domain boundaries could be assigned completely automatically. This is an improvement on a previous consensus approach, for which only 10–20% of domains could be reliably processed in a completely automated fashion. Since domain boundary assignment is a significant bottleneck in the classification of new structures, CATHEDRAL will also help to increase the frequency of CATH updates.
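As a rough illustration of the kind of benchmark described above, the following sketch scores automatic domain boundary assignments against manually validated ones. It is not the CATHEDRAL benchmark itself: the abstract does not state the matching criterion, so the per-chain comparison, the `tolerance` parameter, the function names and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List


def chain_matches(predicted: List[int], reference: List[int],
                  tolerance: int = 10) -> bool:
    """Return True if both assignments have the same number of boundaries
    and each predicted boundary lies within `tolerance` residues of the
    corresponding reference boundary (hypothetical criterion)."""
    if len(predicted) != len(reference):
        return False
    return all(abs(p - r) <= tolerance
               for p, r in zip(sorted(predicted), sorted(reference)))


def fraction_automatic(predicted: Dict[str, List[int]],
                       reference: Dict[str, List[int]],
                       tolerance: int = 10) -> float:
    """Fraction of reference chains whose boundaries are reproduced."""
    if not reference:
        return 0.0
    hits = sum(
        1 for chain, ref in reference.items()
        if chain in predicted
        and chain_matches(predicted[chain], ref, tolerance)
    )
    return hits / len(reference)


# Toy data, invented for illustration only (not taken from CATH).
# Each list holds the residue positions where one domain ends.
reference = {"chainA": [120], "chainB": [85, 210], "chainC": []}
predicted = {"chainA": [124], "chainB": [85, 260], "chainC": []}

score = fraction_automatic(predicted, reference)
print(f"{score:.0%} of chains assigned automatically")
```

In this toy case two of the three chains agree within the tolerance. A real benchmark would need an agreed definition of a correct boundary and of how single-domain chains are counted.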