Nucleic Acids Research, 2000, Vol. 28, No. 1 267-269
© 2000 Oxford University Press
ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons
1Laboratoire de Génétique Cellulaire and 2Laboratoire de Biologie Moléculaire des Relations Plantes-Microorganismes, INRA/CNRS, BP 27, F-31326 Castanet-Tolosan Cedex, France
Received October 6, 1999; Accepted October 8, 1999.
| ABSTRACT |
|---|
ProDom contains all protein domain families automatically generated from the SWISS-PROT and TrEMBL sequence databases (http://www.toulouse.inra.fr/prodom.html). ProDom-CG results from a similar domain analysis as applied to completed genomes (http://www.toulouse.inra.fr/prodomCG.html). Recent improvements to the ProDom database and its server include: scaling up to include sequences from TrEMBL, addition of Pfam-A entries to the set of expert validated families, assignment of stable accession numbers, consistency indicators for domain families, domain arrangements of sub-families and links to Pfam-A.
| INTRODUCTION |
|---|
ProDom is a database of protein domain families obtained by an automated analysis of available protein sequence data (1,2). It is useful for analysing the domain arrangements of complex protein families and helps to analyse homology relationships in modular proteins. The clustering of homologous domains provides a rational way of organising protein sequence data. An interactive graphical interface was designed to allow for easy navigation between schematic domain arrangements, multiple alignments, phylogenetic trees, SWISS-PROT entries (3), PROSITE patterns (4), Pfam-A families (5) and 3-D structures in the PDB (6). Alignments and trees can be reduced or developed to facilitate the analysis of sequence relationships within large domain families (7). New sequences can be searched against ProDom and aligned with existing domain families, and modelled on the basis of homologous domains in the PDB.
Recently, we scaled up the process to include TrEMBL sequences in the source database. We have also added Pfam-A families to the set of expert validated families used in the ProDom construction procedure. Other recent improvements in ProDom make it easier to keep track of a protein family across successive releases.
| BUILDING ProDom |
|---|
Since version 35, the automated process that builds ProDom has been complemented by expert curation: for some domain families, experts were asked to correct domain boundaries. To increase the number of these expert-validated families, we used the curated part of Pfam (5): the seed alignments of 1403 Pfam-A families were added to the list of 21 ProDom expert-validated multiple alignments and used to build new ProDom families with the PSI-BLAST program (8). Other families are built with an automated process based on a recursive use of PSI-BLAST as described previously (2,9). This process can be applied to any set of protein sequences, provided there are enough sequences available to detect domain boundaries. Since version 99.1, the ProDom source database has been SWISS-PROT and its TrEMBL supplement (3). A set of available complete genomes is also used to build ProDom-CG; release 20 was built by automatic clustering of protein domains from 20 complete genomes available on April 8, 1999: four archaea, 14 bacteria and two eukaryotes.
| ProDom STATISTICS |
|---|
ProDom, version 99.2, contains 157 167 families (Table 1). ProDom covers >95% of the residues in the source database. The inclusion of TrEMBL represents a 2.4-fold increase in the source database. The ProDom building process scaled up with no major difficulties and with stable results. The average number of domains per sequence remains stable, close to three domains per sequence, with an exponential distribution (Fig. 1a). Surprisingly, domain lengths in ProDom also show an exponential distribution (Fig. 1b), contrary to the expectation of a more balanced distribution centred on the mean. Thus short domains are over-represented in ProDom in its current state, which may be due to numerous sequence ends and inter-domain linkers which are generated as a result of the automated process. Fifty-six percent of all ProDom sequence residues are found in families containing 10 or more members. There are 6264 ProDom entries linked to 1462 Pfam-A entries (v 4.0), 5787 linked to 1056 PROSITE entries (v15) and 2378 linked to PDB.
| RECENT ProDom IMPROVEMENTS |
|---|
Accession numbers
Each ProDom entry now has a unique and stable accession number (AC) that will provide access to the same domain family across successive releases. These numbers are formed with the letters PD followed by exactly six digits (e.g. PD002243). As ProDom is built anew every time, domain families are not exactly conserved from one release to the next. We have developed a tool that links families in release n to families in release n-1. For each family in release n, it searches for overlaps with families of release n-1; it sorts the hits in decreasing order using the absolute and relative numbers of sub-sequences involved in the overlap; the AC is then assigned by selecting the first available number in the list or, if none is left, by creating a new AC.
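The assignment procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool used to build ProDom: the representation of a family as a set of sub-sequence identifiers, the function name `assign_accessions`, the choice of "relative" overlap as the shared fraction of the older family, and the order in which new families are processed are all assumptions made for the example.

```python
import re

# "PD" followed by exactly six digits, as in PD002243.
AC_PATTERN = re.compile(r"PD\d{6}")


def assign_accessions(previous, current, next_number):
    """Carry accession numbers from the previous release to the new one.

    previous:    dict mapping AC -> set of sub-sequence identifiers (old release)
    current:     dict mapping a temporary family name -> set of identifiers
    next_number: first unused integer for creating new ACs (caller's choice)
    Returns a dict mapping each temporary family name to its AC.
    """
    taken = set()
    result = {}
    for name, members in current.items():
        # Collect overlapping old families with their absolute overlap and
        # (assumed definition) overlap relative to the old family's size.
        hits = []
        for ac, old_members in previous.items():
            shared = len(members & old_members)
            if shared:
                hits.append((shared, shared / len(old_members), ac))
        hits.sort(reverse=True)  # best overlap first

        # Select the first AC in the ranked list that is still available.
        chosen = next((ac for _, _, ac in hits if ac not in taken), None)
        if chosen is None:
            # No candidate left: create a new accession number.
            chosen = f"PD{next_number:06d}"
            next_number += 1
        taken.add(chosen)
        result[name] = chosen
    return result
```

In this sketch a family that has split in two keeps its AC on whichever part is processed first, and the other part receives a new number; a real implementation would need a rule for processing order that the text does not specify.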
Consistency indicator
As ProDom families are computed by an automated process, the sequence homogeneity can vary considerably between families. Some families may include hundreds of nearly identical or alternatively very diverged sequences. We have introduced two indicators that measure the consistency of a family: the diameter and the radius of gyration. The diameter is the maximal distance between two domains of the family. The radius of gyration is the weighted root mean square of the distance between each domain and the family consensus sequence. To help the selection of a sequence that represents the family well, we also indicate which sequence lies closest to the consensus. Among the 43 965 ProDom families containing at least two sequences, 24% had a diameter <10 PAM and 90% <240 PAM; 30% had a radius <10 PAM and 90% <71 PAM. The diameter distribution (Fig. 2) presents two modes, indicating that there are two classes of families in ProDom. In the first class, domains are overly similar, indicating sequence redundancy in the source database. In the second class, families are truly complex and include more diverged homologous domains.
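As an illustration of the two indicators, the sketch below computes them from precomputed distances. It assumes the distances (for instance in PAM units) are already available as a symmetric matrix between domains and as a list of distances to the consensus; the function names and the default of uniform weights are assumptions, since the weighting scheme is not detailed here.

```python
from itertools import combinations
from math import sqrt


def diameter(dist):
    """Maximal distance between two domains of the family.

    dist: symmetric matrix (list of lists) of pairwise distances;
          the family must contain at least two domains.
    """
    n = len(dist)
    return max(dist[i][j] for i, j in combinations(range(n), 2))


def radius_of_gyration(to_consensus, weights=None):
    """Weighted root mean square distance to the family consensus.

    to_consensus: distance of each domain to the consensus sequence
    weights:      one weight per domain (assumed uniform if omitted)
    """
    if weights is None:
        weights = [1.0] * len(to_consensus)
    weighted_sq = sum(w * d * d for w, d in zip(weights, to_consensus))
    return sqrt(weighted_sq / sum(weights))


def closest_to_consensus(names, to_consensus):
    """Name of the domain lying closest to the consensus."""
    return min(zip(to_consensus, names))[1]
```

A small diameter together with a small radius corresponds to the first class of families discussed above (near-identical sequences), whereas a large diameter points to a genuinely diverged family.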
Graphical representation
As described earlier (2), the ProDom Web server provides a graphical representation of protein domain arrangements. Each protein is shown on a single line with schematic boxes hypertext-linked to corresponding ProDom entries. Each domain family has a unique representation that is linked to the ProDom accession number, which ensures its stability between successive releases. The graphical representation of the domain arrangements for all proteins sharing a homologous domain can be large and difficult to comprehend. As ProDom families can be divided into sub-families following a phylogenetic tree, it is now possible to display the domain arrangements for all proteins from the same sub-family. For example, ProDom domain PD000612 includes 94 sequences of cytochrome b5 and heme-binding domains of homologous oxidoreductases: the user can readily display the protein domain arrangements specifically for the 42 nitrate reductases or for the three sulfite oxidases (see example of ProDom WWW server usage in Supplementary Material).
| PUTTING ProDom TO USE IN GENOME PROJECTS |
|---|
ProDom is widely used to analyse protein domain relationships in genomic sequences. For instance, ProDom was used systematically by Marcotte et al. (10) in order to infer protein-protein interactions on the basis of Rosetta Stone sequence combinations. Another example of a systematic use of ProDom concerns structural genomics. Several projects have recently emerged aiming at a systematic study of the protein structure universe (see for instance http://www.nih.gov/nigms/news/meetings/structural_genomics_targets.html). These projects require a comprehensive protein family classification scheme in order to adequately sample the protein structure space. We have contributed to such a scheme in the framework of the Protein Structure Initiative (http://www.genome3d.org). Target proteins for structure determination were selected for 2587 ProDom families on the following criteria: (i) no 3-D structure was available; (ii) they contain at least two members (true family); (iii) they contain at least one protein with only one domain, shorter than 500 amino acids (ProDom domain span is correct); (iv) the two most distant sequences in the family share at least 10% identity (family is homogeneous). Proposed targets are single domain proteins, preferably human. The choice of single domain proteins obviates the need to engineer specific domains and should make expression and purification easier to achieve.
Another concerted effort is the InterPro project aiming at integrating resources for protein families (http://www.ebi.ac.uk/interpro). We selected 2883 ProDom families that appear to be good candidates for new families to be documented in InterPro. They were selected on the following criteria: (i) they are not referenced in PROSITE 15.0; (ii) they contain at least two members; (iii) they contain at least one single-domain protein from SWISS-PROT, shorter than 500 amino acids; (iv) the similarity between the most distant sequences in the family lies between 10 and 90% identity (family is homogeneous, yet not overly redundant). These criteria ensure that domain boundaries are well defined for each new family.
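Both selection schemes in this section amount to simple filters over per-family summaries, which can be sketched as follows. The `FamilySummary` record and its field names are hypothetical, introduced only for this example; in particular, `single_domain_lengths` is assumed to already list only the eligible single-domain proteins (restricted to SWISS-PROT entries in the InterPro case), and the identity bounds are treated as inclusive.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FamilySummary:
    """Hypothetical per-family record; field names are illustrative."""
    accession: str
    n_members: int
    has_3d_structure: bool
    in_prosite: bool
    # Lengths (amino acids) of member proteins made of a single domain.
    single_domain_lengths: List[int] = field(default_factory=list)
    # Percent identity between the two most distant sequences.
    min_identity: float = 0.0


def has_short_single_domain(family, max_length=500):
    """Criterion (iii): at least one single-domain protein under 500 residues."""
    return any(n < max_length for n in family.single_domain_lengths)


def is_structure_target(family):
    """Criteria (i)-(iv) for structural genomics targets."""
    return (not family.has_3d_structure
            and family.n_members >= 2
            and has_short_single_domain(family)
            and family.min_identity >= 10.0)


def is_interpro_candidate(family):
    """Criteria (i)-(iv) for candidate new InterPro families."""
    return (not family.in_prosite
            and family.n_members >= 2
            and has_short_single_domain(family)
            and 10.0 <= family.min_identity <= 90.0)
```

The two filters differ only in criterion (i) and in the upper identity bound, which excludes overly redundant families from the InterPro candidates.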
| AVAILABILITY |
|---|
Available via anonymous FTP site: ftp://ftp.toulouse.inra.fr/pub/prodom
or WWW server: http://www.toulouse.inra.fr/prodom.html
http://www.toulouse.inra.fr/prodomCG.html
| SUPPLEMENTARY MATERIAL |
|---|
See Supplementary Material available at NAR Online.
| ACKNOWLEDGMENTS |
|---|
We wish to thank Amos Bairoch, Alex Bateman, Claude Chevalet, Richard Durbin, Laurent Duret, Alain Guénoche and Manuel Peitsch for stimulating discussions and exchange of information. The ProDom project is supported by the Centre National de la Recherche Scientifique (Genome Initiative) and the European Union (Biotech BIO4-CT980052).
| FOOTNOTES |
|---|
* To whom correspondence should be addressed. Tel: +33 561 28 53 29; Fax: +33 561 28 50 61; Email: dkahn@toulouse.inra.fr
| REFERENCES |
|---|
1 Sonnhammer,E.L.L. and Kahn,D. (1994) Protein Sci., 3, 482-492.
2 Corpet,F., Gouzy,J. and Kahn,D. (1999) Nucleic Acids Res., 27, 263-267.
3 Bairoch,A. and Apweiler,R. (1999) Nucleic Acids Res., 27, 49-54. Updated article in this issue: Nucleic Acids Res. (2000), 28, 45-48.
4 Hofmann,K., Bucher,P., Falquet,L. and Bairoch,A. (1999) Nucleic Acids Res., 27, 215-219.
5 Bateman,A., Birney,E., Durbin,R., Eddy,S.R., Finn,R.D. and Sonnhammer,E.L.L. (1999) Nucleic Acids Res., 27, 260-262. Updated article in this issue: Nucleic Acids Res. (2000), 28, 263-266.
6 Abola,E.E., Bernstein,F.C., Bryant,S.H., Koetzle,T.F. and Weng,J. (1987) In Allen,F.H., Bergerhoff,G. and Sievers,R. (eds), Crystallographic Databases-Information Content, Software Systems, Scientific Applications. Data Commission of the International Union of Crystallography, Bonn/Cambridge/Chester, pp. 107-132.
7 Corpet,F., Gouzy,J. and Kahn,D. (2000) Bioinformatics, in press.
8 Altschul,S.F., Madden,T.L., Schäffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389-3402.
9 Gouzy,J., Corpet,F. and Kahn,D. (1999) Computers Chem., 23, 333-340.
10 Marcotte,E.M., Pellegrini,M., Ng,H.L., Rice,D.W., Yeates,T.O. and Eisenberg,D. (1999) Science, 285, 751-753.