
Nucleic Acids Research, 2001, Vol. 29, No. 1 37-40
© 2001 Oxford University Press

The InterPro database, an integrated documentation resource for protein families, domains and functional sites

R. Apweiler1,*, T. K. Attwood2, A. Bairoch3, A. Bateman4, E. Birney1, M. Biswas1, P. Bucher5, L. Cerutti4, F. Corpet6, M. D. R. Croning1,2, R. Durbin4, L. Falquet5, W. Fleischmann1, J. Gouzy6, H. Hermjakob1, N. Hulo3, I. Jonassen7, D. Kahn6, A. Kanapin1, Y. Karavidopoulou1, R. Lopez1, B. Marx1, N. J. Mulder1, T. M. Oinn1, M. Pagni5, F. Servant6, C. J. A. Sigrist3 and E. M. Zdobnov1

1EMBL Outstation – European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, 2School of Biological Sciences, The University of Manchester, Manchester, UK, 3Swiss Institute for Bioinformatics, Geneva, Switzerland, 4The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK, 5Swiss Institute for Experimental Cancer Research, Lausanne, Switzerland, 6CNRS/INRA, Toulouse, France and 7Department of Informatics, University of Bergen, HIB, Bergen, Norway

Received August 28, 2000; Revised and Accepted October 31, 2000.


    ABSTRACT
 
Signature databases are vital tools for identifying distant relationships in novel sequences and hence for inferring protein function. InterPro is an integrated documentation resource for protein families, domains and functional sites, which amalgamates the efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Each InterPro entry includes a functional description, annotation, literature references and links back to the relevant member database(s). Release 2.0 of InterPro (October 2000) contains over 3000 entries, representing families, domains, repeats and sites of post-translational modification encoded by a total of 6804 different regular expressions, profiles, fingerprints and Hidden Markov Models. Each InterPro entry lists all the matches against SWISS-PROT and TrEMBL (more than 1 000 000 hits from 462 500 proteins in SWISS-PROT and TrEMBL). The database is accessible for text- and sequence-based searches at http://www.ebi.ac.uk/interpro/. Questions can be emailed to interhelp@ebi.ac.uk.


    INTRODUCTION
 
Databases with signatures diagnostic for protein families, domains or functional sites are important tools for the computational functional classification of newly determined sequences that lack biochemical characterisation. During the last decade, several signature recognition and sequence clustering methods have evolved to address different sequence analysis problems, resulting in rather different and, for the most part, independent databases. Currently, the most commonly used signature and sequence cluster databases include PROSITE (1); Pfam (2); PRINTS (3); ProDom (4); and Blocks (5). Diagnostically, these resources have different areas of optimum application owing to the different strengths and weaknesses of their underlying analysis methods.

In terms of family coverage, the signature databases are similar in size but differ in content. While all of the resources share a common interest in protein sequence classification, the focus of each database is different. Pfam, for example, focuses on divergent domains, PROSITE on functional sites and PRINTS on families, specialising in hierarchical definitions from super-family down to sub-family levels in order to describe specific functions. A number of sequence cluster databases, for example ProDom, are also commonly used in sequence analysis to facilitate domain identification. Unlike signature databases, the clustered resources are derived automatically from sequence databases, using different clustering algorithms. Databases like Blocks provide ungapped multiple alignments for protein families.

With the rapid release of raw data from genome sequencing projects, there is a strong dependence on automatic methods for assigning functions to unknown sequences. For this sequence characterisation, we need more reliable, concerted methods for identifying protein family traits and for inheriting functional annotation. InterPro was developed to rationalise this process by creating a single coherent resource for diagnosis and documentation of protein families. This new resource provides an integrated view of a number of commonly used signature databases and provides an intuitive interface for text- and sequence-based searches.


    INTEGRATION METHODS
 
Flat-files submitted by each of the member databases, PRINTS, PROSITE, Pfam and ProDom, were systematically merged and dismantled. Overlapping domains, signatures or profiles describing common domains or protein families were merged into a single InterPro entry with a unique accession number (which takes the form IPRxxxxxx, where x is a digit), while those containing no counterpart in other member databases were assigned their own unique accession numbers. This process was complicated by the relationships that can exist, both between entries in the same database and between entries in different databases. Different types of hierarchical family relationships were evident, leading us to recognise ‘sub-types’ and ‘sub-strings’. A sub-string means that a motif or motifs are contained within a region of sequence encoded by a wider pattern (e.g. a PROSITE pattern is typically contained within a PRINTS fingerprint; or a fingerprint might be contained within a Pfam domain). A sub-type means that one or more motifs are specific for a sub-set of sequences captured by another more general pattern and these are described as ‘parent–child’ relationships. Signatures with sub-string relationships have the same IPR numbers, while sub-type parent–child relationships warrant their own IPRs. The domain structure of multidomain proteins is described in a ‘contains/found in’ relationship, where a set of family signatures can contain InterPro entries describing specific domains, but they are not related in the protein family sense. These relationships are demonstrated in Figure 1.




 
Figure 1. Demonstration of relationships existing between InterPro entries. (Top) Parent–child relationship. This graphical view of three proteins shows IPR000663, which contains signatures describing the Natriuretic peptide family. Each protein has an additional InterPro entry associated with it, containing a fingerprint for more specific classes of Natriuretic peptide. These InterPro entries, IPR002406, IPR002407 and IPR002408, are the children or sub-families of IPR000663. (Bottom) Contains/found in relationship. In these three proteins, IPR000051, the SAM binding motif, is a domain found in several different protein families, including IPR001737 (ribosomal RNA adenine dimethylase), IPR000682 (protein-L-isoaspartate(D-aspartate) O-methyltransferase) and IPR000339, a family of ubiquinone methyltransferases. They are not sub-families of the SAM binding domain.
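
The two kinds of relationship illustrated in Figure 1 can be sketched as a small in-memory model. This is a minimal illustration and not InterPro's actual schema: the class `Entry`, its fields and the helper functions are assumptions introduced here, and the entry names for the three child entries are placeholders because the text does not name them. Only the accession numbers and the accession format (IPR followed by six digits) come from the text and the figure legend. Sub-string relationships are not modelled, since signatures related in that way share a single IPR number.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Accession format from the text: "IPR" followed by six digits.
ACCESSION_RE = re.compile(r"IPR\d{6}")


def is_valid_accession(acc: str) -> bool:
    """Return True if acc has the form IPRxxxxxx (x = digit)."""
    return ACCESSION_RE.fullmatch(acc) is not None


@dataclass
class Entry:
    """Hypothetical record for one InterPro entry and its relationships."""
    accession: str
    name: str
    parent: Optional[str] = None                      # parent-child (sub-type) link
    contains: Set[str] = field(default_factory=set)   # contains/found in links

    def __post_init__(self) -> None:
        if not is_valid_accession(self.accession):
            raise ValueError(f"bad accession: {self.accession!r}")


def build_figure1_entries() -> Dict[str, Entry]:
    """Entries named in the Figure 1 legend (child names are placeholders)."""
    rows = [
        Entry("IPR000663", "Natriuretic peptide family"),
        Entry("IPR002406", "Natriuretic peptide sub-family", parent="IPR000663"),
        Entry("IPR002407", "Natriuretic peptide sub-family", parent="IPR000663"),
        Entry("IPR002408", "Natriuretic peptide sub-family", parent="IPR000663"),
        Entry("IPR000051", "SAM binding motif"),
        Entry("IPR001737", "Ribosomal RNA adenine dimethylase",
              contains={"IPR000051"}),
        Entry("IPR000682", "Protein-L-isoaspartate(D-aspartate) O-methyltransferase",
              contains={"IPR000051"}),
        Entry("IPR000339", "Ubiquinone methyltransferase family",
              contains={"IPR000051"}),
    ]
    return {e.accession: e for e in rows}


def children_of(entries: Dict[str, Entry], acc: str) -> List[str]:
    """Accessions whose parent is acc (the sub-families of acc)."""
    return sorted(e.accession for e in entries.values() if e.parent == acc)


def found_in(entries: Dict[str, Entry], acc: str) -> List[str]:
    """Accessions of entries that contain the domain acc."""
    return sorted(e.accession for e in entries.values() if acc in e.contains)


entries = build_figure1_entries()
print(children_of(entries, "IPR000663"))
print(found_in(entries, "IPR000051"))
```

Keeping the parent link and the contains set as separate fields mirrors the distinction drawn in the text: the families that contain the SAM binding motif are reported by `found_in`, but none of them appears as a child of IPR000051.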

 

    CONTENTS OF CURRENT RELEASE
 
Release 2.0 of InterPro was built from Pfam 5.5 (2479 domains), PRINTS 27 (1356 fingerprints), ProDom 2000.1 (1309 domains), PROSITE 16.25 (1424 patterns and profiles) and 236 preliminary profiles. The release contains 3203 entries with 1 315 676 hits in SWISS-PROT and TrEMBL (6). Of these hits, 1 244 893 are considered to be true, 9303 false positive, 4524 false negative, 2885 are partial hits and 54 071 have the status unknown. The SWISS-PROT and TrEMBL match lists are provided by the member databases. An exception here concerns PROSITE pattern hits against TrEMBL, which undergo a different procedure. These are not provided by PROSITE and must therefore be derived by the TrEMBL group. All TrEMBL entries are scanned for PROSITE patterns. If a match is found, its significance is checked by means of a set of secondary patterns computed with the eMOTIF algorithm (7). For each family in PROSITE, the true members are aligned and fed into eMOTIF, which calculates a near-optimal set of regular expressions, based on statistical rather than biological evidence. A stringency of 10^-9 is used, meaning that each eMOTIF pattern is expected to produce a random, false positive hit with a probability of 10^-9. All pattern hits confirmed by eMOTIF are considered true; all others are flagged as unknown.
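
As a quick consistency check, the release figures quoted above can be tallied. The short sketch below uses only the numbers given in the text; the dictionary and variable names are introduced here for illustration. The per-source signature counts add up to the 6804 signatures mentioned in the abstract, and the per-status hit counts add up to the stated total of 1 315 676 hits.

```python
# Signature counts per source, as quoted for Release 2.0.
signatures = {
    "Pfam 5.5": 2479,
    "PRINTS 27": 1356,
    "ProDom 2000.1": 1309,
    "PROSITE 16.25": 1424,
    "preliminary profiles": 236,
}

# Hit counts by status, as quoted for Release 2.0.
hits_by_status = {
    "true": 1_244_893,
    "false positive": 9_303,
    "false negative": 4_524,
    "partial": 2_885,
    "unknown": 54_071,
}

total_signatures = sum(signatures.values())
total_hits = sum(hits_by_status.values())

print(total_signatures)  # 6804
print(total_hits)        # 1315676

# Share of each status among all hits.
for status, count in hits_by_status.items():
    print(f"{status:>15}: {count:>9} ({count / total_hits:.2%})")
```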

Individual InterPro entries contain a description of the protein family, domain, repeat or post-translational modification (e.g. N-glycosylation site); a list of member database signatures, Hidden Markov Models (HMMs), profiles or fingerprints associated with the entry; an abstract derived from merged annotation from the member databases; examples of representative sequences; literature references used to create the abstract; and links to tabular or graphical views of the matches to SWISS-PROT and TrEMBL. An example is shown in Figure 2.



 
Figure 2. An example of an InterPro entry. This is IPR000890, an entry containing signatures describing the acetate and butyrate kinase protein family. The ‘i’ information buttons have links to help files describing, for example, the ‘Family’ concept.

 

    DATABASE FORMAT, ACCESS AND DISTRIBUTION
 
To facilitate in-house maintenance, InterPro is managed within a relational database system. However, the InterPro database is also released in two ASCII (text) flat-files in XML (eXtensible Markup Language) format, one containing the core InterPro entries and the other containing the protein matches. These come together with a corresponding DTD (Document Type Definition) file, to allow users to keep local InterPro copies on their machines. The InterPro flat-files may be retrieved from the EBI anonymous ftp server (ftp://ftp.ebi.ac.uk/pub/databases/interpro).
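
A local copy of the XML flat-file could be read with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on an inline fragment. The element and attribute names (`interprodb`, `interpro`, `name`, `member`, `id`, `type`, `db`, `key`) and the member signature shown are invented for illustration and are not taken from the InterPro DTD, which should be consulted for the real structure. The accession IPR000890 and its family name are those of the example entry in Figure 2.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical fragment: element and attribute names are assumptions,
# NOT the names defined in the distributed InterPro DTD.
SAMPLE = """\
<interprodb>
  <interpro id="IPR000890" type="Family">
    <name>Acetate and butyrate kinase</name>
    <member db="EXAMPLE_DB" key="EX00001"/>
  </interpro>
</interprodb>
"""


def read_entries(xml_text: str) -> Dict[str, dict]:
    """Map each entry accession to its type, name and member signatures."""
    root = ET.fromstring(xml_text)
    entries = {}
    for node in root.iter("interpro"):
        entries[node.get("id")] = {
            "type": node.get("type"),
            "name": node.findtext("name", default=""),
            # (database, signature key) pairs linked to this entry
            "members": [(m.get("db"), m.get("key"))
                        for m in node.findall("member")],
        }
    return entries


for accession, record in read_entries(SAMPLE).items():
    print(accession, record["type"], record["name"], record["members"])
```

For a full release file, a streaming reader such as `ET.iterparse` would probably be preferable to loading the whole document into memory at once.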

InterPro is accessible for interactive use via the EBI Web server (http://www.ebi.ac.uk/interpro), which can also be reached via each of the member databases. The Web interface allows text-based and sequence-based searches using the Sequence Retrieval System (SRS) (8). The sequence-based searches are done using InterProScan, which combines the search methods from the member databases. The results display matches to the parent databases and the corresponding InterPro entries, providing the positions of the signatures within the sequence and a graphical view of the matches. Detailed results of matches to the individual database search methods are provided via hyperlinks to each of the parent databases. A mail server is available for sequence searches at interproscan@ebi.ac.uk. Documentation on using the mail server can be obtained by emailing the address with the word ‘help’ in the body of the text.


    APPLICATIONS OF INTERPRO
 
InterPro is an international initiative that was conceived in an attempt to streamline the efforts of the signature database providers. By uniting these databases, we capitalise on their individual strengths, producing a single entity that is far greater than the sum of its parts. A primary application of InterPro’s family, domain and functional site definitions will be in annotation and functional classification of uncharacterised sequences. The EBI is using InterPro for enhancing the automated annotation of TrEMBL (9). This is more efficient and reliable than using each of the signature databases separately, because InterPro provides internal consistency checks and deeper coverage. InterPro has also proven its usefulness for whole proteome analysis in the comparative genome analysis of Drosophila melanogaster, Caenorhabditis elegans and Saccharomyces cerevisiae (10).

Another major use of InterPro will be in identifying those families and domains for which the existing discriminators are not optimal and could hence be usefully supplemented with an alternative pattern (e.g. where a regular expression identifies large numbers of false matches it could be useful to develop an HMM or where an HMM covers a vast super-family it could be beneficial to develop discrete family fingerprints, and so on). Alternatively, InterPro is likely to highlight key areas where none of the databases has yet made a contribution and hence where the development of a specific pattern might be useful. For example, sequence groups from ProDom are being analysed using the Pratt pattern discovery tool (11,12) to reveal clusters that can form InterPro families and to create regular expression discriminators. This united approach should thus help us to improve both the utility and the coverage of signature databases, pinpointing weaknesses and allowing us to remedy them efficiently.

As it evolves, InterPro will streamline the analysis of newly determined sequences for the individual user and will make a significant contribution in the demanding task of automatic classification of predicted proteins from genome sequencing projects.


    FUTURE DIRECTIONS
 
The InterPro project began by first integrating the databases that provide annotation (Pfam, PRINTS and PROSITE). Various factors rendered a step-wise approach to the development of InterPro desirable. First, the scale of the task of amalgamating the first three databases was immense. The rational merging of apparently equivalent database entries that in fact simultaneously define a specific family, domains within that family or even repeats within those domains, presented an enormous challenge. A second important consideration was that while Pfam, PRINTS and PROSITE are true pattern databases, ProDom is based solely on automatic clustering of sequences by similarity (i.e. discriminators are not derived). Resulting clusters need not have precise biological correlations and some family designations have changed between database versions. The initial integration of ProDom has therefore been limited to well-defined protein families and those entries with corresponding overlaps in the other member databases. The next goal is the further integration of ProDom entries.

In addition, the Blocks database is now using InterPro to replace its old Blocks from PROSITE (J. Henikoff, personal communication). As the current and subsequent Blocks releases will be based on families already in InterPro, the process of cross-referencing between Blocks and InterPro was relatively straightforward and was done for the current InterPro release. Once the founder members of the InterPro consortium have been assimilated into the unified resource, other pattern databases will also be included. First, scheduled for Release 3, will be the SMART resource (13). Ultimately, we hope to include many other protein family databases to give a more comprehensive view of the resources available.


    ACKNOWLEDGEMENTS
 
The InterPro project is supported by grant number BIO4-CT98-0052 of the European Commission. T.K.A. is a Royal Society University Research Fellow.


    FOOTNOTES
 
* To whom correspondence should be addressed. Tel: +44 1223 494 435; Fax: +44 1223 494 468; Email: rolf.apweiler@ebi.ac.uk


    REFERENCES
 

    1 Hofmann,K., Bucher,P., Falquet,L. and Bairoch,A. (1999) The PROSITE database, its status in 1999. Nucleic Acids Res., 27, 215–219.

    2 Bateman,A., Birney,E., Durbin,R., Eddy,S.R., Howe,K.L. and Sonnhammer,E.L.L. (2000) The Pfam Protein Families Database. Nucleic Acids Res., 28, 263–266.

    3 Attwood,T.K., Croning,M.D.R., Flower,D.R., Lewis,A.P., Mabey,J.E., Scordis,P., Selley,J.N. and Wright,W. (2000) PRINTS-S: the database formerly known as PRINTS. Nucleic Acids Res., 28, 225–227.

    4 Corpet,F., Servant,F., Gouzy,J. and Kahn,D. (2000) ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons. Nucleic Acids Res., 28, 267–269.

    5 Henikoff,J.G., Greene,E.A., Pietrokovski,S. and Henikoff,S. (2000) Increased coverage of protein families with the Blocks Database servers. Nucleic Acids Res., 28, 228–230.

    6 Bairoch,A. and Apweiler,R. (2000) The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res., 28, 45–48.

    7 Nevill-Manning,C.G., Wu,T.D. and Brutlag,D.L. (1998) Highly specific protein sequence motifs for genome analysis. Proc. Natl Acad. Sci. USA, 95, 5865–5871.

    8 Etzold,T., Ulyanov,A. and Argos,P. (1996) SRS: information retrieval system for molecular biology data banks. Methods Enzymol., 266, 114–128.

    9 Fleischmann,W., Möller,S., Gateau,A. and Apweiler,R. (1999) A novel method for automatic functional annotation of proteins. Bioinformatics, 15, 228–233.

    10 Rubin,G.M., Yandell,M.D., Wortman,J.R., Gabor Miklos,G.L., Nelson,C.R., Hariharan,I.K., Fortini,M.E., Li,P.W., Apweiler,R., Fleischmann,W. et al. (2000) Comparative genomics of the eukaryotes. Science, 287, 2204–2215.

    11 Jonassen,I., Collins,J.F. and Higgins,D. (1995) Finding flexible patterns in unaligned protein sequences. Protein Sci., 4, 1587–1595.

    12 Jonassen,I. (1997) Efficient discovery of conserved patterns using a pattern graph. Comput. Appl. Biosci., 13, 509–522.

    13 Schultz,J., Milpetz,F., Bork,P. and Ponting,C.P. (1998) SMART, a simple modular architecture research tool: identification of signaling domains. Proc. Natl Acad. Sci. USA, 95, 5857–5864.

