Nucleic Acids Research, 2002, Vol. 30, No. 1 289-293
© 2002 Oxford University Press

SUPFAM—a database of potential protein superfamily relationships derived by comparing sequence-based and structure-based families: implications for structural genomics and function annotation in genomes

Shashi B. Pandit(1), Dilip Gosar(1), S. Abhiman(1,2), S. Sujatha(1), Sayali S. Dixit(1,2,3), Natasha S. Mhatre(1), R. Sowdhamini(2) and N. Srinivasan(1,*)

(1) Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560 012, India, (2) National Centre for Biological Sciences, Tata Institute of Fundamental Research, UAS-GKVK Campus, Bangalore 560 065, India and (3) Biotechnology Centre, Indian Institute of Technology-Bombay, Powai, Mumbai 400 076, India

Members of a protein superfamily can arise through divergent evolution of homologues whose amino acid sequences no longer show significant similarity. Such a superfamily relationship is commonly detected after the three-dimensional structures of the proteins have been determined by X-ray analysis or NMR. The SUPFAM database described here relates pairs of homologous protein families, of either known or unknown structure, drawn from a multiple sequence alignment database. The present release (1.1), the first version of SUPFAM, has been derived by analysing Pfam, one of the commonly used databases of multiple sequence alignments of homologous proteins. The first step in establishing SUPFAM is to relate Pfam families to the families in PALI, an alignment database of homologous proteins of known structure that is derived largely from SCOP. The second step relates to one another those Pfam families that could not be associated reliably with a protein superfamily of known structure. The profile matching procedure IMPALA has been used in both steps. The first step identified 1280 Pfam families (out of 2697, i.e. 47%) that are related to a SCOP family, either by close homology or by a distant relationship, the latter potentially forming new superfamily connections. Using the profiles of the remaining 1417 Pfam families, which apparently have no structural information, an all-against-all sequence-profile comparison with IMPALA clustered 67 Pfam families into 28 potential new superfamilies. Expanding groups of related proteins of as yet unknown structure, as proposed in SUPFAM, should help to identify ‘priority proteins’ for structure determination in structural genomics initiatives, and so extend the coverage of structural information in protein sequence space.
For example, we could assign 858 distinct Pfam domains to 2203 of the gene products in the genome of Mycobacterium tuberculosis. Fifty-one of these Pfam families of unknown structure could be clustered into 17 potentially new superfamilies, which form good targets for structural genomics. The SUPFAM database can be accessed at http://pauling.mbu.iisc.ernet.in/~supfam.
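
The clustering of families into potential superfamilies described above can be illustrated with a minimal sketch. The abstract does not state how IMPALA matches were turned into clusters, so the single-linkage grouping, the E-value cutoff of 1e-3, the function name cluster_families and the family identifiers below are all assumptions made for illustration, not the authors' procedure. The sketch takes a list of (query family, target family, E-value) matches, such as might be parsed from an all-against-all comparison, and groups families connected by significant matches.

```python
from collections import defaultdict


def cluster_families(hits, evalue_cutoff=1e-3):
    """Group family IDs linked by significant matches (single linkage).

    hits: iterable of (query_family, target_family, evalue) tuples.
    evalue_cutoff: hypothetical significance threshold, not from the paper.
    Returns a sorted list of clusters, each a sorted list of family IDs.
    Families with no significant match to another family are omitted.
    """
    parent = {}  # union-find forest: family ID -> parent family ID

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for query, target, evalue in hits:
        # Ignore self-matches and matches weaker than the cutoff.
        if query == target or evalue > evalue_cutoff:
            continue
        root_q, root_t = find(query), find(target)
        if root_q != root_t:
            parent[root_q] = root_t  # merge the two clusters

    groups = defaultdict(set)
    for family in list(parent):
        groups[find(family)].add(family)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical matches; the identifiers are placeholders, not real Pfam IDs.
example_hits = [
    ("FamA", "FamB", 1e-5),
    ("FamB", "FamC", 1e-4),
    ("FamD", "FamE", 1e-6),
    ("FamF", "FamA", 0.5),    # too weak, ignored
    ("FamA", "FamA", 1e-50),  # self-match, ignored
]
clusters = cluster_families(example_hits)
```

With single linkage, FamA and FamC end up in the same cluster through FamB even though no direct match between them is listed. A stricter scheme, for example one requiring reciprocal matches, would give more conservative groupings.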

* To whom correspondence should be addressed. Tel: +91 80 309 2837; Fax: +91 80 360 0535; Email: ns{at}mbu.iisc.ernet.in






