
Nucleic Acids Research, 2000, Vol. 28, No. 1 49-55
© 2000 Oxford University Press

ProtoMap: automatic classification of protein sequences and hierarchy of protein families

Golan Yona*, Nathan Linial¹ and Michal Linial²

Department of Structural Biology, Fairchild Building D-109, Stanford University, CA 94305, USA, ¹Institute of Computer Science, Hebrew University, Jerusalem 91904, Israel and ²Department of Biological Chemistry, Institute of Life Sciences, Hebrew University, Jerusalem 91904, Israel

The ProtoMap site offers an exhaustive classification of all proteins in the SWISS-PROT database into groups of related proteins. The classification is based on analysis of all pairwise similarities among protein sequences. The analysis makes essential use of transitivity to identify homologies among proteins. Within each group of the classification, every two members are either directly or transitively related. However, transitivity is applied restrictively in order to prevent unrelated proteins from clustering together. The classification is done at different levels of confidence, and yields a hierarchical organization of all proteins. The resulting classification splits the protein space into well-defined groups of proteins, which are closely correlated with natural biological families and superfamilies. Many clusters contain protein sequences that are not classified by other databases. The hierarchical organization suggested by our analysis may help in detecting finer subfamilies in families of known proteins. In addition, it brings forth interesting relationships between protein families, upon which local maps for the neighborhood of protein families can be sketched. The ProtoMap web server can be accessed at http://www.protomap.cs.huji.ac.il
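
The idea of grouping proteins by direct or transitive similarity at several confidence levels can be illustrated with a minimal sketch. This is not the ProtoMap algorithm itself: it uses plain, unrestricted transitive closure (connected components over a similarity graph), whereas the abstract states that ProtoMap applies transitivity restrictively. The protein names, the e-value scores, the thresholds and the function names `cluster_at_threshold` and `hierarchy` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def cluster_at_threshold(proteins, pair_scores, max_evalue):
    """Group proteins into connected components.

    Two proteins are linked directly when their pairwise e-value is at most
    max_evalue; groups are the transitive closure of those links.
    (Assumption: unrestricted transitivity, unlike ProtoMap.)
    """
    parent = {p: p for p in proteins}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), evalue in pair_scores.items():
        if evalue <= max_evalue:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    groups = defaultdict(set)
    for p in proteins:
        groups[find(p)].add(p)
    # Sort for a stable, readable ordering of the groups.
    return sorted((frozenset(g) for g in groups.values()), key=sorted)


def hierarchy(proteins, pair_scores, thresholds):
    """Cluster at each threshold, from most to least stringent.

    Because every link accepted at a strict threshold is also accepted at a
    looser one, the groups at successive levels are nested.
    """
    return {t: cluster_at_threshold(proteins, pair_scores, t)
            for t in sorted(thresholds)}


# Hypothetical toy data: e-values for a few protein pairs.
proteins = ["A", "B", "C", "D", "E"]
pair_scores = {
    ("A", "B"): 1e-50,
    ("B", "C"): 1e-20,
    ("C", "D"): 1e-3,
    ("A", "C"): 1.0,   # no significant direct similarity between A and C
}

levels = hierarchy(proteins, pair_scores, [1e-30, 1e-10, 1e-2])
for threshold, groups in levels.items():
    print(threshold, [sorted(g) for g in groups])
```

In this toy data, A and C have no significant direct similarity, yet they fall into the same group at the intermediate threshold because both are related to B. Loosening the threshold further merges D into that group, which is exactly the kind of chaining a restrictive use of transitivity is meant to keep in check.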

* To whom correspondence should be addressed. Tel: +1 650 725 0754; Fax: +1 650 723 8464; Email: golan@gimmel.stanford.edu






