Nucleic Acids Research Advance Access published online on September 17, 2009
Nucleic Acids Research, doi:10.1093/nar/gkp698
Computational Biology
FIGfams: yet another set of protein families
1Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, 2Computation Institute, University of Chicago/Argonne National Laboratory, Chicago, USA and 3Fellowship for the Interpretation of Genomes, Burr Ridge 60527, IL, USA
*To whom correspondence should be addressed. Email: folker{at}anl.gov
Received June 8, 2009. Revised August 5, 2009. Accepted August 6, 2009.
We present FIGfams, a new collection of over 100 000 protein families that are the product of manual curation and close strain comparison. The manual curation is carried out using the Subsystem approach, ensuring a previously unattained degree of throughput and consistency. FIGfams are based on over 950 000 manually annotated proteins across many hundreds of Bacteria and Archaea. Associated with each FIGfam is a two-tiered, rapid and accurate decision procedure for determining whether a new protein belongs to the family. FIGfams are freely available under an open source license and can be downloaded at ftp://ftp.theseed.org/FIGfams/. The web site for FIGfams is http://www.theseed.org/wiki/FIGfams/