Nucleic Acids Research Advance Access published online on November 9, 2009
Nucleic Acids Research, doi:10.1093/nar/gkp940
Database Issue
The MiST2 database: a comprehensive genomics resource on microbial signal transduction
1 Agile Genomics LLC, Mount Pleasant, SC 29466, 2 Department of Microbiology, University of Tennessee, Knoxville, TN 37996 and 3 BioEnergy Science Center and Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37886, USA
*To whom correspondence should be addressed. Tel: +1 865 974 7687; Fax: +1 865 974 4007; Email: ulrich.luke+sci{at}gmail.com
Received September 15, 2009. Revised October 8, 2009. Accepted October 9, 2009.
The MiST2 database (http://mistdb.com) identifies and catalogs the repertoire of signal transduction proteins in microbial genomes. Signal transduction systems regulate the majority of cellular activities, including the metabolism, development, host recognition, biofilm production, virulence, and antibiotic resistance of human pathogens. Thus, knowledge of the proteins and interactions that comprise these communication networks is essential to furthering biomedical discovery. MiST2 identifies signal transduction proteins by searching protein sequences for specific domain profiles that implicate a protein in signal transduction. Compared to the previous version of the database, MiST2 contains a host of new features and improvements, including the following: draft genomes; extracytoplasmic function (ECF) sigma factor protein identification; enhanced classification of signaling proteins; novel, high-quality domain models for identifying histidine kinases and response regulators; neighboring two-component genes; gene cart; better search capabilities; enhanced taxonomy browser; advanced genome browser; and a modern, biologist-friendly web interface. MiST2 currently contains 966 complete and 157 draft bacterial and archaeal genomes, which collectively contain more than 245 000 signal transduction proteins. The majority (66%) of these are one-component systems, followed by two-component proteins (26%), chemotaxis proteins (6%), and finally ECF sigma factors (2%).
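As a rough illustration of the figures reported above, the following sketch converts the stated percentages into approximate protein counts per category. It assumes a total of exactly 245 000 proteins, whereas the text gives this only as a lower bound ("more than 245 000"), so the results are order-of-magnitude estimates rather than values taken from the database. The names `TOTAL_PROTEINS`, `SHARES_PERCENT`, and `approximate_counts` are hypothetical and chosen for this example only.

```python
# Approximate per-category counts implied by the reported percentages.
# Assumption: the total is taken as exactly 245,000, although the text
# states only "more than 245 000", so every count is a rough lower estimate.
TOTAL_PROTEINS = 245_000

# Percentages as reported in the abstract.
SHARES_PERCENT = {
    "one-component systems": 66,
    "two-component proteins": 26,
    "chemotaxis proteins": 6,
    "ECF sigma factors": 2,
}

# Genome counts as reported in the abstract.
COMPLETE_GENOMES = 966
DRAFT_GENOMES = 157


def approximate_counts(total, shares_percent):
    """Return an integer estimate of the protein count for each category."""
    return {name: total * pct // 100 for name, pct in shares_percent.items()}


counts = approximate_counts(TOTAL_PROTEINS, SHARES_PERCENT)
total_genomes = COMPLETE_GENOMES + DRAFT_GENOMES

for name, n in counts.items():
    print(f"{name}: roughly {n:,}")
print(f"genomes covered: {total_genomes}")
```

Under that assumption, one-component systems alone account for roughly 160 000 proteins, well over twice the number of two-component proteins.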