Nucleic Acids Research Advance Access published online on October 16, 2007
Nucleic Acids Research, doi:10.1093/nar/gkm804
Database issue
CMGSDB: integrating heterogeneous Caenorhabditis elegans data sources using compositional data mining
¹Department of Computer Science and ²Department of Biochemistry, Virginia Tech, Blacksburg, VA 24061, USA
* To whom correspondence should be addressed. Tel: +1 540 231 7857; Fax: +1 540 231 7040; Email: apati{at}vt.edu
Received August 15, 2007. Revised September 16, 2007. Accepted September 17, 2007.
CMGSDB (Database for Computational Modeling of Gene Silencing) integrates heterogeneous data sources on Caenorhabditis elegans and supports compositional data mining (CDM) across diverse domains. In addition to gene, protein and functional annotations, CMGSDB currently unifies, under a single hierarchical scheme, information about 531 RNAi phenotypes obtained from heterogeneous databases. A phenotype browser at the CMGSDB website serves this hierarchy and relates phenotypes to other biological entities. Applying CDM to CMGSDB produces chains of relationships in the data by finding two-way connections between sets of biological entities. Chains can, for example, relate the knockdown of a set of genes in an RNAi experiment to the disruption of a pathway or of specific gene expression through another set of genes that is not directly related to the first set. The web interface for CMGSDB is available at https://bioinformatics.cs.vt.edu/cmgs/CMGSDB/ and serves information on individual biological entities as well as details of all chains computed by CDM.
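The idea of chaining sets of entities can be illustrated with a toy sketch. It is not the CDM algorithm used by CMGSDB. It assumes, purely for illustration, that a "connection" between two descriptors can be approximated by the Jaccard overlap of their gene sets, and that a chain is formed by joining two connections on a shared middle descriptor. All descriptor names, gene identifiers, the threshold and the helper functions (`jaccard`, `connect`, `chains`) are hypothetical.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def connect(left, right, threshold=0.5):
    """Return (left_name, right_name, score) for pairs of sets that overlap strongly.

    The threshold is an illustrative assumption, not a CMGSDB parameter.
    """
    links = []
    for (lname, lset), (rname, rset) in product(left.items(), right.items()):
        score = jaccard(lset, rset)
        if score >= threshold:
            links.append((lname, rname, score))
    return links


def chains(first_links, second_links):
    """Join two lists of links on their shared middle descriptor."""
    return [
        (a, b, d)
        for a, b, _ in first_links
        for c, d, _ in second_links
        if b == c
    ]


# Hypothetical toy data: descriptor -> set of gene identifiers.
phenotypes = {"embryonic_lethal": {"g1", "g2", "g3"}}
go_terms = {"cell_division": {"g1", "g2", "g3", "g4"}, "transport": {"g7", "g8"}}
pathways = {"cell_cycle": {"g1", "g2", "g3", "g4"}}

pheno_to_go = connect(phenotypes, go_terms)
go_to_pathway = connect(go_terms, pathways)
result = chains(pheno_to_go, go_to_pathway)
print(result)
```

In this toy data the phenotype is linked to the pathway only through the intermediate functional annotation, which mirrors the kind of indirect relationship the abstract describes.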