Nucleic Acids Research Advance Access published online on October 25, 2009
Nucleic Acids Research, doi:10.1093/nar/gkp876
Computational Biology |
Martini: using literature keywords to compare gene sets
European Molecular Biology Laboratory, 69117 Heidelberg, Germany
*To whom correspondence should be addressed. Email: sean.odonoghue{at}embl.de
Received September 5, 2009. Revised September 29, 2009. Accepted October 1, 2009.
Life scientists are often interested to compare two gene sets to gain insight into differences between two distinct, but related, phenotypes or conditions. Several tools have been developed for comparing gene sets, most of which find Gene Ontology (GO) terms that are significantly over-represented in one gene set. However, such tools often return GO terms that are too generic or too few to be informative. Here, we present Martini, an easy-to-use tool for comparing gene sets. Martini is based, not on GO, but on keywords extracted from Medline abstracts; Martini also supports a much wider range of species than comparable tools. To evaluate Martini we created a benchmark based on the human cell cycle, and we tested several comparable tools (CoPub, FatiGO, Marmite and ProfCom). Martini had the best benchmark performance, delivering a more detailed and accurate description of function. Martini also gave best or equal performance with three other datasets (related to Arabidopsis, melanoma and ovarian cancer), suggesting that Martini represents an advance in the automated comparison of gene sets. In agreement with previous studies, our results further suggest that literature-derived keywords are a richer source of gene-function information than GO annotations. Martini is freely available at http://martini.embl.de.