Nucleic Acids Research Advance Access published online on October 5, 2006

Nucleic Acids Research, doi:10.1093/nar/gkl723
© 2006 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.


Computational Biology

MetaGene: prokaryotic gene finding from environmental genome shotgun sequences

Hideki Noguchi*, Jungho Park and Toshihisa Takagi

Department of Computational Biology, Graduate School of Frontier Sciences, University of Tokyo, Kashiwa, Chiba 277-8562, Japan

*To whom correspondence should be addressed. Tel: +81 4 7136 3973; Fax: +81 4 7136 4100; Email: hide{at}cb.k.u-tokyo.ac.jp

Received March 18, 2006. Revised September 1, 2006. Accepted September 19, 2006.

Exhaustive gene identification is a fundamental goal in all metagenomics projects. However, most metagenomic sequences are unassembled anonymous fragments, to which conventional gene-finding methods cannot be applied. We have developed a prokaryotic gene-finding program, MetaGene, which uses di-codon frequencies estimated from the GC content of a given sequence, together with various other measures. MetaGene can predict a whole range of prokaryotic genes from anonymous genomic sequences of a few hundred bases, with a sensitivity of 95% and a specificity of 90% for artificial shotgun sequences (700 bp fragments from 12 species). MetaGene has two sets of codon frequency interpolations, one for bacteria and one for archaea, and automatically selects the proper set for a given sequence using the domain classification method we propose. The domain classification works properly, correctly assigning domain information to more than 90% of the artificial shotgun sequences. Applied to the Sargasso Sea dataset, MetaGene predicted almost all of the annotated genes and a notable number of novel genes. MetaGene can be applied to a wide variety of metagenomic projects and expands the utility of metagenomics.
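As a rough illustration of the two quantities the abstract names, the following Python sketch computes the GC content of a fragment and the observed di-codon (adjacent codon pair) frequencies in one reading frame. It is not MetaGene's implementation: the abstract does not specify how di-codon frequencies are estimated from GC content, and the function names `gc_content` and `dicodon_frequencies` are hypothetical, chosen only for this example.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def dicodon_frequencies(seq: str, frame: int = 0) -> dict[str, float]:
    """Relative frequencies of adjacent codon pairs in one reading frame.

    Illustrative assumption: pairs containing ambiguous bases (e.g. N)
    are skipped, and trailing bases that do not fill a codon are ignored.
    """
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    pairs = [a + b for a, b in zip(codons, codons[1:])
             if set(a + b) <= set("ACGT")]
    counts = Counter(pairs)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}


fragment = "ATGGCCATGGCCTAA"  # made-up toy sequence, not real data
print(gc_content(fragment))
print(dicodon_frequencies(fragment))
```

In a gene finder of this kind, observed frequencies such as these would be compared against expected frequencies for coding sequence at the fragment's GC content; that scoring step is left out here because the abstract does not describe it.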






