
Nucleic Acids Research Pages 1847-1853  



Comparative study of overlapping genes in the genomes of Mycoplasma genitalium and Mycoplasma pneumoniae

Yoko Fukuda, Takanori Washio1 and Masaru Tomita*

Laboratory for Bioinformatics, Department of Environmental Information and 1Graduate School of Media and Governance, Keio University, 5322 Endo, Fujisawa 252, Japan

Received January 12, 1999; Revised and Accepted March 4, 1999

ABSTRACT

Overlapping genes are defined, in this paper, as a pair of adjacent genes whose coding regions are partly overlapping. We systematically analyzed all overlapping genes in the genomes of two closely related species: Mycoplasma genitalium and Mycoplasma pneumoniae. Careful comparisons were made for homologous genes that are overlapped in one species but not in the other. This comparative analysis allows us to propose a model of how overlapping genes emerged in the course of evolution. It was found that, in many cases, overlapping genes were generated primarily by the loss of a stop codon in either gene, the absence of which resulted in elongation of the 3′ end of the gene’s coding region. More specifically, the loss of the stop codon took place as a result of the following events: deletion of the stop codon (64.4%), point mutation at the stop codon (4.4%) and frame shift at the end of the coding region (6.7%). Overlapping genes, in a sense, can be thought of as the results of evolutionary pressure to minimize genome size. However, our analysis indicates that many overlapping genes, at least in the genomes of M.genitalium and M.pneumoniae, are due to incidental elongation of the coding regions.

INTRODUCTION

Many overlapping genes have been identified in the genomes of prokaryotes, bacteriophages, animal viruses and mitochondria, some of which have been reported to have functional roles such as in translational coupling (1-5) and negative translational coupling (6,7). Nevertheless, their evolutionary origin, i.e. how they have emerged, is not clearly understood.

We systematically analyzed all overlapping genes in genomes of two closely related species (Table 1). Mycoplasma genitalium (8) and Mycoplasma pneumoniae (9) were selected for our analysis, as the evolutionary distance of these two species is the closest among the 17 species whose complete genomes are currently available (as of October 1998). Many parts of these two genomes (Fig. 1) are highly homologous; almost all of the genes in M.genitalium are present in M.pneumoniae too, and partial orders of genes are often identical in these two genomes (10).

Table 1. The genomes of M.genitalium and M.pneumoniae
                             M.genitalium          M.pneumoniae
Genome size (bp)             580 074               816 394
Number of genes (CDS)        480                   677
Total coding regions         523 714 bp (90.3%)    710 090 bp (87.0%)
Total overlapping regions    2603 bp (0.45%)       4894 bp (0.60%)

There are 162 overlapping gene pairs in the genome of M.genitalium according to the TIGR annotation. The genome of M.pneumoniae, on the other hand, contains 203 overlapping gene pairs. There are 135 homologous overlapping gene pairs which exist in both species. The other 27 and 68 overlapping gene pairs are found only in M.genitalium and M.pneumoniae, respectively. The comparative analysis of these two genomes allows us to propose a model of how overlapping genes have emerged over the course of evolution. In particular, careful comparisons were made for the homologous genes that are overlapped in one species but not in the other.


Figure 1. Three patterns of overlapping genes.


MATERIALS AND METHODS

The whole genome sequence of M.genitalium with annotation (updated in 1998) was downloaded from the TIGR Microbe Database (http://www.tigr.org/tdb/mdb/mdb.html, updated version), and that of M.pneumoniae was from The Mycoplasma Pneumoniae Genome Project (http://mail.zmbh.uni-heidelberg.de/M-Pneumoniae/MP-Home.html). Information on homologous parts of these two genomes was also obtained from The Mycoplasma Pneumoniae Genome Project.

In this paper, overlapping genes are defined as a pair of adjacent genes whose coding regions are partly overlapping. We first listed all overlapping genes in the two genomes according to the annotations in the databases. For each overlapping gene pair in one species, we aligned the sequence, using ClustalW (11), with the sequence of the homologous part of the other species. We then classified all the cases according to the three directional patterns described in Figure 1: ‘end-on’, ‘uni-directional’ and ‘head-on’.
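The listing and classification step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the study: the `Gene` record, the function names and the toy coordinates are hypothetical, coordinates are taken as 1-based and inclusive on the forward strand, and "partly overlapping" is interpreted as excluding a gene that lies entirely inside its neighbor. The direction labels follow the arrows used in the tables (→→ uni-directional, →← end-on, ←→ head-on).

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    name: str
    start: int   # 1-based, inclusive, forward-strand coordinate
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def overlap_length(a: Gene, b: Gene) -> int:
    """Number of bases shared by the coding regions of a and b."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def classify(a: Gene, b: Gene) -> str:
    """Directional pattern of a pair; a must be the gene with the smaller start."""
    if a.strand == b.strand:
        return "uni-directional"                      # ->-> (or <-<-)
    return "end-on" if a.strand == "+" else "head-on"  # -><- or <-->


def overlapping_pairs(genes: List[Gene]) -> List[Tuple[str, str, int, str]]:
    """List adjacent gene pairs whose coding regions partly overlap."""
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for a, b in zip(ordered, ordered[1:]):
        n = overlap_length(a, b)
        # Assumption: a gene nested entirely inside its neighbor is not
        # counted as "partly overlapping".
        if n > 0 and b.end > a.end:
            pairs.append((a.name, b.name, n, classify(a, b)))
    return pairs


# Hypothetical toy annotation, not taken from either genome.
toy_genes = [
    Gene("g1", 1, 100, "+"),
    Gene("g2", 100, 200, "+"),
    Gene("g3", 197, 300, "-"),
    Gene("g4", 350, 400, "-"),
    Gene("g5", 395, 500, "+"),
]

for pair in overlapping_pairs(toy_genes):
    print(pair)
```

In the toy annotation, g1/g2 share one base on the same strand, g2/g3 share four bases at their 3′ ends, and g4/g5 share six bases at their 5′ ends, so the three patterns each appear once.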

For those genes that overlap in one species but not in the other, we made careful analyses in order to infer the cause of the overlapping. Inferred causes of these events of overlapping were then classified into several types.


Figure 2. 4-base overlapping genes.


Further analyses were conducted for those genes whose 3′ ends were elongated by more than 15 amino acids compared with their homologous genes in the other species. FASTA (12) (GenBank version Release 109.0) was used for searching homologous genes in bacteria other than Mycoplasma, and homology of the elongated regions was inspected to see if the regions contain any functionally important sequences. Furthermore, MOTIFS (GCG program package version Unix-8.1 of the Genetics Computer Group, WI) was used for examining possible motifs in these regions.

Annotation of homologous genes is sometimes not in agreement between the two genomes. In particular, many annotational differences were found in determining start codons; i.e. the beginnings of coding regions. This is because the sequence annotations of these two species were made by different software: BLAZE (13) for M.genitalium and FRAMES (GCG program package version Unix-8.1 of the Genetics Computer Group, WI) for M.pneumoniae. We excluded from our discussion those homologous genes whose start codon was assigned differently by the two software packages.

RESULTS AND DISCUSSION

Table 2 summarizes the numbers of overlapping gene pairs in these two genomes. Most overlapping genes are uni-directional, though there are a few ‘end-on’ overlapping genes. Interestingly, ‘head-on’ overlapping genes are rare: each genome contains only three such pairs, and only one of them is found in both species.

Table 2. The numbers of overlapping gene pairs
                        →→     →←     ←→
in M.genitalium         134    25     3
in M.pneumoniae         180    20     3
in both species         119    15     1
only in M.genitalium    15     10     2
only in M.pneumoniae    61     5      2

Table 3. The numbers of 4-base overlapping gene pairs
                ttaa   ttag   ctaa   ctag
M.genitalium    2      5      1      0
M.pneumoniae    3      2      0      3

Table 4. The number of 1-base overlapping genes
                taatg   tagtg
M.genitalium    34      3
M.pneumoniae    52      2

Table 5. Summary of inferred causes of gene overlapping
                        Deletion   Point mutation   Frame shift   Unknown
End-on (→←)             5          1                2             4
Uni-directional (→→)    24         1                1             7
Total                   29         2                3             11

Out of the ‘end-on’ overlapping genes (the direction of which is →←), many overlap by only 1 or 4 bases. Of the 45 ‘end-on’ overlapping gene pairs (two species together), 16 overlap by only 4 bases (Table 3). Mycoplasma genitalium and M.pneumoniae use TAA and TAG for their stop codons. As shown in Figure 2, the complementary sequence of the stop codon in one gene always includes ‘TA’, which can be a part of the stop codon, TAA or TAG, in the other strand. This explains the large number of 4-base overlapping genes.
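The reasoning behind the 4-base overlaps can be checked by brute force. The sketch below is illustrative only and not part of the original analysis. It assumes, from the geometry of a 4-base ‘end-on’ overlap, that the last three bases of the shared 4-mer are the stop codon of the forward-strand gene and that the reverse complement of the first three bases is the stop codon of the gene on the other strand. The function names are hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG"}  # the two stop codons named in the text
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def four_base_end_on_overlaps() -> list:
    """Enumerate 4-mers that hold a stop codon on each strand."""
    hits = []
    for bases in product("ACGT", repeat=4):
        kmer = "".join(bases)
        forward_stop = kmer[1:]                      # bases 2-4, forward gene
        reverse_stop = reverse_complement(kmer[:3])  # bases 1-3, other strand
        if forward_stop in STOP_CODONS and reverse_stop in STOP_CODONS:
            hits.append(kmer)
    return hits


print(four_base_end_on_overlaps())
# ['CTAA', 'CTAG', 'TTAA', 'TTAG']
```

Under these assumptions exactly four 4-mers qualify, and they are the four column headings of Table 3.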

Out of the 314 uni-directional overlapping gene pairs (->->), 91 (29.0%) are overlapping only 1 base (Table 4). The overlapped base is either the middle ‘A’ in the sequence ‘TAATG’, which includes TAA for a stop codon of one gene and ATG for a start codon of the other, or the middle ‘G’ in ‘TAGTG’, which includes TAG for the stop codon and GTG for the start codon.
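The same kind of enumeration applies to the 1-base case. The following sketch is an illustration, not the authors' code: it considers only the start codons mentioned in the text (ATG and GTG), looks for 5-mers whose first three bases are a stop codon and whose last three bases are a start codon, and then recomputes the quoted percentage from the counts in Tables 2 and 4. All names are hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG"}
START_CODONS = {"ATG", "GTG"}  # only the start codons mentioned in the text


def one_base_overlaps() -> list:
    """Enumerate 5-mers in which a stop and a start codon share one base."""
    hits = []
    for bases in product("ACGT", repeat=5):
        kmer = "".join(bases)
        # Bases 1-3 end the upstream gene; bases 3-5 begin the downstream gene.
        if kmer[:3] in STOP_CODONS and kmer[2:] in START_CODONS:
            hits.append(kmer)
    return hits


# Counts copied from Table 4 (TAATG and TAGTG in each genome).
one_base_pairs = 34 + 3 + 52 + 2
# Uni-directional pairs in the two genomes, from Table 2.
uni_directional_pairs = 134 + 180
share = round(100 * one_base_pairs / uni_directional_pairs, 1)

print(one_base_overlaps())   # ['TAATG', 'TAGTG']
print(one_base_pairs, uni_directional_pairs, share)   # 91 314 29.0
```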



Table 6. Overlapping genes in M.genitalium




Table 7. Overlapping genes in M.pneumoniae

The cause of each case of gene overlapping in one species was inferred from the non-overlapping gene sequence in the other species, and categorized as described with examples in Figure 3a-c. It was found that such overlapping genes were generated primarily due to the loss of a stop codon of either gene, the absence of which resulted in elongation of the 3′ end of the gene’s coding region. More specifically, the loss of the stop codon occurred as a result of the following events: deletion of the stop codon (64.4%), point mutation at the stop codon (4.4%), or frame shift at the end of the coding region (6.7%). The results are summarized in Table 5.
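The percentages quoted above follow directly from the totals in Table 5. As a quick arithmetic check (an illustration only; the dictionary name is hypothetical, and the counts are copied from the ‘Total’ row of Table 5):

```python
# Totals per inferred cause, from the 'Total' row of Table 5.
CAUSE_COUNTS = {
    "deletion": 29,
    "point mutation": 2,
    "frame shift": 3,
    "unknown": 11,
}

total_cases = sum(CAUSE_COUNTS.values())
percentages = {
    cause: round(100 * count / total_cases, 1)
    for cause, count in CAUSE_COUNTS.items()
}

print(total_cases)   # 45
print(percentages)
# {'deletion': 64.4, 'point mutation': 4.4, 'frame shift': 6.7, 'unknown': 24.4}
```

The remaining 24.4% corresponds to the 11 cases whose cause could not be inferred.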

Estimated sequence error rate was reported to be <1 in 10 000 bases in M.genitalium (8). The probability of a particular stop codon being replaced due to sequence error is, thus, one in thousands. We therefore consider that sequence errors do not influence the results of our analyses.
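The order of magnitude of this estimate can be reproduced with a short calculation. The sketch below is an assumption-laden illustration: it treats the reported bound as an independent per-base error probability and counts any error within the three bases of a stop codon, which overestimates the chance of actually losing the stop (a change from TAA to TAG, for instance, would still be a stop).

```python
PER_BASE_ERROR = 1e-4  # upper bound quoted in the text: <1 error in 10 000 bases
CODON_LENGTH = 3


def codon_error_bound(per_base: float = PER_BASE_ERROR,
                      length: int = CODON_LENGTH) -> float:
    """Upper bound on P(at least one base of the codon is mis-called)."""
    return 1 - (1 - per_base) ** length


p = codon_error_bound()
print(f"bound per stop codon: {p:.2e}")
```

The bound comes out at roughly 3 in 10 000, which is consistent with the "one in thousands" figure.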

All overlapping gene pairs in M.genitalium and M.pneumoniae, and their homologous genes in the other species, length of overlapping regions, and direction of genes are listed in Tables 6 and 7. Those genes that overlap in one species but not in the other are indicated by an asterisk in the ‘remark’ column. Genes marked with ‘-’ in the column are those which we excluded from our analyses due to annotational difference or absence of homologous genes.

While there is a total of 30 ‘end-on’ overlapping gene pairs (→←), there are only five ‘head-on’ overlapping gene pairs (←→). In addition, most uni-directional overlapping genes (→→) were caused by elongation of the 3′ ends of the preceding genes, not by elongation of the 5′ ends of the subsequent genes. From these observations, we conclude that many overlapping genes were caused by elongation of the 3′ end of a coding region, nearly concomitant with the loss of its stop codon.
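How the loss of a stop codon elongates the 3′ end of a coding region into the next gene can be illustrated with a toy sequence. The sketch below is hypothetical throughout: the sequence, the gene positions and the function names are invented for illustration, indices are 0-based, and only TAA and TAG are treated as stop codons, as stated in the text. It corresponds to the point-mutation case of Figure 3b.

```python
STOP_CODONS = {"TAA", "TAG"}  # stop codons named in the text

GENE_A_START = 0   # hypothetical upstream gene
GENE_B_START = 14  # hypothetical downstream gene on the same strand


def orf_end(seq: str, start: int) -> int:
    """Index just past the first in-frame stop codon at or after start."""
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i + 3
    raise ValueError("no in-frame stop codon found")


def overlap_with_gene_b(seq: str) -> int:
    """Bases of gene B covered by gene A's coding region."""
    return max(0, orf_end(seq, GENE_A_START) - GENE_B_START)


# Gene A (ATG AAA CCC TAA), a 2-base spacer, then gene B (ATG CTA GCT AAA TAA).
toy = "ATGAAACCC" + "TAA" + "GG" + "ATGCTAGCTAAATAA"

# Point mutation T->C in gene A's stop codon (TAA becomes CAA).
mutant = toy[:9] + "C" + toy[10:]

print(orf_end(toy, GENE_A_START), overlap_with_gene_b(toy))        # 12 0
print(orf_end(mutant, GENE_A_START), overlap_with_gene_b(mutant))  # 21 7
```

In the mutant, gene A reads through to the next in-frame TAG, which lies inside gene B, so the two genes now overlap by seven bases while gene B itself is unchanged.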

There are seven cases in which gene elongation in one species has presumably occurred by more than 15 amino acids (Table 8). The FASTA search revealed that, for three of the seven cases, some elongation was also found in other bacteria. However, the elongated regions are not well conserved between the species. Furthermore, the MOTIFS search found no known motifs in the elongated regions in any of the seven cases. These results suggest that the elongated regions have little or no biologically important functional role.

Overlapping genes might have been thought of as the results of evolutionary pressure to minimize genome size. However, our analysis indicates that many overlapping genes, at least in the genomes of M.genitalium and M.pneumoniae, are due primarily to incidental elongation of coding regions.

ACKNOWLEDGEMENTS

We thank Rintaro Saito and Masahiko Wada for their support in computer programming. This work was supported in part by a Grant-in-Aid for Scientific Research on Priority Areas ‘Genome Science’ from The Ministry of Education, Science, Sports and Culture in Japan.

Table 8. Comparison with other bacteria and motifs (extended region/gene)
Genes          Homolog       Elongation   Motifs
MG029          Y967_METJA    +            0/0
MG034          KITH_BACSU    -            0/2
MG176          RS_SYNY3      -            0/1
MG248          YQFN_BACSU    +            0/0
MG364          ABRA_PLAFF    +            0/0
G07_orf215     RS6_HAEIN     -            0/1
G12_orf282a    RNC_BACSU     -            0/1

Figure 3. Causes of overlapping genes. (a) Deletion of stop codon. Deletion of a segment that includes a stop codon in one of two adjacent non-overlapping genes can result in overlapping genes. (b) Point mutation at stop codon. A stop codon (TAA or TAG) in one of two adjacent non-overlapping genes has been lost due to a point mutation, elongating the gene’s coding region and resulting in overlapping genes. (c) Frame-shift. Frame-shift mutation in coding region of one of two adjacent non-overlapping genes can cause elongation of the gene, resulting in overlapping genes.

REFERENCES

1. Chen,S.M., Takiff,H.E., Barber,A.M., Dubois,G.C., Bardwell,J.C. and Court,D.L. (1990) J. Biol. Chem., 265, 2888-2895.

2. Normark,S., Bergstrom,S., Edlund,T., Grundstrom,T., Jaurin,B., Lindberg,F.P. and Olsson,O. (1983) Annu. Rev. Genet., 17, 499-525.

3. Oppenheim,D.S. and Yanofsky,C. (1980) Genetics, 95, 785-795.

4. Ryoji,M., Berland,R. and Kaji,A. (1981) Proc. Natl Acad. Sci. USA, 78, 5973-5977.

5. Schumperli,D., McKenney,K., Sobieski,D.A. and Rosenberg,M. (1982) Cell, 30, 865-871.

6. Davies,R.W. (1980) Nucleic Acids Res., 8, 1765-1782.

7. Hoess,R.H., Foeller,C., Bidwell,K. and Landy,A. (1980) Proc. Natl Acad. Sci. USA, 77, 2482-2486.

8. Fraser,C.M., Gocayne,J.D., White,O., Adams,M.D., Clayton,R.A., Fleischmann,R.D., Bult,C.J., Kerlavage,A.R., Sutton,G., Kelley,J.M. et al. (1995) Science, 270, 397-403.

9. Himmelreich,R., Hilbert,H., Plagens,H., Pirkl,E., Li,B.C. and Herrmann,R. (1996) Nucleic Acids Res., 24, 4420-4449.

10. Himmelreich,R., Plagens,H., Hilbert,H., Reiner,B. and Herrmann,R. (1997) Nucleic Acids Res., 25, 701-712.

11. Higgins,D.G., Bleasby,A.J. and Fuchs,R. (1992) CABIOS, 8, 189-191.

12. Pearson,W.R. and Lipman,D.J. (1988) Proc. Natl Acad. Sci. USA, 85, 2444-2448.

13. Henikoff,S. and Henikoff,J.G. (1992) Proc. Natl Acad. Sci. USA, 89, 1091.


*To whom correspondence should be addressed. Tel: +81 466 47 5111; Fax: +81 3 3440 7281; Email: mt@sfc.keio.ac.jp


This page is run by Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, as part of the OUP Journals
Comments and feedback: www-admin{at}oup.co.uk
Last modification: 26 Mar 1999
Copyright©Oxford University Press, 1999.
