© 1996 Oxford University Press 3756-3763

Identification of new RNA modifying enzymes by iterative genome search using known modifying enzymes as probes

Claes Gustafsson, Ralph Reid 1, Patricia J. Greene and Daniel V. Santi*

Departments of Pharmaceutical Chemistry and of Biochemistry and Biophysics, University of California, San Francisco, CA 94143-0448, USA and 1 Biomolecular Resource Center, University of California, San Francisco, CA 94143-0541, USA

Received June 14, 1996; Revised and Accepted August 12, 1996

ABSTRACT

The complete nucleotide sequences of the Haemophilus influenzae and Mycoplasma genitalium genomes and the partially sequenced Escherichia coli chromosome were analyzed to identify open reading frames (ORFs) likely to encode RNA modifying enzymes. The protein sequences of known RNA modifying enzymes from three families (m 5 U methyltransferases, [Psi] synthases and 2'- O methyltransferases) were used as probes to search sequence databases for homologs. ORFs identified as homologous to the initial probes were retrieved and used as new probes against the databases in an iterative manner until no more homologous ORFs could be identified. Using this approach, we have identified two new m 5 U methyltransferases, seven new [Psi] synthases and four new 2'- O methyltransferases in E.coli. Many of the ORFs found in E.coli have direct genetic counterparts (orthologs) in one or both of H.influenzae and M.genitalium. Since there is a near-complete knowledge of RNA modifications in E.coli, functional activities of the proteins encoded by the identified ORFs were proposed based on the level of conservation of the ORFs and the modified nucleotides.

INTRODUCTION

As of April 1996, high-throughput genomic sequencing has provided hundreds of viral genome sequences, ~20 organellar sequences and the complete nucleotide sequences of two free-living organisms: Haemophilus influenzae (1.8 Mbp; 1) and Mycoplasma genitalium (0.6 Mbp; 2). Also, 74% of the Escherichia coli chromosome (4.7 Mbp) has been reported in the E.coli database collection release 25 (January 1996; 3). The numerous open reading frames (ORFs) now identified bring the genome projects to the next level of analysis: identifying the functions of the uncharacterized ORFs. A practical first approach towards this objective is to predict function through homology comparisons of known proteins to uncharacterized ORFs. Indeed, such comparisons of E.coli ORFs have led to assignments of many general functions (4).

RNA modifications have been well characterized in E.coli . Mature RNA contains many modified nucleotides, of which three are 5-methyluridines (m 5 U), seventeen are pseudouridines (5-ribosyluridine, [Psi]) and seven are 2'- O methylated nucleosides (Nm, where N denotes A, C, G or U) (reviewed in 5 ). Six of the enzymes that catalyze formation of the m 5 U, [Psi] and Nm nucleotide modifications have been identified in E.coli .

In this paper, we attempt to identify ORFs likely to encode RNA modifying enzymes. We used the amino acid sequences of eight known RNA modifying enzymes, six from E.coli and one each from Streptomyces azureus and Saccharomyces cerevisiae, as probes to search the databases for homologous ORFs. The probes represent enzymes that catalyze three types of RNA modification: m 5 U methyltransferases, pseudouridine synthases and 2'- O methyltransferases. By iterative homology searches, we have identified ORFs in E.coli, H.influenzae and M.genitalium likely to encode enzymes with similar function. These ORFs, together with knowledge of RNA modifications in E.coli, allowed us to predict specific substrates for many of the ORF-encoded proteins. In E.coli, eleven ORFs that are likely to code for RNA modifying enzymes were found in addition to the six previously characterized. These seventeen enzymes could account for most or all of the three m 5 Us, seventeen [Psi]s and seven Nms present in E.coli rRNA and tRNA. We also assigned the direct genetic counterparts, or orthologs, of the ORFs found in E.coli to ORFs present in H.influenzae and, where applicable, M.genitalium.

The genomic search procedure described here exploited the following information: (i) knowledge of a set of related end products, the production of which requires a set of potentially related unknown enzymes, and (ii) the amino acid sequence of one or more related enzymes to use as initial probes. This procedure should be generally applicable to other situations that meet similar criteria.

MATERIALS AND METHODS

Amino acid sequences of proteins of known function (Table 1) were used as probes in searches for homologous sequences present in the GenBank database (National Center for Biotechnology Information), the SwissProt database (EBI EMBL) and The Institute for Genomic Research (TIGR) databases for Haemophilus influenzae Rd and Mycoplasma genitalium. The initial searches were carried out using the BLAST program (6) or the GRASTA program [a modification of FASTA (7)]. These programs are part of the network services provided by GenBank and TIGR, respectively. ORFs having a probability of an accidental match of 10^-4 or less to the respective probe were retrieved from the databases and further analyzed.
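The retrieval cutoff described above can be illustrated with a minimal sketch. The `Hit` record, its field names, the ORF names and the probability values are hypothetical placeholders, not the output format of BLAST or GRASTA; only the 10^-4 cutoff comes from the text.

```python
from dataclasses import dataclass

# Cutoff quoted in the text: probability of an accidental match of 10^-4 or less.
MAX_CHANCE_PROBABILITY = 1e-4


@dataclass(frozen=True)
class Hit:
    """One database match; the fields are illustrative, not a real BLAST schema."""
    probe: str
    orf: str
    probability: float  # reported probability of an accidental match


def retain_hits(hits, cutoff=MAX_CHANCE_PROBABILITY):
    """Keep hits at or below the cutoff, smallest probability first."""
    kept = [h for h in hits if h.probability <= cutoff]
    return sorted(kept, key=lambda h: h.probability)


# Placeholder ORF names and values, chosen only to exercise the filter.
example_hits = [
    Hit("trmA", "ORF_A", 1e-7),
    Hit("trmA", "ORF_B", 3e-2),
    Hit("trmA", "ORF_C", 1e-4),
]
retained = [h.orf for h in retain_hits(example_hits)]
```

Here `retained` holds the ORFs that would go forward to the alignment step; a hit exactly at the cutoff is kept because the criterion is "10^-4 or less".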

Table 1. Genes encoding known RNA modifying enzymes used as probes in the database searches

| Organism | Gene | Activity a | Acc. number b |
| --- | --- | --- | --- |
| E.coli | trmA | tRNA U54 -> m 5 U | P23003 |
| E.coli | truA | tRNA U38, U39, U40 -> [Psi] | P07649 |
| E.coli | truB | tRNA U55 -> [Psi] | P09171 |
| E.coli | rluA | 23S U746, tRNA U32 -> [Psi] | P39219 |
| E.coli | rsuA | 16S U516 -> [Psi] | P33918 |
| E.coli | spoU | tRNA G18 -> Gm | P19396 |
| St.azureus c | tsr | 23S A1067 -> Am | P18644 |
| Sa.cerevisiae c | PET56 d | 23S G2251 -> Gm | 431760 |

a Numbering and designation of RNA according to E.coli. b SwissProt accession number. c St. denotes Streptomyces and Sa. denotes Saccharomyces. d The PET56 protein methylates the yeast mitochondrial 23S equivalent.

Detailed analysis of the ORFs retrieved in the GRASTA and BLAST searches was performed using sequence analysis programs that are part of the Wisconsin sequence analysis package (8). PILEUP was used to create multiple sequence alignments of related sequences. LINEUP was used as a screen editor to edit the multiple sequence alignments, and PRETTY was used to display the sequence alignments. The levels of similarity and identity between the probe used and the identified sequences were determined using the GAP program. The gap creation penalty was set to 3.0 and the gap extension penalty was set to 0.1. Before the final sequence analysis, the compiled amino acid sequences were truncated uniformly at the C- and N-terminal ends to provide sequences of similar overall length as shown in the figures (Figs 2, 3 and 4). The PILEUP program was also used to generate and plot a graph drawn in unrooted tree format (dendrogram) which shows the clustering relationships used to create the alignments.
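To make the role of the two gap penalties concrete, the following is a minimal score-only sketch of global alignment with affine gaps (the Gotoh form of the Needleman-Wunsch recurrences). It is not the GAP program: the match and mismatch scores are placeholders standing in for an amino acid comparison table, and the convention that a gap of length k costs creation + k × extension is an assumption, since programs differ on this detail.

```python
def affine_global_score(a, b, match=1.5, mismatch=-0.5,
                        gap_open=3.0, gap_extend=0.1):
    """Best global alignment score with affine gap costs.

    Assumed convention: a gap of length k costs gap_open + k * gap_extend.
    match/mismatch are placeholder scores, not a real comparison table.
    """
    neg = float("-inf")
    n, m = len(a), len(b)
    # M: a[i-1] aligned with b[j-1]
    # X: a[i-1] aligned with a gap; Y: b[j-1] aligned with a gap
    M = [[neg] * (m + 1) for _ in range(n + 1)]
    X = [[neg] * (m + 1) for _ in range(n + 1)]
    Y = [[neg] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + i * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + j * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Opening a new gap pays creation plus one extension unit;
            # continuing an existing gap pays only the extension unit.
            X[i][j] = max(M[i - 1][j] - gap_open - gap_extend,
                          Y[i - 1][j] - gap_open - gap_extend,
                          X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open - gap_extend,
                          X[i][j - 1] - gap_open - gap_extend,
                          Y[i][j - 1] - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


identical = affine_global_score("ACDE", "ACDE")
one_gap = affine_global_score("ACDEF", "ACF")
```

With a creation penalty much larger than the extension penalty, as in the settings quoted above, one long gap is far cheaper than several short ones, so the aligner tends to keep insertions and deletions contiguous.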

The multiple sequence alignments were used to identify conserved amino acid sequence motifs. ORFs having the conserved motifs were considered true homologs and were used as probes in an iterative manner, i.e. the identified ORFs were used to search the databases for additional homologs using the same criteria as with the initial probes. As more ORFs were added to the separate sequence alignments (one alignment for each family of enzymes), the definition of the conserved sequence motifs became more precise. The motifs were further refined by aligning homologous sequences found in the GenBank database from organisms other than E.coli, H.influenzae and M.genitalium. These sequences, many of which are incomplete ORFs, have not been included in the alignment figures of this paper.
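The iterative procedure amounts to computing the closure of the probe set under a "find homologs" step. A minimal sketch, assuming a caller-supplied function `find_homologs` (hypothetical) that returns only ORFs passing the acceptance criteria (score cutoff plus conserved motifs); the toy relationships below are placeholders, not results from this study.

```python
from collections import deque


def iterative_search(initial_probes, find_homologs):
    """Expand a probe set until a search round yields no new ORFs.

    find_homologs(probe) is a hypothetical callback returning the ORFs
    accepted as true homologs of the given probe.
    """
    family = set(initial_probes)
    queue = deque(initial_probes)
    while queue:
        probe = queue.popleft()
        for orf in find_homologs(probe):
            if orf not in family:
                family.add(orf)    # newly found homolog joins the family
                queue.append(orf)  # and is used as a probe in a later round
    return family


# Toy homology relationships (placeholders only).
toy_hits = {
    "probe1": ["orfA", "orfB"],
    "orfA": ["probe1", "orfC"],
    "orfB": [],
    "orfC": ["orfA"],
}
found = iterative_search(["probe1"], lambda p: toy_hits.get(p, []))
```

In this toy case `orfC` is reached only through `orfA`, which is the situation the iteration is designed to catch: a distant family member that the initial probe alone would miss.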

Escherichia coli and H.influenzae ORFs were considered as orthologous gene pairs when the ORFs paired together strongly in the dendrogram generated by PILEUP, with branch points well separated from the next most distant branch points. In the comparisons presented here, the strongly paired ORFs had an identity score >30% and a similarity score >50% as determined by the GAP program. E.coli and M.genitalium ORFs were considered as orthologous gene pairs when the ORFs paired together strongly in the dendrogram and had an identity score >20% and a similarity score >40%. In this context, identity means having the same amino acid at the same position, whereas similarity means having a similar amino acid, as defined by the GAP program, at the same position in the two ORFs. The identity and similarity analyses were based on the truncated ORFs as noted above.
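The numerical part of these criteria can be sketched as follows. Two points are assumptions: the similarity groups are borrowed from the consensus abbreviations used in the figure legends rather than from the GAP comparison table, and percentages are taken over aligned (ungapped) positions. The dendrogram-pairing criterion is not modeled, and the two aligned sequences are invented.

```python
# Assumed similarity groups (from the figure-legend abbreviations, not GAP).
SIMILAR_GROUPS = ["EDQN", "VLIM", "GPA", "ST", "KR", "FYW"]

# (identity %, similarity %) that must both be exceeded, per the text.
THRESHOLDS = {
    "H.influenzae": (30.0, 50.0),
    "M.genitalium": (20.0, 40.0),
}


def _similar(a, b):
    """True if residues are identical or share an assumed similarity group."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def identity_similarity(aligned_a, aligned_b):
    """Percent identity and similarity over positions without gaps ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0, 0.0
    identical = sum(a == b for a, b in pairs)
    similar = sum(_similar(a, b) for a, b in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)


def passes_score_criteria(identity, similarity, organism):
    """Check the score thresholds only; dendrogram pairing is not modeled."""
    min_identity, min_similarity = THRESHOLDS[organism]
    return identity > min_identity and similarity > min_similarity


# Invented aligned fragments, used only to exercise the functions.
identity, similarity = identity_similarity("MKV-LSTE", "MRVALTTD")
```

A pair scoring 25% identity and 45% similarity would fail the H.influenzae thresholds but pass the more permissive M.genitalium ones, which reflects the greater evolutionary distance the text allows for.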

All rRNA designations and nucleotide numbering reflect the E.coli equivalent rRNA and nucleotide position respectively.

RESULTS AND DISCUSSION


Figure 1. Dendrograms representing the relationship of gene homologs. Distance along the linear axis of the dendrograms is proportional to the difference between sequences. (A) Dendrogram representing the relationship of trmA homologs found in E.coli and H.influenzae. (B) Dendrogram representing the relationship of rluA and rsuA homologs found in E.coli, H.influenzae and M.genitalium. (C) Dendrogram representing the relationship of spoU, tsr and PET56 homologs found in E.coli, H.influenzae and M.genitalium.

We have searched genomic databases in an iterative manner using as initial probes the deduced amino acid sequences of eight genes encoding RNA modifying enzymes (Table 1). First we searched the H.influenzae and M.genitalium databases for homologous ORFs and retrieved sequences having a probability of an accidental match of 10^-4 or less to the probe. The retrieved ORFs were aligned with the probe(s), analyzed for the presence of conserved motifs and subsequently used again as probes against the GenBank database to find additional homologous ORFs. This was repeated until no more homologous ORFs could be identified. ORFs were considered as orthologous gene pairs (i.e. encoding 'the same protein' in different organisms) when the ORFs (i) paired together in a dendrogram, (ii) shared conserved motifs, and (iii) showed homology as defined in the Materials and Methods section.

m 5 U Methyltransferases

The single m 5 U methyltransferase characterized to date (the TrmA protein encoded by the trmA gene) (9) methylates U54 of all E.coli tRNAs. We used the deduced amino acid sequence of trmA (10) to identify other ORFs that are likely to encode additional m 5 U methyltransferases. A homology search of the H.influenzae and M.genitalium genome databases employing the GRASTA program identified three ORFs in the H.influenzae genome as trmA homologs with probabilities of an accidental match of 10^-7 or less (Table 2). The minimal genome of M.genitalium did not encode any trmA homologs; notably, Mycobacteria sp. are the only eubacterial organisms which do not have m 5 U54 in the tRNA (11). One ORF of H.influenzae, HI848, was identified as the ortholog of trmA. The other trmA homologs identified in H.influenzae were HI333 and HI958. Reverse searching of the GenBank database using HI333 and HI958 as probes identified apparent orthologs for each in E.coli. The ortholog of HI333 is the E.coli ORF ygcA, and the ortholog of HI958 is a C-terminal portion of an E.coli ORF which we designated Cter. The presence of an AdoMet binding site in ygcA has previously been noted (4). The three pairs of orthologs from E.coli and H.influenzae within the m 5 U methyltransferase family are displayed as a dendrogram in Figure 1A.


Figure 2. Alignment of trmA homologs found in E.coli and H.influenzae. The 5' half of the DNA sequence encoding Cter is not currently available, therefore only the C-terminal part was used in the alignment. Otherwise, the sequences shown have been truncated at the N- and C-terminal ends to produce ORFs of uniform length. These are the truncated sequences used in the sequence analysis. Residues were considered as consensus only when present in five of the five sequences. The numbering of residues reflects the trmA sequence. Consensus abbreviations: h, hydrophobic; 1, EDQN; 2, VLIM; 3, GPA; 4, ST; 5, KR; 6, FYW.

Figure 3. Alignment of rluA and rsuA homologs found in E.coli, H.influenzae and M.genitalium. The C-terminal part of ORF HI1435 has been deleted due to poor nucleotide sequence in the corresponding database entry. Otherwise, the sequences shown have been truncated at the N- and C-terminal ends to produce ORFs of uniform length. These are the truncated sequences used in the sequence analysis. The rluA and the rsuA subfamilies are found on the upper and lower half of the alignment, respectively. Residues were considered as consensus only when present in 15 of the 17 sequences. The numbering of residues reflects the rluA sequence. Consensus abbreviations as above.

Figure 4. Alignment of spoU, tsr and PET56 homologs found in E.coli, H.influenzae and M.genitalium. The sequences shown have been truncated at the N- and C-terminal ends to produce ORFs of uniform length. These are the truncated sequences used in the sequence analysis. Residues were considered as consensus only when present in 10 of the 11 sequences. The numbering of residues reflects the spoU sequence. Consensus abbreviations as above.
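The consensus rule stated in the figure legends (a residue counts as consensus only when present in at least a given number of the aligned sequences) can be sketched as below. This simplified version handles literal residues only and does not collapse residues into the class symbols (h, 1 to 6) used in the figures; the toy alignment is invented.

```python
from collections import Counter


def consensus(aligned_seqs, min_count):
    """Return a consensus string; '.' marks columns below the threshold.

    Gap characters ('-') never count towards a consensus residue.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")
    out = []
    for column in zip(*aligned_seqs):
        counts = Counter(c for c in column if c != "-")
        if counts:
            residue, count = counts.most_common(1)[0]
            out.append(residue if count >= min_count else ".")
        else:
            out.append(".")  # column made up entirely of gaps
    return "".join(out)


# Invented four-sequence alignment, used only to exercise the rule.
toy_alignment = ["GCGAG", "GCGSG", "GCG-G", "ACGTG"]
relaxed = consensus(toy_alignment, min_count=3)
strict = consensus(toy_alignment, min_count=4)
```

Raising the threshold from three of four to four of four drops the first column from the consensus, which mirrors how a stricter rule such as "15 of the 17 sequences" trades sensitivity for confidence.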



Table 2. RNA m 5 U methyltransferases

| E.coli gene | E.coli accession number | H.influenzae gene a | Identity b (%) | Similarity b (%) | Function | Activity |
| --- | --- | --- | --- | --- | --- | --- |
| trmA (probe) | P23003 | HI848 | 64 | 78 | known | tRNA U54 -> m 5 U |
| ygcA | U29580 | HI333 | 47 | 66 | predicted | 23S U747 -> m 5 U or 23S U1939 -> m 5 U |
| Cter c | X69108 | HI958 | (52) | (74) | predicted | 23S U747 -> m 5 U or 23S U1939 -> m 5 U |

a The H.influenzae ORFs are denoted according to Fleischmann et al. (1). b The percentages of identity and similarity of the corresponding amino acid sequence to the E.coli ortholog are calculated using the GAP program from the GCG package, which aligns sequences using the Needleman-Wunsch algorithm. The ORFs were truncated at the N- and C-terminal ends as shown in Figure 2 before determining the identity and similarity. c The 5' half of the DNA sequence encoding Cter is not currently available, therefore only the C-terminal part was aligned. The identity and similarity values thus only reflect the C-terminal part of the alignment.

The alignment of the deduced amino acid sequence of trmA and its five homologs showed four conserved sequence motifs (Fig. 2). Motif I, 2-h-1-L-6-C-G-x-G-x-F-x-2-x-h-A-x10-E (abbreviations used: h, hydrophobic; 1, EDQN; 2, VLIM; 3, GPA; 4, ST; 5, KR; 6, FYW; x, any residue; x10, ten consecutive residues of any type), shares considerable homology with the established consensus of the S-adenosyl-L-methionine (AdoMet) binding motif identified in other methylases that use this cofactor (12). This motif contributes directly to the binding pocket for AdoMet in the three-dimensional structure of DNA m 5 C methyltransferase HhaI (13). Conserved motif III (I-2-Y-x-S-C-N-3-x-T-2) contains the catalytic cysteine residue of the TrmA protein, which forms a covalent adduct to C6 of U during catalysis (14). Two additional conserved motifs, II and IV, were found. Motif II (1-x-2-h-2-1-P-3-R-x-G) is located directly upstream from the catalytic cysteine motif, and motif IV (2-h-D-x-F-P-x-T-x-H-h-E) reflects the extensive amino acid sequence similarity at the C-terminus. The presence of conserved motifs I-IV among the trmA homologs supports the prediction that the ORFs identified encode m 5 U methyltransferases.
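Motifs written in this shorthand can be translated mechanically into regular expressions for scanning candidate ORFs. The sketch below is an illustration rather than the procedure used in this study. The residue set behind "h" (hydrophobic) is an assumption, since the text does not list it, and a run of n arbitrary residues is written here as xn (for example x10). The test peptide is invented.

```python
import re

CLASS_TO_SET = {
    "h": "AVLIMFWYC",  # assumption: the hydrophobic set is not given in the text
    "1": "EDQN",
    "2": "VLIM",
    "3": "GPA",
    "4": "ST",
    "5": "KR",
    "6": "FYW",
}


def motif_to_regex(motif):
    """Compile a dash-separated motif such as 'I-2-Y-x-S-C-N-3-x-T-2'."""
    parts = []
    for token in motif.split("-"):
        token = token.strip()
        if token == "x":
            parts.append(".")                              # any one residue
        elif token.startswith("x"):
            parts.append(".{%d}" % int(token[1:]))         # run of any residues
        elif token in CLASS_TO_SET:
            parts.append("[" + CLASS_TO_SET[token] + "]")  # residue class
        elif len(token) == 1 and token.isupper():
            parts.append(token)                            # literal residue
        else:
            raise ValueError("unrecognized motif token: %r" % token)
    return re.compile("".join(parts))


MOTIF_III = motif_to_regex("I-2-Y-x-S-C-N-3-x-T-2")
MOTIF_I = motif_to_regex("2-h-1-L-6-C-G-x-G-x-F-x-2-x-h-A-x10-E")

# Invented peptide, used only to show a match.
match = MOTIF_III.search("MKILYASCNPQTLGG")
matched_segment = match.group(0) if match else None
```

A scan with `MOTIF_III.search(sequence)` over each candidate ORF would flag sequences carrying the catalytic cysteine signature; whether a hit is biologically meaningful still depends on the alignment context.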

Other than the TrmA-dependent m 5 U54 in tRNA, the only m 5 Us that have been found in E.coli RNA are m 5 U747 and m 5 U1939 in 23S rRNA. The methyltransferases that catalyze these latter two modifications have not been identified. Since we have identified only two other ORFs in E.coli that code for m 5 U methyltransferases, we believe that the E.coli ORFs ygcA and Cter encode the two putative 23S rRNA m 5 U methyltransferases. Although the presence of these modifications has not been established in H.influenzae, we further believe that the HI333 and HI958 gene products catalyze the corresponding modifications in H.influenzae.

[Psi] Synthases

Four E.coli [Psi] synthases have been identified (Table 1 ). The product of the truA gene, TruA (also known as HisT or [Psi] synthase I), converts U residues to [Psi] in the anticodon arm of some tRNAs ( 15 ). The truB gene product, TruB, forms [Psi] at U55 in the T-arm of all E.coli tRNAs ( 16 ). The product of the rsuA gene, RsuA, introduces the only [Psi] found in 16S rRNA ( 17 ). The product of the rluA gene, RluA, has two enzymatic activities; it catalyzes [Psi] formation at U746 in domain II of 23S rRNA and also catalyzes [Psi] formation at U32 in some tRNAs ( 18 ).

The amino acid sequences of the four known [Psi] synthase genes were used as probes for iterative searching of the genome sequences of E.coli , H.influenzae and M.genitalium . The search using the truA and truB probes identified an ortholog for each in H.influenzae , and an ortholog for truA , but not truB , in M.genitalium ; the search did not identify any non-orthologous homologous ORFs. Searches using the rluA and rsuA probes yielded two families of homologs, one to each probe. A distant, but distinct homology exists between the rluA and rsuA families (Figs 1 B and 3 ).

Table 3. RNA [Psi] synthases

| E.coli gene | E.coli acc. no. | H.influenzae gene a | Identity b (%) | Similarity b (%) | M.genitalium gene c | Identity b (%) | Similarity b (%) | Function | Activity |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| rsuA (probe) | P33918 | HI1243 | 58 | 74 | - | | | known | 16S U516 -> [Psi] |
| yciL | P37765 | HI1199 | 72 | 82 | - | | | predicted | tRNA or 23S U -> [Psi] |
| yjbC d | P32684 | - | | | - | | | predicted | tRNA or 23S U -> [Psi] |
| | | HI694 | | | | | | predicted | tRNA or 23S U -> [Psi] |
| rluA (probe) | P39219 | HI617 | 62 | 75 | - | | | known | 23S U746, tRNA U32 -> [Psi] |
| yfiI | P33643 | HI176 | 72 | 85 | MG370 d | (26) | (51) | predicted | 23S U2580 or 23S U(s) domain IV -> [Psi](s) |
| yceC | P23851 | HI412 | 72 | 85 | MG209 d | (30) | (55) | predicted | 23S U2580 or 23S U(s) domain IV -> [Psi](s) |
| f260 | CO05945 | HI1435 | 59 | 70 | - | | | predicted | tRNA or 23S U -> [Psi] |
| | | HI42 | | | - | | | predicted | tRNA or 23S U -> [Psi] |

a The H.influenzae ORFs are denoted according to Fleischmann et al. (1). b The percentages of identity and similarity of the corresponding amino acid sequence to the E.coli ortholog are calculated using the GAP program from the GCG package, which aligns sequences using the Needleman-Wunsch algorithm. The ORFs were truncated at the N- and C-terminal ends as shown in Figure 2 before determining the identity and similarity. c The M.genitalium ORFs are denoted according to (2). d The assignment of ORFs MG209 and MG370 as orthologs of yceC and yfiI is speculative (see text).

Table 4. 2'- O methyltransferases

| E.coli gene | E.coli acc. no. | H.influenzae gene a | Identity b (%) | Similarity b (%) | M.genitalium gene c | Identity b (%) | Similarity b (%) | Function | Activity |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| spoU (probe) | P19396 | - | | | - | | | known | tRNA G18 -> Gm |
| yfiF | P33635 | HI424 | 38 | 60 | - | | | predicted | tRNA, 16S or 23S N -> Nm |
| yjfH | P39290 | HI860 | 72 | 85 | MG252 | 27 | 47 | predicted | 23S G2251 -> Gm |
| yibK | P33899 | HI766 | 76 | 83 | MG346 | 39 | 60 | predicted | 23S U2552 -> Um |
| lasT | P37005 | HI380 | 32 | 52 | - | | | predicted | tRNA, 16S or 23S N -> Nm |

a The H.influenzae ORFs are denoted according to Fleischmann et al. (1). b The percentages of identity and similarity of the corresponding amino acid sequence to the E.coli ortholog are calculated using the GAP program from the GCG package, which aligns sequences using the Needleman-Wunsch algorithm. The ORFs were truncated at the N- and C-terminal ends as shown in Figure 2 before determining the identity and similarity. c The M.genitalium ORFs are denoted according to Fraser et al. (2).

The rluA subfamily has five homologs in the completely sequenced H.influenzae genome, four of which have orthologs in E.coli. Although an E.coli ortholog of the fifth rluA homolog found in H.influenzae has not been found, it may exist in the 26% of the E.coli chromosome that remains to be sequenced. Two rluA homologs were identified in M.genitalium: MG209 and MG370. The ORF MG209 is the more conserved of these two and is slightly more closely related to the E.coli ORF yceC than to the other rluA homologous ORFs (Figs 1B and 3; Table 3); however, the sequence conservation is not strong enough to assign clear orthology.

The rsuA subfamily has three homologs in E.coli, three in H.influenzae and none in M.genitalium. Two of the E.coli ORFs, yciL and rsuA, have apparent orthologs in H.influenzae. The remaining rsuA homologs in E.coli (yjbC) and in H.influenzae (HI42) are not orthologous to each other and have no apparent orthologs in other sequences examined here (Figs 1B and 3; Table 3). The similarity of yceC, yfiI and yjbC to rluA and rsuA has previously been noted (4).

An alignment was made of the amino acid sequences of the six rsuA homologs from E.coli and H.influenzae and the eleven rluA homologs from E.coli, H.influenzae and M.genitalium. Upon aligning the two subgroups of [Psi] synthases, three conserved sequence motifs (motif I: 1-K-P-x3-2, motif II: R-L-D-x2-T-x-G-2-2-2-h and motif III: G-5-x2-1-2-R) were found in both sets of [Psi] synthases (Fig. 3).

A total of 17 [Psi]s are known to be present in E.coli RNA. Mature tRNA has seven [Psi] nucleotides. Three enzymes (RluA, TruA and TruB), which together catalyze five of the seven tRNA modifications, have been characterized (Table 1). The enzymes catalyzing the two remaining modifications in tRNA have not been identified. 16S rRNA has a single [Psi] at nucleotide 516, which is formed by RsuA (17). 23S rRNA has nine [Psi]s which, with one exception ([Psi]955), are located at the peptidyl transferase center (19). RluA catalyzes formation of [Psi]746, but enzymes for the eight remaining [Psi]s in 23S rRNA ([Psi]955, [Psi]1911, m 3 [Psi]1915, [Psi]1917, [Psi]2457, [Psi]2504, [Psi]2580 and [Psi]2605) have not been identified. Thus, there are a total of ten [Psi] nucleotides for which the modifying enzymes have not been identified. We have identified five new ORFs in E.coli predicted to encode RNA [Psi] synthases. The difference in numbers (ten [Psi] nucleotides versus five [Psi] synthases) may be explained by (a) enzymes having multiple substrates, as with TruA and RluA, (b) E.coli genes not yet sequenced, or (c) genes not related to the major rsuA/rluA branch of [Psi] synthases, as is the case for truA and truB.
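The bookkeeping in this paragraph can be checked with a short tally. All counts are taken from the text; the dictionary layout and variable names are illustrative only.

```python
# Pseudouridine sites per RNA class, as stated in the text.
PSI_SITES = {"tRNA": 7, "16S rRNA": 1, "23S rRNA": 9}

# Sites already attributed to characterized enzymes:
# tRNA: RluA (U32), TruA (U38, U39, U40), TruB (U55); 16S: RsuA; 23S: RluA (U746).
ASSIGNED_SITES = {"tRNA": 5, "16S rRNA": 1, "23S rRNA": 1}

# New E.coli ORFs predicted in this study to encode [Psi] synthases.
NEW_PREDICTED_SYNTHASES = 5

total_sites = sum(PSI_SITES.values())
unassigned = {rna: PSI_SITES[rna] - ASSIGNED_SITES[rna] for rna in PSI_SITES}
total_unassigned = sum(unassigned.values())

# Sites that cannot be covered one-to-one by the newly predicted ORFs.
shortfall = total_unassigned - NEW_PREDICTED_SYNTHASES
```

The shortfall of five sites is what the three explanations (a) to (c) in the paragraph are meant to account for.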

We assume that ortholog pairs of enzymes from E.coli and H.influenzae that have the highest homology, and that also have close homologs in M.genitalium, will catalyze formation of the modified nucleotides present at the most conserved locations. The most conserved [Psi] nucleotides found in ribosomal RNA are [Psi]2580, located in domain V of 23S rRNA, and two [Psi]s clustered in domain IV ([Psi]1915 and [Psi]1917) (20, 21). Thus, we predict that one of the two E.coli ORFs with the most conserved orthologs (yceC and yfiI) encodes the enzyme that catalyzes formation of [Psi]2580 and that the other encodes the enzyme that catalyzes formation of one or both of the conserved [Psi]1915 and [Psi]1917 in domain IV (Table 3). It is noteworthy that TruA can modify up to three closely spaced Us in the anticodon arm of some tRNAs, thus providing precedent for multiple [Psi] modifications at closely spaced positions by a single enzyme.

2'- O Methyltransferases

The sequences of three genes encoding 2'- O methyltransferases are available: spoU in E.coli, encoding the tRNA Gm18 methyltransferase (C. Gustafsson, unpublished); tsr in Streptomyces azureus, encoding the thiostrepton resistance marker 23S rRNA Am1067 methyltransferase (22); and PET56 in yeast, encoding the 23S rRNA Gm2251 methyltransferase (23) (Table 1). The amino acid sequences of these three gene products were used as probes for iterative searching of the genome sequences of E.coli, H.influenzae and M.genitalium to identify other ORFs encoding enzymes that catalyze methylation of the 2' hydroxyl of the ribose in RNA.

Ten previously uncharacterized ORFs were found in the search which, after assignment of orthologous pairs, corresponded to four new 2'- O methyltransferases. The four previously uncharacterized ORFs in E.coli all had orthologs in H.influenzae, and two of them (yjfH and yibK) also had orthologs in M.genitalium. The two ORFs in M.genitalium were orthologs of the two most homologous gene pairs in E.coli and H.influenzae. The probe spoU did not have an ortholog in either H.influenzae or M.genitalium (Table 4; Fig. 1C).

An alignment made of the three known and four newly identified 2'- O methyltransferases revealed three motifs found in all of the ORFs (Fig. 4). One of the motifs, motif II (h-2-h-G-x-E-x2-G-2), consists of a series of bulky aliphatic amino acid residues followed by two conserved glycines, resembling an AdoMet binding motif (12). Two additional conserved motifs were found, motif I (3-x-N-x-G-x3-R) located at the N-terminus of the sequences and motif III (2-P-x6-S-2-N-2) located at the C-terminus (Fig. 4).

A total of seven 2'- O modified nucleotides have been found in E.coli RNA. One is in 16S rRNA (m 4 Cm1402) and three are in 23S rRNA (Gm2251, Cm2498 and Um2552). There are also three 2'- O modified nucleotides in tRNA; however, two of the three, Um32 and Cm32, are both pyrimidine nucleotides and occur at the same position in different tRNAs, and are likely to be catalyzed by the same enzyme. Thus, we propose that there are six 2'- O methyltransferases in E.coli which catalyze the seven RNA modifications. One, spoU , has been previously identified and we have here identified four previously uncharacterized ORFs as putative 2'- O methyltransferases. The remaining ORF may be (a) in the part of the E.coli genome not sequenced yet, (b) attributed to one enzyme having multiple target substrates, or (c) part of another 2'- O methyltransferase family. The similarity of lasT , yibK and yfiF to spoU has previously been noted ( 4 ).

Since the two most highly conserved putative 2'- O methyltransferases, yibK and yjfH, are the only ORFs within this family present in M.genitalium, we suggest they encode the enzymes catalyzing the 2'- O methylations 23S Gm2251 and Um2552, which are the only modified nucleotides found in all organisms so far analyzed (20). Since the yjfH ortholog set is phylogenetically more closely related to the guanosine methylase spoU (Fig. 1C), it probably encodes a guanosine methylase. Thus, we propose that yjfH encodes the 23S rRNA Gm2251 methyltransferase and that yibK consequently encodes the 23S rRNA Um2552 methyltransferase, as listed in Table 4.

We are currently testing the functional predictions described in this paper experimentally. So far, we have cloned and expressed three of the E.coli ORFs described above: ygcA, yceC and yfiF. Although the specific bases modified have not yet been identified, we have determined that each of the three enzymes encoded by these ORFs does indeed catalyze the formation of the predicted RNA modifications.

ACKNOWLEDGEMENT

This work was supported by grant GM-51232 from the National Institutes of Health.

REFERENCES

1 Fleischmann,R.D., Adams,M.D., White,O., Clayton,R.A., Kirkness,E.F., Kerlavage,A.R., Bult,C.J., Tomb,J.F., Dougherty,B.A., Merrick,J.M. et al. (1995) Science, 269, 496-512.

2 Fraser,C.M., Gocayne,J.D., White,O., Adams,M.D., Clayton,R.A., Fleischmann,R.D., Bult,C.J., Kerlavage,A.R., Sutton,G., Kelley,J.M. et al. (1995) Science, 270, 397-403.

3 Wahl,R., Rice,P., Rice,C.M. and Kröger,M. (1994) Nucleic Acids Res., 22, 3450-3455.

4 Koonin,E.V., Tatusov,R.L. and Rudd,K.E. (1995) Proc. Natl Acad. Sci. USA, 92, 11921-11925.

5 Björk,G.R. (1996) In Neidhardt,F.C. (ed.), Escherichia coli and Salmonella. Cellular and Molecular Biology. American Society for Microbiology, Washington, DC, pp. 861-886.

6 Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) J. Mol. Biol., 215, 403-410.

7 Pearson,W.R. and Lipman,D.J. (1988) Proc. Natl Acad. Sci. USA, 85, 2444-2448.

8 Genetics Computer Group, (1994). Version 8, Madison, Wisconsin, USA.

9 Björk,G.R. and Isaksson,L.A. (1970) J. Mol. Biol., 51, 83-100.

10 Gustafsson,C., Lindström,P.H., Hagervall,T.G., Esberg,K.B. and Björk,G.R. (1991) J. Bacteriol., 173, 1757-1764.

11 Sprinzl,M., Dank,N., Nock,S. and Schön,A. (1991) Nucleic Acids Res., 19, 2127-2171.

12 Kagan,R.M. and Clarke,S. (1994) Arch. Biochem. Biophys., 310, 417-427.

13 Cheng,X., Kumar,S., Posfai,J., Pflugrath,J.W. and Roberts,R.J. (1993) Cell, 74, 299-307.

14 Kealey,J.T. and Santi,D.V. (1991) Biochemistry, 30, 9724-9728.

15 Bruni,C.B., Colantuoni,V., Sbordone,L., Cortese,R. and Blasi,F. (1977) J. Bacteriol., 130, 4-10.

16 Nurse,K., Wrzesinski,J., Bakin,A., Lane,B.G. and Ofengand,J. (1995) RNA, 1, 102-112.

17 Wrzesinski,J., Bakin,A., Nurse,K., Lane,B.G. and Ofengand,J. (1995) Biochemistry, 34, 8904-8913.

18 Wrzesinski,J., Nurse,K., Bakin,A., Lane,B.G. and Ofengand,J. (1995) RNA, 1, 437-448.

19 Brimacombe,R., Mitchell,P., Osswald,M., Stade,K. and Bochkariov,D. (1993) FASEB J., 7, 161-166.

20 Sirum-Connolly,K., Peltier,J.M., Crain,P.F., McCloskey,J.A. and Mason,T.L. (1995) Biochimie, 77, 30-39.

21 Bakin,A., Lane,B.G. and Ofengand,J. (1994) Biochemistry, 33, 13475-13483.

22 Bibb,M.J., Bibb,M.J., Ward,J.M. and Cohen,S.N. (1985) Mol. Gen. Genet., 199, 26-36.

23 Sirum-Connolly,K. and Mason,T.L. (1993) Science, 262, 1886-1889.


* To whom correspondence should be addressed + Present address: Kosan Biosciences, Inc., 1450 Rollins Road, Burlingame, CA 94010, USA
Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?

