ABSTRACT
The sequences and structures of RNase P RNAs of some Gram-positive bacteria, e.g.
Bacillus subtilis
, are very different from those of other bacteria. To expand our understanding of the structure and evolution of RNase P RNA in Gram-positive bacteria, gene sequences encoding RNase P RNAs from 10 additional
species from this evolutionary group have been determined, doubling the number of sequences available for comparative analysis. The enlarged data set allows refinement of the
secondary structure model of these unusual RNase P RNAs and the identification
of potential tertiary interactions between P10.1 and L12, and between L5.1 and
L15.1. The newly-obtained sequences suggest that RNase P RNA underwent an abrupt, dramatic
restructuring in the ancestry of the low-G+C Gram-positive bacteria after the divergence of the branches leading to
the `
Clostridia
and relatives' and the remaining low-G+C Gram-positive species. The unusual structures of the RNase P RNAs of
Mycoplasma hyopneumoniae
and
M.flocculare
are apparently derived from RNAs with
Bacillus
-like structure rather than from intermediate, partially restructured ancestral RNAs. The structure of the RNase P RNA from the photosynthetic
Heliobacillus mobilis
supports the relationship of this species with
Bacillus
and
Staphylococcus
rather than the `
Clostridia
and relatives' as suggested by the sequences of their small-subunit ribosomal RNAs.
Ribonuclease P (RNase P) is the endoribonuclease responsible for the removal of
leader sequences from precursor tRNAs (see refs
1
and
2
for review). In bacteria, Archaea, Eucarya and mitochondria [but apparently not
chloroplasts of vascular plants (
3
)], RNase P is a ribonucleoprotein. In all bacteria investigated, the RNA
component of RNase P is catalytically active
in vitro
; i.e. it is a ribozyme.
The first two bacterial RNase P RNA sequences determined were those of
Escherichia coli
(
4
) and
Bacillus subtilis
(
5
), and the earliest reliable secondary structures were determined from the
comparative analysis of these sequences and those of a handful of each of their
relatives (
6
). The expectation at the time was that additional sequences obtained from
organisms from other phylogenetic groups of bacteria would be about as
different from one another in sequence and secondary structure as are those of
E.coli
and
B.subtilis
. This was not the case; RNase P RNA sequences are now available from all of the
11 major bacterial phylogenetic lineages, and except for those of
Bacillus
and its relatives, all of these RNAs are similar to those of
E.coli
and relatives (
1
,
7
). It was fortuitous that the first two available bacterial RNase P RNA sequences represented each of the major structure classes.
The common bacterial RNase P RNA secondary-structure class (which for the purpose of discussion we term `type A', for
Clostridium sporogenes
,
Lactobacillus acidophilus
,
Micrococcus luteus
,
Staphylococcus aureus
,
Streptococcus faecalis
and
Streptococcus faecium
were grown in trypticase soy broth (Difco) at 30°C (all cultures from the Indiana University stock culture collection).
Clostridium innocuum
(ATCC 14501) was grown anaerobically in beef liver infusion broth (ATCC medium
38) at 37°C. Cell paste from
Eubacterium thermomarinus
strain ES2 was a gift from Robert Kelly (North Carolina State University). DNAs
were purified as previously described (
13
).
Heliobacillus mobilis
and
Heliobacterium chlorum
DNAs were gifts from Howard Gest and Carl Bauer (Indiana University) and
Acholeplasma laidlawii
DNA was a gift from Carl Woese (University of Illinois).
Southern analysis of
Bam
HI and
Pst
I restriction endonuclease-digested genomic DNAs was performed as previously described (
13
). The probes used were partially-hydrolyzed, uniformly
32
P-labeled run-off transcripts of the
B.subtilis
or
Streptomyces bikiniensis
RNase P RNA genes prepared as previously described (
13
).
Polymerase chain reactions (
14
) were performed using buffer containing 50 mM KCl, 10 mM Tris-HCl (pH 8.3), 1.5 mM each dGTP, dCTP, dATP and dTTP, 0.05% Nonidet P40
and 200 ng each primer oligonucleotide. In some cases, reactions contained 5%
acetamide. The primers used were 59FBam (CGGGATCCGIIGAGGAAAGTCCIIGC) and either
347REco (CGGAATTCRTAAGCCGGRTTCTGT) or 347RXba (GCTCTAGATAAGCCRYGTTYTGT). The amplifications included an initial 2 min 94°C incubation, 30 or 40 amplification cycles (92°C for 1.5 min, 50°C for 1.5 min, 72°C for 0.5 min each cycle), and a final 7 min 72°C incubation.
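The degenerate positions in these primers (R, Y) mean that each primer is a small mixture of oligonucleotides. As an illustration only, the following Python sketch enumerates the concrete sequences encoded by primer 347REco using the standard IUPAC ambiguity codes. The function name `expand_primer` is hypothetical, and treating inosine (I) as a literal, non-degenerate symbol is an assumption of this sketch rather than a statement about how the primers behave in the reaction.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes. "I" (inosine) is not an IUPAC
# ambiguity code; it is kept as a literal symbol here (assumption of this sketch).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "I": "I",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


PRIMER_347REco = "CGGAATTCRTAAGCCGGRTTCTGT"  # sequence as given in the text

variants = expand_primer(PRIMER_347REco)
for seq in variants:
    print(seq)
```

Because 347REco contains two R positions, the list has one entry for each combination of A or G at those two sites.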
The genes encoding RNase P RNA from
C.sporogenes
,
E.thermomarinus
,
H.mobilis
,
H.chlorum
and
M.luteus
were obtained by PCR amplification using primers 59FBam and 347REco. The genes
encoding RNase P RNA from
A.laidlawii
,
C.innocuum
,
L.acidophilus
,
S.aureus
,
S.faecalis
and
S.faecium
were obtained in the same way using primers 59FBam and 347RXba. Restriction
endonuclease-digested PCR products were separated by electrophoresis in 3% low-melting agarose gels (NuSieve GTG agarose, FMC, Rockland, ME).
Agarose plugs containing the DNA bands were excised, melted (65°C), and used directly in ligation reactions containing restriction endonuclease-digested pBluescript KS
+
DNA (Stratagene). In the case of
C.innocuum
, the amplified DNA was digested only with
Xba
I and cloned into
Xba
I/
Eco
RV-digested vector DNA. In the case of
H.chlorum
, only a small fragment of the gene was cloned due to the presence of an
Eco
RI site in the amplified DNA.
The nucleotide sequences encoding RNase P RNAs were determined from double-stranded plasmid DNAs by the dideoxy chain termination method (
15
) with Sequenase version 2.0 (Amersham, Arlington Heights, IL) using M13
universal, M13 reverse, 59FBam, 347REco, 347RXba, 174F (AGGGTGAAANGGTGSGGTAAGAG) and 174R (CTCTTACCSCACCNTTTCACCCT) primers. 7-Deaza-dGTP was used to alleviate band compressions in sequencing gels.
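The internal sequencing primers 174F and 174R, as listed, are reverse complements of one another, so they read the same region on opposite strands. A minimal Python sketch of that check follows; the function name `reverse_complement` and the complement table are illustrative and are not part of the original methods.

```python
# Complements of the standard IUPAC nucleotide codes (S, W and N are
# self-complementary).
COMPLEMENT = {
    "A": "T", "T": "A", "G": "C", "C": "G",
    "R": "Y", "Y": "R", "K": "M", "M": "K",
    "S": "S", "W": "W", "N": "N",
    "B": "V", "V": "B", "D": "H", "H": "D",
}


def reverse_complement(seq):
    """Reverse-complement a DNA sequence that may contain IUPAC codes."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


PRIMER_174F = "AGGGTGAAANGGTGSGGTAAGAG"  # as given in the text
PRIMER_174R = "CTCTTACCSCACCNTTTCACCCT"  # as given in the text

print(reverse_complement(PRIMER_174F) == PRIMER_174R)
```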
RNase P RNA sequences were aligned manually using SeqApp (Don Gilbert, Indiana
University). Comparative analysis of secondary structure was performed as
previously described; covariation was identified using Covariation (
16
). Sequences derived from PCR primers were excluded from the analysis.
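To illustrate what is meant by covariation between alignment positions, the sketch below computes the mutual information between two columns of an alignment. This is a generic measure that is often used for the purpose; it is not claimed to be the calculation performed by the Covariation program, and the toy alignment and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def column(alignment, i):
    """Return the residues found at position i of every aligned sequence."""
    return [seq[i] for seq in alignment]


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    col_i, col_j = column(alignment, i), column(alignment, j)
    freq_i, freq_j = Counter(col_i), Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


# Hypothetical toy alignment: positions 0 and 3 change together (G-C <-> C-G),
# as expected for a base pair; positions 1 and 2 are invariant.
toy_alignment = ["GAAC", "CAAG", "GAAC", "CAAG"]

print(mutual_information(toy_alignment, 0, 3))  # covarying pair
print(mutual_information(toy_alignment, 0, 1))  # no covariation
```

Positions that vary in concert score high, whereas an invariant position carries no covariation signal no matter how strongly it is conserved.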
Phylogenetic trees based on the RNase P RNA sequences or those of small-subunit ribosomal RNAs (using sequence alignments from the Ribosomal Database project) were generated using the DeSoete algorithm in GDE (
17
). Trees of the type A and type B RNase P RNAs were calculated separately and
merged with a minimal representative tree. All RNase P RNA sequences,
alignments and secondary structures are available electronically from the
Ribonuclease P Database (http://www.mbio.ncsu.edu/RNaseP/ ) (
18
). The accession numbers for these sequences are U64877-U64887.
The Gram-positive bacteria comprise two major phylogenetic groups: the `low G+C' group, of which
Bacillus
is a member, and the `high G+C' group, which includes
Streptomyces
. Southern analyses of genomic DNAs were used to qualitatively assess sequence similarity between the type A
S.bikiniensis
and type B
B.subtilis
RNase P RNAs and those of various other members of the Gram-positive bacteria (Fig.
1
). DNA from
C.sporogenes
,
E.thermomarinus
and
M.luteus
hybridized to the type A
S.bikiniensis
probe, whereas DNA from the remaining species (
S.aureus, S.faecalis, L.acidophilus, M.fermentans, A.laidlawii
and
H.chlorum
) hybridized to the type B
B.subtilis
probe (Fig.
1
). Positive hybridization by these probes was mutually exclusive. The probe hybridized in each case to single bands of genomic DNA except where digests appeared to be incomplete, suggesting that RNase P RNA is encoded by single-copy genes in these organisms.
The newly obtained Gram-positive sequences were aligned with the database of bacterial RNase P
RNAs, and secondary structures were inferred by comparative analysis (Fig.
2
) (
19
). All of the RNase P genes isolated from DNA that hybridized to the
S.bikiniensis
probe were found to be type A RNase P RNAs; those that hybridized to the
B.subtilis
probe were found to be type B RNase P RNAs. The type B RNAs contained all of
the secondary structural elements unique to the
Bacillus
RNAs (P5.1, P10.1, P15.1, P15.2), and lacked all of those elements absent in
the
Bacillus
RNAs (P6, P13, P14, P16, P17, P18). The model for the secondary structure of
the type B RNase P RNA is supported in detail by base-base covariation in 72% of the pairings proposed in the consensus model
(Fig.
3
). All helices except the P10 and P11 `minihelices' are supported by the
occurrence of covariation of at least two base-pairings. Helices P9, P10.1, P12 and P19, which are minimally present or
absent in the consensus, are well-supported within those groups in which they occur; of the 113 base-pairings proposed in the
B.subtilis
RNase P RNA, 93 (82%) are specifically supported by phylogenetic covariation.
Sequence variation within the context of this secondary structure is similar
[assessed by the entropy coefficient H
x
(
20
)] to that found in other phylogenetic groups of similar evolutionary depth;
having abruptly and dramatically changed in sequence and structure, these RNAs apparently resumed evolutionary divergence at rates similar to those of RNase P RNAs in other evolutionary groups.
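Sequence variation at an alignment position is commonly summarized with a Shannon-type entropy. The sketch below shows that generic calculation for a single column. It assumes, without confirmation from the text, that the entropy coefficient cited above is of this general form; the exact definition used in reference 20 is not reproduced here, and the example columns are hypothetical.

```python
from collections import Counter
from math import log2


def column_entropy(residues):
    """Shannon entropy (bits) of the residues observed at one alignment position."""
    n = len(residues)
    return sum(-(count / n) * log2(count / n) for count in Counter(residues).values())


# Hypothetical alignment columns, one character per sequence.
invariant = "AAAA"   # fully conserved position
two_state = "AAGG"   # two residues, equally frequent
four_state = "ACGU"  # all four residues, equally frequent

for name, col in [("invariant", invariant), ("two_state", two_state), ("four_state", four_state)]:
    print(name, column_entropy(col))
```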
The evolutionary relationships of the bacterial species were estimated by
analysis of the RNase P RNA sequence alignment. The resulting phylogenetic tree
(Fig.
1
) is consistent with phylogenetic relationships inferred on the basis of small-subunit ribosomal RNAs (
17
) with the single exception of the placement of the photosynthetic Gram-positive bacteria. The photosynthetic Gram-positive bacteria have been considered a separate lineage, distinct
from both the low- and high-G+C subdivisions of Gram-positive bacteria, based primarily on preliminary small-subunit nuclease T1 catalog data and phenotypic
characteristics (
24
). According to more recent small-subunit rRNA sequence comparisons (
17
), the photosynthetic species would be members of the `Clostridia and relatives' group within the low G+C Gram-positive bacteria (of which
C.sporogenes
and
E.thermomarinus
are members), inconsistent with its type B RNase P RNA. In the tree based on
RNase P RNA sequences,
H.mobilis
appears as a distinct, rapidly-evolving branch within the `
Bacillus-Lactobacillus-Streptococcus
group' of the low G+C subdivision. The sequence similarity, estimated phylogenetic relationships, and secondary structure details of the
H.mobilis
RNase P RNA all show that this organism is a member of the group of organisms
with type B RNase P RNAs rather than type A RNase P RNAs, as was suggested (but
not resolved) by ribosomal RNA sequence analysis.
The sequence and structure of RNase P RNA underwent a dramatic restructuring in
the common ancestor of the `
Bacillus-Lactobacillus-Streptococcus
'
and `
Mycoplasma
and relatives' groups of the low G+C Gram-positive bacteria (Fig.
1
). This alteration was apparently abrupt, at least from the broad perspective of
bacterial evolution; there is no evidence of partially restructured RNAs
amongst the diverse sequences examined. The RNase P RNAs from
Mycoplasma
spp. [except that of
M.genitalium
,
taken from the complete genome sequence (
25
)] are distinct from those of the other
Bacillus
-like RNAs (
26
), but seem to have diverged from a more
Bacillus
-like ancestry rather than from a structural intermediate between the
ancestral and
Bacillus
-like RNA classes. The
C.innocuum
,
A.laidlawii
and
M.genitalium
RNAs resemble those of
Bacillus
rather than those of the previously-determined
Mycoplasma
spp. despite the phylogenetic affiliation of these organisms. The
Mycoplasma
spp. RNase P RNAs (except that of
M.genitalium
) are distinct in sequence and lack P10.1, one of the unique elements of the
type B RNAs; the sequence and presence of P10.1 in the
C.innocuum
,
A.laidlawii
and
M.genitalium
RNAs imply that the unusual features of the other
Mycoplasma
RNAs are derived secondarily from a
Bacillus
-like RNA.
It has previously been suggested that the absence in type B RNase P RNAs of P6,
a helix that creates a pseudoknot in type A RNAs, is compensated by the
presence of P5.1, which could replace both P6 and P17 (also absent in type B
RNAs) in the architecture of the RNAs (
12
). However, the replacement of P6 and the associated P16 and P17 in the
E.coli
RNA with the
B.subtilis
P5.1 does not result in restoration of the biochemical phenotype associated
in vitro
with the loss of P6/P16/P17 (
27
). It seems likely that P5.1 docks to an additional element, the analog of P16
in type A RNAs, to complete the structural replacement; this analog has yet to
be identified. The comparative data suggest that the most likely candidate for
this structural analog of P16 is P15.1; P5.1 and P15.1 are the only structural
elements that are present in all RNase P RNAs that lack the P6/P16/P17 element.
Although the number of type B RNase P RNA sequences is small for an analysis of
tertiary contacts, there is significant covariation between nucleotide A71 (
B.subtilis
numbering) in L5.1 and nucleotide U283 in L15.1 (Fig.
2
), consistent with interaction between these loop sequences.
The redefined secondary structure of P10.1 is interesting because it contains a
structure and sequence motif that has the potential to form a tertiary
interaction with GAAA tetraloops. This tertiary interaction was originally
identified in group I intron RNAs (
22
), and has been confirmed biochemically in both group I and group II intron RNAs
(
23
), although the physical nature of the interaction has yet to be determined. The
presence of the sequence motifs and the possibility of the tertiary interaction
in the type B RNase P RNAs has already been pointed out (
28
). The GAAA tetraloop most likely to form the interaction in these RNAs is L12
because: (i) the length of P12 is invariant in all of the type B RNAs in which
the P10.1 motif is present (suggesting a length-dependent loop contact); (ii) these loops are invariably the requisite
GAAA, never an alternative GNRA or other loop sequence in those RNAs with the
appropriate P10.1 sequence; and (iii) in those type B RNAs that lack P10.1 (
M.flocculare
,
M.hyopneumoniae
and
M.fermentans
), both the length and sequences of L12 are no longer conserved (i.e. the
presence of the two required motifs covary phylogenetically). The
Bacillus brevis
RNA (
6
) lacks both sequence motifs, but has the appropriate motifs to form an
alternative tertiary interaction involving the same loop sequence (in this case
GCGA) with adjacent A=U and G=C base pairs, in the appropriate location in
P10.1. The
C.innocuum
RNA has the P12 loop GAAA sequence but lacks the P10.1 motif. The region of
P10.1 where the interaction would presumably take place is unusual in
C.innocuum
, however, in that the region is dominated by G·U pairs. It seems likely that the tertiary contact is maintained in the
C.innocuum
RNA by unusual helix geometry in this helix that may mimic the structure of the
homologous region in other RNAs. In both
B.brevis
and
C.innocuum
, the length of P12 is the same (8 base pairs) as in those RNAs that can form
the `standard' group I intron-like tertiary interaction.
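The distinction drawn above between the specific GAAA loop and the broader GNRA family (N = any nucleotide, R = A or G) can be made explicit with a short pattern match. The Python sketch below is illustrative only; the function name `classify_tetraloop` is hypothetical, and it classifies a four-nucleotide loop by sequence alone, without modeling the receptor motif in P10.1.

```python
import re

# GNRA: G, then any nucleotide, then a purine (A or G), then A.
GNRA_PATTERN = re.compile(r"^G[ACGU][AG]A$")


def classify_tetraloop(loop):
    """Classify a 4-nt loop sequence as 'GAAA', another 'GNRA' loop, or 'other'."""
    loop = loop.upper().replace("T", "U")  # accept DNA or RNA alphabets
    if loop == "GAAA":
        return "GAAA"
    if GNRA_PATTERN.match(loop):
        return "GNRA"
    return "other"


for loop in ["GAAA", "GCGA", "UUCG"]:
    print(loop, classify_tetraloop(loop))
```

Under this classification the GCGA loop noted for the B.brevis RNA is a GNRA loop but not GAAA, consistent with its being described as forming an alternative interaction.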
The nature of the evolutionary pressure that resulted in the dramatic alteration of RNase P RNA structure in the low G+C Gram-positive bacteria is not clear. Although exhaustive comparisons are not available, no clear difference in catalytic activity, kinetic
properties, or substrate differentiation between the type A
E.coli
and type B
B.subtilis
RNase P RNAs has been demonstrated. Moreover, since their tertiary architectures seem to be equivalent and the
proteins are interchangeable, the alteration in RNase P RNA from the A to the B
type of structure may then be an evolutionarily `neutral' mutation. One interesting apparent difference in tRNA biogenesis between the low G+C Gram-positive bacteria and most other bacteria is the encoding of the 3'-terminal CCA sequence. The 3'-CCA sequence is generally included in the tRNA-encoding genes of most Bacteria, but in
B.subtilis
and
M.genitalium
the tRNA-encoding genes generally do not encode the 3'-CCA (how widespread this feature is among relatives of these
organisms is not known). For instance, in
Haemophilus influenzae
(
29
) and
M.genitalium
(
25
), for which the genome sequences are available for complete comparison, 96% (54
of 56) and 6% (2 of 33), respectively, of tRNA-encoding genes include encoded 3'-CCA. However, it seems unlikely that the difference in RNase
P RNA structure results directly from the presence or absence of encoded 3'-CCA; 3' processing of pre-tRNAs, including the post-transcriptional addition of CCA if required,
usually precedes cleavage by RNase P (
30
). In any case, both the
B.subtilis
and
E.coli
RNase P RNAs respond similarly
in vitro
to alterations in the sequence or length of the 3'-terminus of substrate pre-tRNAs (
31
,
32
). It is also possible that the alteration in RNase P RNA structure is in
response to a change in a substrate other than pre-tRNA. In all bacteria examined, except
Bacillus
spp., the signal recognition particle (SRP) RNA is a 4.5S RNA that is
homologous only to the core domain of the much larger 7S SRP RNAs of Eucarya
(a.k.a. eucaryotes) and Archaea (a.k.a. archaebacteria) (
33
,
34
). Surprisingly, SRP RNA in
Bacillus
spp. is a large, 7S RNA similar to the eucaryal and archaeal RNAs (
35
). Both the 4.5S and 7S forms of bacterial SRP RNA are processed at their 5'-ends by RNase P (
36
,
37
). However, since
Mycoplasma
spp., which have type B RNase P RNAs, have typical 4.5S SRP RNAs (
33
), it is unlikely that the change in this substrate is responsible for the alteration in RNase P RNA structure.
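The proportions of tRNA genes with an encoded 3'-CCA quoted above follow directly from the gene counts given in the text. The Python sketch below recomputes them; the helper `encodes_cca` is a hypothetical illustration of how such a count could be made from the 3'-terminal sequence of a tRNA gene, and is not the procedure used to obtain the published numbers.

```python
def encodes_cca(trna_gene):
    """Return True if a tRNA gene sequence (5'->3', DNA or RNA) ends in CCA."""
    return trna_gene.upper().endswith("CCA")


def percent(part, total):
    """Percentage of part in total, rounded to the nearest whole number."""
    return round(100 * part / total)


# Counts as stated in the text: (genes with encoded 3'-CCA, tRNA genes examined).
encoded_cca = {
    "Haemophilus influenzae": (54, 56),
    "Mycoplasma genitalium": (2, 33),
}

for species, (with_cca, total) in encoded_cca.items():
    print(f"{species}: {with_cca} of {total} = {percent(with_cca, total)}%")
```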
Despite the differences in structure in the Gram-positive bacteria, phylogenetic trees based on these RNase P RNA sequences (Fig.
1
) are generally consistent with those based on small-subunit ribosomal RNA sequences, which are commonly used for estimating
phylogenetic relationships between organisms. Small-subunit rRNA analyses have not resolved the placement of the
photosynthetic Gram-positive bacteria; it was originally thought that these organisms formed a
group separate from either the high- or low-G+C subdivisions (termed the `photosynthetic subdivision') (
24
). More recently, these organisms have been placed with the `
Clostridia
and relatives', a group within the low G+C subdivision (
17
). Neither of these possibilities is consistent with the structure of the RNase
P RNA from
H.mobilis
(Fig.
2
) or the small sequence segment obtained from
H.chlorum
. Because the RNase P RNAs of
C.sporogenes
and
E.thermomarinus
(members of the clostridia and relatives subdivision) are of the ancestral type
A, and the
H.mobilis
and
H.chlorum
RNAs are of the more recently derived type B, the heliobacteria must be related
by ancestry to the
Bacillus-Lactobacillus-Streptococcus
and
Mycoplasma
and relatives evolutionary branches.
We thank Kevin Harrell, Forrest Hentz Jr, Mary Ellen Woods and Martina Vaskova-Zurovcova for technical assistance in the cloning and sequencing of the
RNase P RNA-encoding genes reported. This work was supported by NIH grant GM52894 to
JWB and NIH grant GM34527 to NRP.
*To whom correspondence should be addressed. Tel: +1 919 515 8803; Fax: +1 919
515 7867; Email: jwbrown@mbio.ncsu.edu
REFERENCES