ABSTRACT
The synthesis of release factor-2 (RF-2) in bacteria is regulated by a high efficiency +1 frameshifting
event at an in-frame UGA stop codon. The stop codon does not specify the termination of
synthesis efficiently because of several upstream stimulators for
frameshifting. This study focusses on whether the particular context of the
stop codon within the frameshift site of the
Escherichia coli
RF-2 mRNA contributes to the poor efficiency of termination. The context of
UGA in this recoding site is rare at natural termination sites in
E.coli
genes. We have evaluated how the three nucleotides downstream from the stop
codon (+4, +5 and +6 positions) in the native UGACUA sequence affect the
competitiveness of the termination codon against the frameshifting event.
Changing the C in the +4 position and, separately, the A in the +6 position significantly increased the termination signal strength at
the frameshift site, whereas the nucleotide in the +5 position had little
influence. The efficiency of particular termination signals as a function of
the +4 or +6 nucleotides correlates with how often they occur at natural
termination sites in
E.coli;
strong signals occur more frequently and weak signals are less common.
The mechanism used to decode translational stop signals during termination of
the synthesis of a polypeptide is different from that used to decode sense
codons during chain elongation. While the latter involves RNA:RNA interactions
between three bases of the tRNA and mRNA at the decoding site of the small
ribosomal subunit, the termination of protein synthesis involves interactions
between the mRNA and a protein decoding molecule, the release factor, in place
of the tRNA (
1
). Termination may involve further essential interactions between the release
factor and the ribosome (
2
and Pel,H.J., Rosenfeld,S. and Bolotin-Fukuhara,M., unpublished), and even between the release factor and the
peptidyl-tRNA at the adjoining site (
3
). Despite wide acceptance of the different mechanisms mediating the decoding of
sense and stop codons, protein synthesis termination signals have generally
still been viewed as the originally proposed triplets: UAA, UGA and UAG (
4
,
5
). However, nucleotides both upstream and downstream of the codon may contribute
to the termination signal. Restrictions in the upstream sequence could
influence the nature of potential interactions of RF with the last amino acids
and tRNA, and restrictions in the downstream sequence could reflect direct
interaction of the RF with the mRNA itself (
6
).
Early experiments on non-cognate or cognate competitive suppression of these stop codons hinted
that a region larger than the triplet codon might be important for the
termination signal, since the efficiency of suppression was influenced by
sequence context (
7
-
11
). Statistical analysis of the nucleotides surrounding natural stop codons (stop
signals found at the end of genes) in genes from a wide range of organisms
showed a strong bias in the nucleotides occurring in positions surrounding the
codon, and particularly in the +4 position (the stop codon being +1 to +3) (
12
,
13
). In
Escherichia coli
we have shown that the termination efficiency of stop codons
in vivo
is indeed determined by the nucleotide immediately following the termination
codon (+4) (
14
). The pattern is maintained in human cells where translation of an expressed
type I 5'-deiodinase mRNA, in which an internal UGA encodes selenocysteine
(Sec), has shown that the +4 nucleotide dramatically affects the competition
between termination and Sec incorporation at this recoding site (
15
). Thus in genetic systems from these two kingdoms the +4 nucleotide
significantly affects the efficiency by which a termination codon is decoded as
a stop signal. While the effect of the identity of the +4 nucleotide is
different between these prokaryotic and eukaryotic examples, in each case there
is agreement between stop signal strength and frequency of occurrence at
natural stop sites.
In addition, highly expressed genes use predominantly the strongest four-base stop signals (stop codon together with the following +4 nucleotide) (
14
,
15
), whereas stop signals at translational recoding sites (
16
) are mostly weak and are rarely used at natural termination sites (
14
,
15
). An example is UGAC which was determined to be the poorest termination signal
in
E.coli
(
14
). This weak tetranucleotide stop signal occurs at the sites of two recoding events in
E.coli
, selenocysteine incorporation in the formate dehydrogenase mRNA (
17
,
18
) and at the +1 frameshift site in the translation of RF-2 (
19
).
The fact that the native signal at the RF-2 frameshift site, UGAC, was found to be significantly weaker than the
other 11 four-base signals (
14
) raised the question of whether the effect was amplified in this particular
context by the surrounding nucleotides. The sequence following the UGA
termination triplet in the RF-2 frameshift site is CUA. Although suppression studies have not been
carried out for UGACUA, it has been demonstrated that UGACUG and UGACUC
are strongly suppressed compared with some other UGAC contexts (for example,
UGACAU) (20). The occurrence of multiple CUA codons near the 5' end of a coding sequence perturbs translation of the mRNA, possibly via
a destabilising effect on the translational complex (
21
). This is likely to be because CUA is decoded by a rare tRNA (
22
); of 29 sense codons tested, CUA had the slowest rate of decoding (
23
).
In this study we have tested the proposal that CUA following a UGA termination
codon might be an especially bad context for termination. All possible
nucleotides were used in positions +4, +5 and +6 (3' to the UGA) of the RF-2 frameshift site to create new contexts and the relative
termination strengths of the resulting signals were determined.
pMAL™-c2 plasmid and maltose binding protein (MBP) antibody, restriction
enzymes and buffers, and T4 DNA ligase were purchased from New England Biolabs.
[γ-32P]dATP and Hybond transfer membranes were obtained from Amersham. Nitrocellulose
transfer membranes were purchased from Schleicher and Schuell. Other chemical
reagents were purchased from Sigma.
Deoxyoligonucleotides were either synthesised on site using an Applied
Biosystems 380 B DNA synthesiser, or purchased from Macromolecular Resources,
Colorado State University. Plasmid DNA was isolated using a Wizard™ Miniprep DNA purification kit (Promega). Cloned DNA was sequenced using
an ABI 373A sequencer. Plasmids were electroporated into bacterial cells using
an Electro Cell Manipulator© 600 (BTX).
Gel electrophoresis and protein transfer were performed using BioRad Mini-PROTEAN II electrophoresis cells and BioRad Mini Trans-Blot electrophoretic transfer cell. A GS-670 imaging densitometer (BioRad) was used for laser
densitometry.
Statistical analyses of nucleotide sequences were performed on the set of
E.coli
data obtained from the 1995 release of the TransTerm Database (
24
). This database contains the sequence contexts around 3492
E.coli
stop signals for 100 nucleotides (nt) before and after the stop codon. Only
3433 of these sequences had stop signals identified as far as the +6 nucleotide. The programme, `count_signal', written by Mark Dalphin, was used to count the
frequencies of various nucleotide patterns, such as UGAC, in both the
termination position and in the 100 nt 3' to the stop codon. If the 100 nt included another open reading frame,
the counting stopped at the start of the open reading frame. The TransTerm
database (
24
) was used to document how many stop signals of a defined length and sequence
were present in any reading frame of the 100 nt of non-coding region immediately downstream of the natural termination sites of
genes recorded in the database. For example, there are 4506 UGAN sequences in
the non-coding regions of the recorded
E.coli
genes and 1494 (or 33%) are UGAU. Since there are 1081 UGAN sequences at natural
termination sites we would expect 358 to be UGAU (33% of the 1081 UGAN stop
signals). We looked for deviations from the `expected' value using a
normalisation scheme which we called `deviation':
Deviation = (Observed - Expected)/Expected
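The arithmetic behind the `expected' value and the deviation can be sketched in a few lines of Python. This is only an illustration and is not the `count_signal' programme; the function names are assumptions introduced here. The counts are those quoted in the text (4506 UGAN and 1494 UGAU in non-coding regions, 1081 UGAN at natural termination sites), and the deviation is illustrated with the UGACUA values from Table 1 (1 observed, 10 expected).

```python
def expected_count(signal_noncoding, group_noncoding, group_at_stop_sites):
    """Scale the share of a signal in non-coding regions to the stop-site group.

    Hypothetical helper: the expected count is the fraction of the group
    (e.g. UGAU among UGAN) in non-coding regions, multiplied by the size of
    the same group at natural termination sites.
    """
    return group_at_stop_sites * signal_noncoding / group_noncoding


def deviation(observed, expected):
    """Deviation = (Observed - Expected) / Expected, as defined in the text."""
    return (observed - expected) / expected


# Counts quoted in the text for UGAU within the UGAN group.
expected_ugau = expected_count(1494, 4506, 1081)
print(round(expected_ugau))          # 358

# UGACUA values from Table 1: 1 observed, 10 expected.
print(deviation(1, 10))              # -0.9
```

A deviation of 0 means the signal occurs as often as expected; negative values indicate under-representation, with -1 being the lower limit (a signal never observed).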
Bacteria were grown in Luria Broth, and ampicillin (amp; final
concentration 100 µg/ml) was used to select bacteria containing plasmids. Protein expression
was induced from the Ptac promoter with IPTG at a final concentration of 1 mM in the medium (25).
Escherichia coli
strain TG1 was used for primary cloning. Plasmids were subsequently
electroporated into
E.coli
strain FJU112 [Δ(lac pro) gyrA ara recA56/10, F'lacIQ1] (26) for analysis of fusion proteins. Strain FJU112 has wild-type ribosomes and no suppressor tRNAs which could compete with
termination or frameshifting events in our translational termination/frameshift
assay.
Complementary deoxyoligonucleotides spanning the RF-2 frameshift window and containing UGACNA and UGACUN stop signal context
series were annealed and directionally cloned into the pMAL™ polylinker at
EcoRI and SalI sites using standard recombination techniques (25). The plasmid with the natural stop context UGACUA had been constructed previously (14). The plasmids were introduced into
E.coli
strain TG1 by electroporation (2.5 kV, 5-6 ms). Cells were selected for amp resistance. Recombinant clones were
identified by colony hybridisation, using one of the oligonucleotides labelled with [γ-32P]dATP as a probe. Positive transformants were screened for the presence of the
RF-2 frameshift window by inducing expression of the fusion proteins. MBP
fusion proteins were identified on a Western blot following electrophoresis.
Plasmid DNA was isolated and the sequence was confirmed. The plasmids were
electroporated into
E.coli
strain FJU112, fusion protein expression was induced, and the products were analysed
immunologically following Western blotting using the MBP antibody as described (14). Proportions of frameshift and termination products were determined by laser
densitometry.
The RF-2 frameshift site has the sequence CUA in positions +4, +5 and +6,
following the stop codon UGA, as shown in Figure
1
. This sequence is very rare at natural termination sites in
E.coli
; from 3492 genes currently available for analysis in the TransTerm database (
24
) there is only one possible natural termination site with this sequence, a
putative unidentified open reading frame which terminates at a site overlapping
with the initiation site of deoxyribopyrimidine photolyase. We have assumed
this is a real termination site and have included it in our statistical
analysis. Even so, the occurrence of UGACUA is much lower than could be
anticipated from UGACUA frequencies in non-coding regions of the
E.coli
genome.
The frequency of occurrence of UGAN, UGACN and UGACUN at natural termination
sites was compared with the expected frequency of these sequences calculated
from the non-coding regions (with reference to the subgroup UGA for the UGAN series,
the subgroup of UGAC sequences for the UGACN series and the subgroup of UGACU
sequences for the UGACUN series). In this way the deviation between observed
and expected frequency at a particular position was independent of that at
previous positions. The expected tetra-, penta- and hexa-nucleotide frequencies were calculated by analysing the
regions of DNA spanning 100 bases 3' of the stop codons. This gave the expected frequencies of UGAN, UGACN
and UGACUN compared with the observed occurrences of the sequences at natural
termination sites.
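The subgroup-based counting described above can be sketched as follows. This is a minimal illustration, not the `count_signal' programme: the function names and the toy sequences are assumptions made up for the example, and the sketch does not implement the rule that counting stops at the start of a downstream open reading frame.

```python
from collections import Counter


def count_extensions(regions, prefix):
    """Count the nucleotide following `prefix` at every offset (any reading frame)."""
    counts = Counter()
    for seq in regions:
        start = seq.find(prefix)
        while start != -1:
            nxt = start + len(prefix)
            if nxt < len(seq):            # ignore a prefix at the very end
                counts[seq[nxt]] += 1
            start = seq.find(prefix, start + 1)
    return counts


def expected_at_stop_sites(noncoding_counts, n_stop_sites_with_prefix):
    """Expected counts at stop sites, relative to the subgroup sharing `prefix`."""
    total = sum(noncoding_counts.values())
    return {nt: n_stop_sites_with_prefix * c / total
            for nt, c in noncoding_counts.items()}


# Toy downstream regions (invented for illustration, not TransTerm data).
toy_regions = ["AAUGACUAGGUGACGU", "UGACUUUGACA"]

ugac_next = count_extensions(toy_regions, "UGAC")
print(dict(ugac_next))

# If 8 natural stop sites carried UGAC, the expected UGACN counts would be:
print(expected_at_stop_sites(ugac_next, 8))
```

Because each series is normalised to its own subgroup (UGA, UGAC or UGACU), a bias at the +4 position does not carry over into the expected values for the +5 or +6 position.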
The statistical analysis (Figs
1
and
2
) suggested that UGACUA is an unusual termination context. Experimentally,
earlier suppression studies had shown that while UGAC contexts might be good
for termination, there were exceptions to this general conclusion in cases
where the sequence was followed by UG or UU (
20
). Is the competition between frameshifting and termination at the RF-2 frameshift site regulated by a particularly poor UGA termination
context?
We previously examined the strength of each of the three termination codons with
all contexts in the 4th position using a pMAL™ reporter construct and expression system with the RF-2 frameshift window cloned in-frame with the
malE
gene (
14
). An oligonucleotide spanning the RF-2 frameshift window, which contains a Shine-Dalgarno sequence (overlined), a `slippery' run of Ts and leucine
codon (underlined) and the TGA stop codon, is illustrated in Figure
3
A. Oligonucleotides containing redundancies in the +4, +5 and +6 positions
following the TGA were synthesised. These oligonucleotides were used to
generate a series of constructs that contained the stop codon contexts TGANTA,
TGACNA and TGACTN (Figure 3B).
In vivo
transcription and translation of plasmids containing the RF-2 frameshift window produce two fusion proteins: a 44 kDa product when
synthesis stops at the termination signal in the frameshift site and a 53 kDa
frameshift product when a +1 frameshift event occurs and translation is halted
further downstream. The constructs were expressed in FJU112, a wild-type strain of
E.coli
which has normal ribosomes and no suppressor tRNAs. Independent isolates of the
variant clones were analysed for termination signal strength in three separate
experiments. The wild-type UGACUA clone is common to all three series, and therefore the results
from nine experiments have been combined for this clone.
A stronger termination signal results in more termination product compared with
the amount of frameshift product. The converse is true if the termination
signal is weaker. Therefore, the effects on termination signal strength of each
of the nucleotides in positions +4, +5 and +6 following the stop signal can be
measured from the proportions of the two products.
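As a rough illustration of how a signal strength could be derived from the two bands, the sketch below computes the termination product as a fraction of the total fusion protein. The formula is an assumption made for illustration (the text states only that proportions of the two products were determined by densitometry), the function names are hypothetical, and the band intensities are invented numbers, not measurements from this study.

```python
def termination_fraction(termination_band, frameshift_band):
    """Fraction of total fusion protein that is the 44 kDa termination product.

    Assumes both inputs are background-corrected densitometry values in the
    same arbitrary units.
    """
    total = termination_band + frameshift_band
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return termination_band / total


def relative_strength(fraction, reference_fraction):
    """Express one signal relative to a reference signal (e.g. the native site)."""
    return fraction / reference_fraction


# Invented intensities for a hypothetical variant and a hypothetical reference.
variant = termination_fraction(60.0, 40.0)
reference = termination_fraction(30.0, 70.0)
print(variant, reference, relative_strength(variant, reference))
```

A stronger termination signal gives a fraction closer to 1, and a weaker signal gives a fraction closer to 0.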
Figure 4 shows the termination efficiencies of the three series of clones, UGANUA, UGACNA
and UGACUN. Confirming the results found previously, there is a hierarchy of termination
signal strengths dependent upon the 4th nucleotide, with a 7-fold range in their competitiveness with frameshifting, in the order
UGAUUA > UGAGUA > UGAAUA > UGACUA. Altering the nucleotide at position +5 had no significant effect upon the
poor termination signal strength resulting from the 4th position C. In
contrast, changing the nucleotide at the 6th position increased termination
strength by 2-3-fold. The natural sequence has A in this position (open bar) and
this sequence was the weakest termination signal of the series; that is, for the
UGACUN contexts, termination efficiency varied with N = A < G ≈ C ≈ U. With C, G or U in position +6 the termination signal strength was
raised to a level similar to that observed for UGAAUA.
The occurrences of nucleotides at the 4th and 6th positions, following UGA and
UGACU respectively, were non-random (Fig.
1
). While the particular stop signal context of the RF-2 frameshift site (UGACUA) is rare at natural termination signals, signals
with a nucleotide other than C in the +4 position are more common. All 5th base
contexts of the type UGACNA and 6th base contexts of the type UGACUN are
relatively infrequent at natural termination sites (Table
1
), presumably as a consequence of the rarity of UGAC termination signals (
14
).
Table 1
We used linear regression analysis to display how the relative termination
efficiencies in the UGANUA, UGACNA and UGACUN contexts correlated with their
use (Fig.
5
). The occurrence of each six-base signal with these contexts was calculated as a percentage of all stop
codons as in Table
1
, from the 3433
E.coli
genes listed in the TransTerm database for which all six nucleotide positions
were identified (
24
).
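The `% Total' column of Table 1 can be reproduced from the observed counts and the 3433 genes with all six positions identified. The short sketch below does this; the variable and function names are assumptions introduced for the example, while the counts are those listed in Table 1.

```python
TOTAL_STOP_SIGNALS = 3433  # genes with all six positions identified

# Observed occurrences at natural termination sites (Table 1).
observed = {
    "UGAUUA": 41, "UGAGUA": 19, "UGAAUA": 23, "UGACUA": 1,
    "UGACGA": 10, "UGACAA": 17, "UGACCA": 9,
    "UGACUU": 9, "UGACUG": 12, "UGACUC": 8,
}


def percent_of_total(count, total=TOTAL_STOP_SIGNALS):
    """Occurrence of a six-base signal as a percentage of all stop signals."""
    return round(100.0 * count / total, 2)


usage = {signal: percent_of_total(n) for signal, n in observed.items()}
for signal, pct in usage.items():
    print(signal, pct)
```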
At positions +4 and +6 of termination signals the usage bias relates to
termination efficiency, with correlation coefficients of r ≈ 0.88 for UGANUA and
r ≈ 0.84 for UGACUN contexts. The usage bias at position +5 correlates less
well with the termination efficiency (r ≈ 0.75). This relationship suggests that
positions +4 and +6 in UGA-containing stop signals might be important for RF-2-mediated termination of protein synthesis.
Any variations in the use of nucleotides in the +5 position may have been
influenced by factors independent of the termination mechanism.
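For readers unfamiliar with the statistic, the correlation coefficient r between usage and termination efficiency can be computed as in the sketch below. The function name is hypothetical and the input values are toy numbers chosen only to show the calculation; they are not the data plotted in Figure 5.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy values only (NOT measurements from this study).
toy_usage = [1.0, 2.0, 3.0, 4.0]
toy_efficiency = [1.0, 3.0, 2.0, 4.0]
print(pearson_r(toy_usage, toy_efficiency))   # 0.8
```

With only four signals in each series, a value of r near 0.8 to 0.9 indicates a trend rather than a precise quantitative relationship.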
Recoding sites often contain in-frame stop codons, and therefore `recoding' represents a failure of that
stop codon to specify efficient termination of protein synthesis at the site.
Although termination is usually by far the predominant event, the RF-2 frameshift site is an exception since frameshift and termination events
occur with comparable frequency. The stimulators of frameshifting, in this case
a Shine-Dalgarno-like sequence followed by a high density of Us immediately before
the frameshift point (
27
), either could override a normal stop codon, or the stop signal may be partly
responsible for its own failure. A ribosomal pause at the stop codon within the
frameshift site is thought to be critical for the high efficiency of
frameshifting (
28
).
The duration of the ribosomal pause at the stop codon might be a key contributor
to efficient frameshifting. There is considerable bias in the position
following stop codons at natural termination sites (
12
,
13
), and a critical determinant of the pause might be how the nucleotide in this
position affects the rate of decoding of the stop signal. There was a 7-fold difference in the +4 nucleotide's ability to influence the
competitiveness of the UGA as a stop codon against the +1 frameshifting event
(Fig.
4
). In the original study we calculated relative rates of RF-2 selection (
29
) at the various stop codon contexts. UGAC, the stop signal found at the RF-2 frameshift site, was the slowest of all the possible UGAN, UAAN or UAGN
signals to be decoded, selecting RF at a rate some 50-fold less than UAAU, the most rapidly decoded signal, and ~30-fold less than the UGA-containing signal, UGAU (
14
).
Other site-specific context features of the RF-2 frameshift site may contribute to the length of the pause at the
stop codon and influence conclusions of stop signal strength. UGAC might not be
such a poor termination signal generally, and indeed suppression studies had
indirectly suggested that UGAC contexts might be good termination sites (
20
). While it was not possible to study the influence of the nucleotides upstream
of the stop codon at the RF-2 frameshift site because of the effect on the other stimulators of the
event (Shine-Dalgarno, spacing to frameshift site, homopolymeric run or the frameshift
codon itself), further analysis of the downstream region was possible.
The current study has focussed on the +5 and +6 downstream nucleotides at the RF-2 frameshift site. The statistical analysis of the +5 position suggested
that where UGAC is found at natural sites there is no selection against a
particular nucleotide in the next position, whereas there is significant
selection against A in the +6 position following UGACU (Fig.
1
). There is only one UGACUA sequence out of 30 UGACUN sequences at natural sites
in the pool of 3433 genes examined, whereas 10 UGACUA sequences would be
expected statistically (Table
1
). It is interesting that A is usually the favoured nucleotide in the +6
position of a termination signal (Fig.
2
).
The experimental study reinforced predictions from these theoretical analyses;
the UGAC signal was equally weak whatever the nucleotide in the +5 position,
but any of the three nucleotides U, G or C in the +6 position significantly
strengthened the competitiveness of the termination signal (Fig.
4
). There was a correlation between the frequency of occurrence of UGANUA and UGACUN at natural sites and their experimentally-determined strengths (Fig.
5
). In contrast, while U was under-represented in the +5 position for UGACNA signals, there was no
correlation between this occurrence and experimentally-determined signal strength.
How might the +4 and +6 nucleotides have such an effect on the rate of selection
of release factor at the stop codon? We have shown by site-directed crosslinking that the RF is in close physical contact with the
codon during recognition (
30
). The release factor protein in this position can presumably make contact with
nucleotides outside the primary recognition determinant, the codon itself. The
statistical bias, particularly in the +4 position, to a lesser extent in the +6
to +10 positions and in addition the -2 position (
6
), may be an indicator that there are favoured contexts for effective RF
recognition of stop signals extending on both sides of the codon. It is likely
that the determinants for RF recognition of a particular sequence could vary
from one context to another. For example, the +6 nucleotide may be important
for UGACUN contexts, as shown in the current study, but have less effect in
other contexts.
The RF-2 frameshift site has been conserved among prokaryotic organisms in five
of the six sequences reported so far, and the stop codon UGA and +4 nucleotide
C have been maintained in each case [the first example of an RF-2 gene which lacked the frameshift site was found in
Streptomyces coelicolor
(
31
)]. The conserved +4 C is at the third base position of the first codon, GAC,
after frameshifting and might have been expected to vary at least to a U
without penalty if protein sequence alone was the determining factor. There is
less conservation in the +5 and the +6 positions (two sequences have A and
three have U in the +6 position). It is interesting that the +8 and +9
positions are also conserved in all sequences although some of this
conservation might reflect a requirement to maintain protein sequence of the
gene product (positions 1 and 2 of an amino acid codon).
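The point about the +4 C can be checked against the standard genetic code: GAC and GAU both encode aspartate, so a C to U change at the third codon position would leave the protein sequence unchanged. The sketch below illustrates this with the four GAN codons only; the function name is an assumption introduced for the example.

```python
# Standard genetic code for the GAN codon family.
GAN_CODONS = {"GAU": "Asp", "GAC": "Asp", "GAA": "Glu", "GAG": "Glu"}


def silent_third_base_changes(codon, table=GAN_CODONS):
    """Return third-position nucleotides that keep the encoded amino acid."""
    amino_acid = table[codon]
    return [alt[2] for alt, aa in table.items()
            if alt[:2] == codon[:2] and aa == amino_acid and alt != codon]


print(silent_third_base_changes("GAC"))   # ['U']
```

That the C is nevertheless maintained in every frameshift-containing sequence is consistent with selection acting on the stop signal context rather than on the protein sequence alone.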
The termination signal within the RF-2 frameshift site is an integral part of the regulatory event aimed at
controlling the completion of the synthesis of RF-2 molecules, and it is particularly weak because of the +4 C and, to a
lesser extent, the +6 A which keep the rate of RF selection at the signal slow
enough for high efficiency frameshifting.
Thanks to John Atkins, Utah for helpful discussions. W.P.T. is an International
Scholar of the Howard Hughes Medical Institute and work described here has been
supported by grants from the Health Research Council of New Zealand, the
Lotteries Board (Health), and the Human Frontier Science Program (awarded to
W.P.T. and Yoshikazu Nakamura).


Sequence    Expected    Observed    % Total

UGAUUA      34          41          1.19
UGAGUA      16          19          0.55
UGAAUA      35          23          0.67
UGACUA      10          1a          0.03

UGACUA      10          1a          0.03
UGACGA      21          10          0.29
UGACAA      30          17          0.50
UGACCA      13          9           0.26

UGACUU      15          9           0.26
UGACUG      19          12          0.35
UGACUA      10          1a          0.03
UGACUC      12          8           0.23

REFERENCES
