Increasing the distance between the snRNA promoter and the 3' box decreases the efficiency of snRNA 3'-end formation
Lakshman Ramamurthy1,3, Thomas C. Ingledue2, Duane R. Pilch1, Brian K. Kay2,3 and William F. Marzluff1,2,3,*
1Program in Molecular Biology and Biotechnology, 2Curriculum in Genetics and Molecular Biology and 3Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
Received July 22, 1996;
Revised and Accepted September 25, 1996
ABSTRACT
Chimeric genes which contained the mouse U1b snRNA promoter, portions of the histone H2a or globin coding regions and the U1b 3'-end followed by a histone 3'-end were constructed. The distance between the U1 promoter and the U1 3' box was varied between 146 and 670 nt. The chimeric genes were introduced into CHO cells by stable transfection or into Xenopus oocytes by microinjection. The efficiency of utilization of the U1 3' box, as measured by the relative amounts of transcripts that ended at the U1 3' box and the histone 3'-end, was dependent on the distance between the promoter and 3'-end box. U1 3'-ends were formed with >90% efficiency on transcripts shorter than 200 nt, with 50-70% efficiency on transcripts of 280-400 nt and with only 10-20% efficiency on transcripts >500 nt. Essentially identical results were obtained after stable transfection of CHO cells or after injecting the genes into Xenopus oocytes. The distance between the U1 promoter and the U1 3' box must be <280 nt for efficient transcription termination at the U1 3' box, regardless of the sequence transcribed.
INTRODUCTION
A novel feature of the biosynthesis of vertebrate snRNAs is that transcription must initiate at an snRNA promoter for 3'-end formation (1,2). There is a single required sequence element for 3'-end formation, the 3' box, located ~10 nt downstream of the end of the primary transcript (3) and there is no requirement for sequences in the mature snRNA for proper 3'-end formation (1). However, when a heterologous promoter [e.g. thymidine kinase (2), globin, adenovirus (1) or histone (4)] is used in place of the snRNA promoter, the snRNA 3'-end is not formed; rather, transcription continues past the normal 3'-end and the transcripts are polyadenylated using cryptic or natural polyadenylation sites (1,2). In addition, longer read-through transcripts formed in isolated nuclei are not precursors of mature U1 RNA molecules (5) and longer 'precursors' are not converted to mature snRNAs when they are injected into Xenopus oocytes (6). These results suggest that a sequence in the snRNA promoter is required for proper 3'-end formation and that 3'-end formation occurs co-transcriptionally, either as a termination event or as a very rapid processing event. Following formation of the primary transcript, the mature snRNA is formed by removing nucleotides (<15) from the 3'-end, presumably by an exonuclease(s) in the cytoplasm (5,7), while the removal of the last 2 nt takes place in the nucleus (8).
We report here that there is a strong distance dependence for the coupling of initiation from vertebrate snRNA promoters with formation of snRNA 3'-ends. By varying the amounts of histone or α-globin sequence between the promoter and the 3' box, we have constructed genes that encode transcripts ranging in length from 146 to 670 nt, ending at the snRNA 3'-end. We find that the snRNA end is formed inefficiently, as detected by the preferential formation of the distal histone 3'-end, if the transcript is longer than 500 nt. If the distance between the transcription start site and the U1 signal is <200 nt, then the U1 3'-ends are formed very efficiently. Similar results were obtained both in mammalian cells and in Xenopus oocytes.
MATERIALS AND METHODS
Construction of cloned genes
The chimeric U1 and histone H2a genes were constructed from the mouse histone H2a-614 gene (9,10) and the mouse U1b.1 and U1b.2 genes (11,12); these genes are shown in Figure 1. The mouse U1b promoter containing 5 nt of U1 coding sequence has been described previously, as have the cassettes containing the U1 3' signal with either 10 or 50 nt of U1 coding sequence (4) and the histone H2a-614 3'-end signal (13). The genes are named by the length of the snRNA transcript they produce and whether they have 49 (UL genes) or 10 nt (US genes) of U1 RNA sequence at their 3'-end.
Transfection
The genes were introduced into CHO cells by co-transfection with the pSVneo gene using the polybrene procedure (15). Stable transfectants were isolated by selection with G418 as previously described (16). Transfectants (20-50 per flask) were pooled and grown in the absence of G418 for analysis of expression of the transfected genes.
Preparation and analysis of RNA
RNA was prepared from exponentially growing cells (<50% confluent) as previously described (16). The 3'-ends of the transcripts from the transfected genes were analyzed by S1 nuclease mapping using the probes described in the figure legends. The 5'-ends were mapped using probes labeled at an appropriate internal site in the gene.
The probe used in Figure 6D was made using PCR to amplify a 152 nt fragment of DNA containing the sequence from -13 to +127 of the U1-histone hybrid genes plus an additional 12 nt of non-homologous sequence included at the end of the 5' primer. This fragment was labeled with polynucleotide kinase and [γ-32P]ATP and used in an S1 nuclease protection assay.
The conditions of hybridization and digestion have been described (9). The protected fragments were resolved by electrophoresis on 6% polyacrylamide-7 M urea gels, detected by autoradiography and quantified by densitometry or on a PhosphorImager (Molecular Dynamics).
Injection of Xenopus oocytes
Supercoiled DNA (15 nl, 30 µg/ml) was injected into stage VI Xenopus oocytes (17) and the oocytes were incubated at 18°C for 18 h. In some experiments (noted in the figure legends) the amount of DNA injected was varied. RNA was prepared as previously described (18) and analyzed by S1 nuclease mapping as described above.
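The standard injection (15 nl at 30 µg/ml) corresponds to a fixed mass of DNA per oocyte. The short sketch below works through that unit conversion; the function name is ours and is introduced only for illustration.

```python
def injected_dna_ng(volume_nl: float, conc_ug_per_ml: float) -> float:
    """Mass of DNA delivered per oocyte, in nanograms.

    1 ug/ml equals 1 ng/ul, i.e. 0.001 ng/nl, so the product of
    volume (nl) and concentration (ug/ml) is divided by 1000.
    """
    return volume_nl * conc_ug_per_ml / 1000.0


# Standard injection described in the text: 15 nl at 30 ug/ml.
print(injected_dna_ng(15, 30))  # about 0.45 ng per oocyte
```

The same function can be used to compare the experiments in which the amount of DNA injected was varied.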
RESULTS
In the course of constructing genes which would express histone mRNAs ending in a U1 snRNA 3'-end, we observed that the U1 3'-end was formed inefficiently (14). To study some of the possible parameters affecting snRNA 3'-end formation in these chimeric genes, we constructed genes of varying lengths with a U1 promoter, either histone or globin coding sequences and a U1 3'-end followed by an efficient histone 3' processing signal located ~100 nt 3' of the U1 3' box. Any transcripts which do not terminate at the U1 3' box should be processed at the histone processing signal.
Figure 1 shows the genes used in these experiments. All the genes had a 226 nt mouse U1b promoter and the first 5 nt of the U1 RNA sequence. This cassette was fused to portions of either the mouse histone H2a-614 gene (UL and US genes; Fig. 1A and B) or the human α-globin coding region (UG genes; Fig. 1C). These coding regions were followed by a U1 3'-end, with either 49 (UL genes), 32 (UL172 and UG genes) or 10 nt (US genes) of U1 coding sequence followed by the U1 3' box. The histone H2a-614 3'-end and processing signal were placed ~100 nt 3' of the U1 3' box. These genes were named according to the sequence in the gene, the amount of U1 coding sequence present and the length of the transcript ending at the U1 3'-end (indicated by the subscript). The UL genes contain the terminal stem-loop in the U1 RNA, while the US genes contain only the last 10 nt of U1 sequence and no secondary structures present in the U1 snRNA. The UL and US genes contain histone sequences while the UG genes contain globin sequences (Fig. 1C). For example, the UL580 gene encodes a 580 nt transcript ending with the last 49 nt of U1 snRNA and the US310 gene encodes a 310 nt transcript ending with the last 10 nt of U1 snRNA. Figure 1D shows the sequences at the 3'-ends of the genes.
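The naming convention can be made explicit with a small parser. This is only an illustrative sketch: the class and function names are ours, the subscript is written inline (e.g. UL580), and suffixed names such as UL391-cod are deliberately not handled.

```python
import re
from typing import NamedTuple


class ChimericGene(NamedTuple):
    family: str          # UL, US or UG
    insert: str          # coding sequence between promoter and U1 3'-end
    u1_3prime_nt: int    # nt of U1 coding sequence preceding the 3' box
    transcript_nt: int   # length of the transcript ending at the U1 3'-end


# Values taken from the description of the constructs in the text.
U1_TAIL_NT = {"UL": 49, "US": 10, "UG": 32}
INSERT = {"UL": "histone H2a-614", "US": "histone H2a-614", "UG": "alpha-globin"}


def parse_gene(name: str) -> ChimericGene:
    """Decode a gene name such as 'UL580' into its stated properties."""
    m = re.fullmatch(r"(UL|US|UG)(\d+)", name)
    if m is None:
        raise ValueError(f"unrecognized gene name: {name!r}")
    family, length = m.group(1), int(m.group(2))
    tail = U1_TAIL_NT[family]
    if name == "UL172":
        # Exception stated in the text: UL172 carries 32 nt of U1 sequence.
        tail = 32
    return ChimericGene(family, INSERT[family], tail, length)


print(parse_gene("UL580"))
print(parse_gene("US310"))
```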
In all these genes there are two functional 3' signals such that there are two distinct transcripts produced from these genes, one ending at the U1 3'-end and the other at the histone 3'-end. The different 3'-ends were distinguished by S1 nuclease mapping and the distinct protected products resulting from transcripts ending at the histone or U1 3'-ends were quantified by densitometry or on a PhosphorImager. The relative amounts of transcripts with either U1 or histone 3'-ends give a measure of the efficiency of U1 3'-end formation, assuming that the transcripts have similar stabilities. We assume that all of the transcripts which extend past the U1 3'-end are processed at the distal histone 3'-end. This is a good assumption, since the histone H2a-614 3' processing signal is very efficient (13) and has been previously shown to be utilized efficiently on transcripts which initiate at the U1 promoter (4). The chimeric genes were introduced into CHO cells and pools of stable transformants assayed to determine the proportion of steady-state transcripts ending at the histone or U1 3'-ends. The genes were also injected into Xenopus oocytes to test expression in another cell type and to address the possibility that measurements of the relative amounts of transcripts with different 3'-ends in steady-state RNA in mammalian cells may not reflect the relative efficiency of 3'-end formation. Transcripts in Xenopus oocytes are generally stable and it has been possible to detect transcripts in oocytes which are often undetectable in somatic cells, such as the prematurely terminated myc (19,20) or α-tubulin transcripts (21).
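The efficiency measure described above reduces to a simple ratio of band intensities. The sketch below is a minimal illustration under the stated assumptions (similar transcript stabilities, and all read-through transcripts processed at the histone 3'-end); it further assumes that each protected fragment carries the same amount of label, so that band intensity is proportional to the number of transcripts. The function name and example intensities are hypothetical.

```python
def u1_end_efficiency(u1_signal: float, histone_signal: float) -> float:
    """Fraction of transcripts ending at the U1 3'-end.

    u1_signal and histone_signal are the quantified intensities of the
    two protected fragments (arbitrary but identical units).
    """
    if u1_signal < 0 or histone_signal < 0:
        raise ValueError("band intensities must be non-negative")
    total = u1_signal + histone_signal
    if total == 0:
        raise ValueError("no signal in either band")
    return u1_signal / total


# Hypothetical intensities, for illustration only.
print(u1_end_efficiency(98.0, 2.0))   # 0.98
print(u1_end_efficiency(15.0, 85.0))  # 0.15
```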
snRNA 3'-ends are formed efficiently on short transcripts
The UL190 and US146 genes each encode transcripts ending at the U1 3'-end which are in the same size range as snRNAs. There are two discrete fragments protected by the probe which are derived from the transfected genes, the shorter one ending at the U1 3'-end and the longer one ending at the histone 3'-end. At least 98% of the transcripts from the UL190 gene transfected into CHO cells ended at the U1 3'-end (Fig. 2A, lane 2). Transcripts ending at the histone 3'-end were barely detectable, and then only in long autoradiographic exposures.
snRNA 3'-ends are formed with moderate efficiency on transcripts of 280-400 nt
A series of genes was constructed that yield transcripts of 280-400 nt containing histone coding region and ending with either the last 49 nt of U1 snRNA (Fig. 1A) or the last 10 nt of U1 RNA (Fig. 1B). When we assayed the transcripts from these four genes in either CHO cells or Xenopus oocytes, we observed that only 50-70% of the transcripts ended at the U1 3'-end (Fig. 3A).
snRNA 3'-ends are formed inefficiently on long transcripts
Previously we showed that the UHUL and UHUS genes, which have a U1 promoter and 3'-end and a complete histone coding region, but do not have a histone 3'-end downstream, formed a small number of transcripts which ended at the U1 3'-end (4). Since these genes did not have a histone 3' processing site downstream, it was not possible to determine the efficiency of U1 3'-end formation. To determine the efficiency of U1 3'-end formation on these long transcripts and to determine whether the presence of the distal histone 3' processing signal affected usage of the U1 snRNA 3'-end, we compared expression of the UHU genes with the UL580 and US536 genes.
Efficiency of snRNA 3'-end formation is sequence independent
To rule out an effect of histone coding sequences, a series of genes encoding short (UG151), intermediate (UG350 and UG386) and long (UG586 and UG670) transcripts ending at the U1 3'-end were constructed (Fig. 1C). The UG151, UG350 and UG586 genes have no sequences in the transcribed region in common with the UL and US genes, other than the 3'-end and the first 5 nt of the U1 coding region. These genes were injected into Xenopus oocytes and the proportions of the transcripts ending at the U1 and histone 3'-ends were measured. The results were similar to those obtained for the UL genes. Over 80% of the transcripts from the UG151 gene ended at the U1 3'-end (Fig. 5A, lane 1), ~50% of the transcripts from the UG386 gene ended at the U1 3'-end and 50% at the histone 3'-end (Fig. 5A, lanes 2 and 3) and only 15% of the transcripts from the UG586 gene ended at the U1 3'-end.
Almost all the transcripts initiate at the U1 5'-end
One possible explanation for these results could be that the transcripts which end at the histone 3'-end are not initiated from the U1 snRNA promoter, but from some cryptic promoter. Transcripts initiating at a cryptic promoter upstream of the U1 start site do not direct U1 3'-end formation (4). To rule out the possibility that formation of histone 3'-ends was a result of initiation from a cryptic promoter, we have mapped the 5'-ends of the transcripts from all of the genes. Figure 6A shows the 5'-ends of the transcripts from the UL355 and US310 genes, mapped using a probe labeled just prior to the spot where the U1 3'-end was attached. A single protected fragment of 300 nt, the expected length for transcripts initiating at the U1 5'-end, was observed with both of these genes (Fig. 6A, lanes 2-5). Figure 6B shows the 5'-ends of the transcripts from the other genes. The transcripts were mapped using probes labeled at the 5'-end of the AvaI site at codon 20 of the H2a-614 coding region in the UL391-cod gene. The great majority of transcripts from these genes initiated at the U1 snRNA start site (Fig. 6B, lanes 1, 4, 5, 7 and 8). A small number of transcripts initiating upstream of the U1 promoter was detected (Fig. 6B, lanes 4 and 5; fragment U1'). No upstream starts were detected from the US146 or UL281 genes (Fig. 6B, lanes 1, 7 and 8).
Figure 6C shows the analysis of the 5'-ends of the transcripts from the UHUL, UHUS, US536 and UL580 genes, using an S1 nuclease assay with a probe which is labeled at the 5'-end of the NarI site (codon 45) of the histone H2a gene. More than 95% of the transcripts map to the U1 start site. Only a small fraction of the transcripts initiate ~200 nt 5' of the U1 start site (labeled U1' in Fig. 6C).
The assays in Figure 6A-C rule out the presence of large amounts of transcripts initiating at a defined site 5' of the gene. However, they do not rule out the possibility of a heterogeneous set of transcripts initiating upstream of the U1 promoter. These would not have been detected in the previous assay, since they would not yield a defined protected fragment. To assess the amount of transcripts which initiated upstream of the U1 promoter, we constructed a probe which contained 12 nt of heterologous sequence, 13 nt of the U1 promoter and 127 nt of coding region. This probe will map all the transcripts which come from upstream of the U1 start site as a single fragment 12 nt longer than the properly initiated fragment. The ratio of the two protected fragments gives the relative amount of properly initiated transcripts. The fragments were quantified on a PhosphorImager. Ninety-five percent of the transcripts from the UL580 gene initiated at the U1 start site (Fig. 6D, lane 5). Since the great majority of the transcripts from the UL580 gene ended at the histone 3'-end, most of the transcripts which ended at the histone 3'-end must have initiated at the U1 start site. Similarly, >85% of the transcripts from the UL391-cod gene were initiated correctly (Fig. 6D, lanes 3 and 4), in agreement with the results in Figure 6B (lanes 4 and 5).
Taken together these results indicate that there were not significant amounts of improperly initiated transcripts which contributed to these results.
DISCUSSION
The mechanism of formation of 3'-ends of the U series of snRNAs in vertebrates is unique among genes transcribed by RNA polymerase II. First, there is only a single sequence element which lies 3' of the snRNA sequence required for 3'-end formation (3,22). This is in contrast to the bipartite elements required for 3'-end formation found in both histone and polyadenylated mRNAs, which define a cleavage site located between them (23). Second, snRNA 3'-end formation in vertebrates is tightly coupled to transcription initiation and there is an absolute requirement for initiation from an snRNA promoter (1,2,24,25). The initial transcript from the snRNA genes is formed by transcription termination.
We have previously shown that the histone H2a sequence does not contain any cryptic U1 3' box signals (4), that the U1 promoter can efficiently drive expression of the histone mRNA and that the histone 3'-end is formed efficiently on these transcripts (4). Previously we showed that the U1 promoter expresses the α-globin protein efficiently in CHO cells (26). In the results reported here, we define a length requirement for efficient coupling of the U1 promoter to U1 3'-end formation. Since the human α-globin coding sequence and the histone H2a sequence are totally dissimilar, this phenomenon is independent of the sequences transcribed and dependent solely on the length of the transcribed region. These results depend on the transcripts initiating at the U1 promoter, since read-through transcripts which initiated elsewhere would all end at the histone 3'-end. The great majority of all the transcripts initiated at the proper U1 start site. Similar results have been seen with genes expressing β-galactosidase from these promoters (26).
The relative usage of the U1 and histone 3'-ends as a function of the length of the U1 transcripts in both CHO cells and Xenopus oocytes is summarized in Figure 7. U1 3'-ends are formed very efficiently on transcripts <200 nt, the size range of the vertebrate snRNAs. U1 3'-ends are formed with reduced efficiency (50-70%) on transcripts between 280 and 400 nt and are formed very inefficiently on transcripts >500 nt.
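The length classes summarized above can be written as a simple lookup. This is a sketch only: the function name is ours, the percentages are the summary ranges quoted in the text (>90% below 200 nt, 50-70% for 280-400 nt, 10-20% above 500 nt), and lengths falling between the reported classes return None rather than an interpolated value. Individual constructs may fall somewhat outside these summary ranges.

```python
from typing import Optional, Tuple


def reported_u1_efficiency_range(length_nt: int) -> Optional[Tuple[int, int]]:
    """Approximate range (%) of transcripts ending at the U1 3'-end.

    Returns None for transcript lengths outside the classes for which
    a summary range is quoted in the text.
    """
    if length_nt < 200:
        return (90, 100)   # reported as >90%
    if 280 <= length_nt <= 400:
        return (50, 70)
    if length_nt > 500:
        return (10, 20)
    return None            # 200-279 nt and 401-500 nt: no range quoted


for length in (146, 350, 580, 450):
    print(length, reported_u1_efficiency_range(length))
```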
Mechanism of coupling the promoter to 3'-end formation
Three features of 3'-end formation of snRNAs in vertebrates are a strict dependence on initiation from an snRNA promoter (1,2,24,25), a dependence on the length of the transcribed region (this paper) and that the 3'-end signal is not an essential part of the promoter, since absence of the 3' signal does not affect the level of expression from the U1 promoter (4). The mechanism coupling transcription initiation to 3'-end formation of the snRNA genes presumably evolved to allow efficient expression of these small transcripts. We note that the yeast Saccharomyces cerevisiae snRNAs, which are often much longer than vertebrate snRNAs, ranging in size from 1175 to 106 nt (27,28), are transcribed from polymerase II promoters similar to mRNA promoters and are transcribed efficiently from mRNA promoters (27,29). Thus it is likely that 3'-end formation of yeast snRNAs is not dependent on the length of the transcribed region. The absolute requirement for transcription from an snRNA promoter to form snRNA 3'-ends may be unique to vertebrates. There is not a strong coupling of the snRNA promoter to formation of the 3'-ends of invertebrate (sea urchin) snRNAs (30) or of plant snRNAs (31), although these RNAs are also the size of the vertebrate snRNAs. The 3'-end of sea urchin snRNAs, like the 3'-end of vertebrate snRNAs, is formed co-transcriptionally (30) and hence there is likely to be a similar length dependence for 3'-end formation of these RNAs.
There are two possible mechanisms for coupling transcription termination with
the promoter. First, a factor could bind the 3'-end signal in the DNA and then associate with the transcription initiation complex, presumably recognizing an essential component of the complex unique to snRNA genes (e.g. the factor which binds the proximal sequence element). Alternatively, a termination factor, which can specifically recognize the 3' signal, could associate with the transcription complex
during initiation, remain with the polymerase during elongation and stimulate
termination when the 3' signal is reached. The association between the factor and the
transcription complex may be weak and the termination factor might dissociate
(or be displaced) from the transcription complex on a long transcript before
the 3'-end signal (either as RNA or DNA) is reached. Either of these
mechanisms would be consistent with the requirement of the U1 snRNA promoter
for 3'-end formation.
Figure 7. Length dependence of formation of U1 3'-ends. The percentage of transcripts ending at the U1 3'-end is plotted as a function of the lengths of the transcripts ending at the U1 end. The data include analysis of the genes in both mammalian cells and Xenopus oocytes. The data were obtained by densitometry of the autoradiographs or from analysis on a PhosphorImager. The squares are the results for the UL and US genes in CHO cells, the diamonds the results for the UL and US genes in Xenopus oocytes and the circles the data for the UG genes in Xenopus oocytes.
Recently Price and co-workers have shown that there is a transition during in vitro transcription which converts the transcription complex from one that pauses and/or terminates readily into a complex that is highly processive and resistant to many pause sites (32,33). It is possible that the length dependence of U1 snRNA 3'-end formation is a result of the transcription complex undergoing a transition to a stable, 'committed' state, refractory to termination. Prior to reaching this length, transcription can terminate readily at the U1 3'-end signal. While this explanation could account for the length dependence, it fails to account for the coupling of 3'-end formation to the U1 promoter. The snRNA promoter may promote the initial assembly of a transcription complex which is particularly prone to terminating at the snRNA 3' box, while transcription complexes assembled on other promoters read through the 3' box sequences readily.
We have observed inefficient U1 3'-end formation on U1 transcripts synthesized in isolated nuclei from mouse myeloma cells, consistent with the possibility that the termination factor is easily lost during cell fractionation (34). Only if nuclei are prepared in such a way as to minimize loss of nuclear components (35) is efficient coupling of transcription and 3'-end formation observed in transcripts synthesized in isolated nuclei. Taken together these results suggest that a trans-acting factor required for snRNA 3'-end formation is associated with the transcription complex and is readily lost from it. The biochemical basis of snRNA 3'-end formation remains to be elucidated.
ACKNOWLEDGEMENTS
This work was supported by NIH grant GM27789 to W.F.M. and a grant from the
Muscular Dystrophy Association to B.K.K. T.C.I. was supported by NIH Training
Grant GM07092 to the Curriculum in Genetics and Molecular Biology.
REFERENCES
1 Hernandez,N. and Weiner,A.M. (1986) Cell, 47, 249-258.
2 Neuman de Vegvar,H.E., Lund,E. and Dahlberg,J.E. (1986) Cell, 47, 259-266.
26 Bartlett,J.S., Sethna,M., Ramamurthy,L., Gowen,S.A., Samulski,R.J. and Marzluff,W.F. (1996) Proc. Natl. Acad. Sci. USA, 93, 8852-8857.
27 Guthrie,C. (1988) In Birnstiel,M.L. (ed.), Structure and Function of Major and Minor Small Ribonucleoprotein Particles. Springer-Verlag, Berlin, Germany, pp. 196-212.