ABSTRACT
We have identified cDNAs encoding three related forms of transcription
elongation factor TFIIS (S-II) in Xenopus laevis ovary. Comparison of Xenopus
and mammalian sequences identifies likely diagnostic amino acids that
distinguish classes of vertebrate TFIIS. The diversity of TFIIS polypeptides in
Xenopus is due partly to the presence of two diverged genes in this tetraploid genome.
We isolated genomic clones containing one of the genes, xTFIIS.oA, and, unlike
a previously described vertebrate TFIIS gene, found that it contains introns.
Alternative splicing at a CAG/CAG motif containing the 3' splice site of intron 4
produces the third form of xTFIIS, which differs from
one of the others simply in lacking Ser109. Intron 6 of xTFIIS.oA contains
splice and branch site consensus sequences conforming to those of the minor
class of AT-AC introns and this was confirmed for the homeologous xTFIIS.oB gene by
genomic PCR. Other unusual but functional variants of RNA processing signals
were found in xTFIIS genes at the 5'
splice site of intron 8 and the polyadenylation hexanucleotides. Utilization of
multiple unusual processing signals may make the generation of mature xTFIIS.o
mRNAs inefficient and the possible regulatory consequences of this are
discussed.
INTRODUCTION
Control of the elongation phase of transcription has emerged recently as an
important mechanism of transcriptional regulation in eukaryotes (reviewed in
1
,
2
). The molecular mechanism(s) responsible for controlling transcriptional
elongation and the basis of its modulation are being studied from a variety of
standpoints (reviewed in
3
). First, it appears that events occurring at the promoter are important in
controlling the ability of RNA polymerase II to transcribe beyond the 5' region of the transcription unit. For several genes the elongation
properties of transcription complexes appear to be correlated with the presence/absence of
particular transcription activators or
cis
-acting promoter elements that are thought to exert their effects by directing the
assembly of transcription complexes with distinctive elongation competencies
(reviewed in
4
,
5
). A second focus is the nature of the intragenic and downstream signals that
cause pausing, arrest and termination of transcription and the manner in which
they interact with the elongation complex (reviewed in
2
,
6
). Finally, the properties and constitution of the elongation complex itself
have become a target of intensive investigation (reviewed in
7
); on the one hand these studies have led to a reassessment of the basic mechanics of RNA synthesis by core subunits of prokaryotic and eukaryotic RNA polymerases (
8
,
9
), while on the other they have revealed the existence of a variety of protein
factors that affect the activity of the RNA polymerase II elongation complex
(reviewed in
7
).
Some RNA polymerase II elongation factors, such as
Drosophila
factor 5 and its mammalian and yeast equivalents, TFIIF and RAP 30/74 (
10
-
12
), affect both initiation and elongation. Others, such as TFIIS/S-II (
13
,
14
), the elongin/SIII complex (
15
), TFIIX (
16
) and the yeast protein YES (
17
), affect only elongation. In addition to these ubiquitously expressed, general
elongation factors, another category of gene-specific elongation factors may be represented by the HIV Tat protein,
which acts to stimulate the production of full-length transcripts specifically from the HIV promoter (reviewed in
18
). Mechanistically, elongation factors can act either like TFIIF and SIII, by
increasing the overall catalytic rate of elongation, or like TFIIS, by
facilitating elongation through sites of transcriptional arrest. The most
detailed information on the mode of action of an elongation factor comes from
in vitro
studies of TFIIS (reviewed in
15
,
19
). Purified TFIIS binds to RNA polymerase II and stimulates it to read through a
variety of transcriptional blocks by activating an endogenous RNA endonuclease
activity in the polymerase core. It is thought that cleavage of the 3'-end of the nascent transcript, together perhaps with other TFIIS-induced changes in the polymerase (
20
), permits the arrested elongation complex to recommence transcription upstream
of the block and thence to make fresh attempts to read through the arrest site.
Despite the detailed understanding of TFIIS activity
in vitro
, as with other elongation factors its mechanism of action and its precise role
in vivo
are not clear. For instance, there are as yet no examples where TFIIS-responsive arrest sites have been implicated in regulating gene
expression, either in general or of particular genes or gene families. However,
recent evidence for the existence of tissue-specific forms of TFIIS suggests one means of addressing this issue. TFIIS
and cDNAs that encode it have been isolated from yeast (
21
),
Drosophila
(
22
) and several mammals (
23
,
24
). Among the latter, cDNAs encoding diverse TFIIS polypeptides have been
recognized and this is partly explained by tissue-specific expression of distinctive TFIIS isoforms. Thus a well-conserved, 'general' form of TFIIS has been identified in mouse (
25
) and human cell lines (
23
) as well as human kidney (
26
), whereas human testis and ovary express a form of TFIIS that is highly
diverged from the general form (
27
) but very similar to a TFIIS identified in rat testis (
24
). Furthermore, cDNA clones isolated from mouse liver predict two different
TFIIS polypeptides that are highly diverged from the general form (
28
). Further limited diversity among the mouse (
28
) and human (
29
) general TFIIS isoform has been tentatively identified, but may arise from cDNA
cloning artefacts (
26
) or ill-defined alternative splicing events (
29
), notwithstanding that the only information regarding mammalian TFIIS genes to
date suggests that they lack introns (
26
). The existence of distinctive isoforms of TFIIS raises the prospect of their
having different properties and/or regulatory targets, but critical evaluation
of these possibilities and of the role elongation factors in general play in
regulating gene expression
in vivo
requires development of appropriate assay systems.
We have previously used the amenable experimental system offered by
Xenopus
oocytes to investigate an aspect of the control of elongation manifested in the
frequent premature termination of transcription on microinjected genes (
30
,
31
). We wished to investigate the effects that various elongation factors might
have on this process and also to develop an experimental system to allow wider
investigations of the structure-function relationships of these factors. As a first step we set out to
characterize the types of TFIIS in oocytes. We have found that at least three
related forms of TFIIS mRNA are present in the
Xenopus
ovary. The three encoded polypeptides most closely resemble the general TFIIS
isoform found in mammalian cultured cells and matches between amphibian and
mammalian sequences define isoform diagnostic amino acids that may have
functional significance. The diversity of sequences expressed in the ovary
results from the activity of two different TFIIS genes and from an alternative
splicing event that provides or denies a single serine. Elucidation of the
latter hinged upon the discovery that one of the TFIIS genes of
Xenopus
is organised into 10 exons and also led to the surprising finding that multiple
unusual RNA processing signals exist in this gene. We discuss the implications
for the regulation of TFIIS gene expression of the need to utilize rare
processing signals and the possibility that TFIIS genes may themselves contain
TFIIS-responsive arrest sites.
MATERIALS AND METHODS
A probe corresponding to much of the C-terminal half of a human TFIIS coding region was generated by PCR of human
fibroblast cDNA. The cDNA was produced by reverse transcription of total RNA
from cultured fibroblasts using M-MuLV reverse transcriptase and the downstream PCR primer. PCR primers were
predicted from the sequence of a human general TFIIS isoform (
23
) with one or two changes to generate
Sac
I or
Sph
I sites respectively in the upstream primer (5'-ATGCTTGCTGGAGCTCTTCG-3') and downstream primer (5'-TGCCGCATGCGAACAAGTCA-3'). After PCR (30 cycles,
annealing step 50°C) and digestion with
Sac
I and
Sph
I, the ~350 bp fragment was gel purified and ligated into pUC18. After sequencing
to confirm its identity, the insert fragment was resected and gel purified
prior to labelling.
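The primer design described above can be checked with a few lines of code. The following is a minimal sketch that looks for the SacI (GAGCTC) and SphI (GCATGC) recognition sequences in the two primers exactly as quoted in the text; the function name `sites_in` is illustrative only and is not part of the original work.

```python
# Recognition sequences of the two enzymes named in the text.
# Both are palindromic, so scanning one strand is sufficient.
SITES = {"SacI": "GAGCTC", "SphI": "GCATGC"}

# Primer sequences as quoted in the text (5'->3').
UPSTREAM = "ATGCTTGCTGGAGCTCTTCG"
DOWNSTREAM = "TGCCGCATGCGAACAAGTCA"


def sites_in(primer):
    """Return {enzyme: 0-based start position} for each site present in the primer."""
    return {name: primer.find(seq) for name, seq in SITES.items() if seq in primer}


print("upstream primer:", sites_in(UPSTREAM))
print("downstream primer:", sites_in(DOWNSTREAM))
```

Each primer should report exactly one of the two sites, which is what allows directional cloning of the SacI/SphI-digested product into pUC18.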
A similarly sized fragment from a slightly more C-terminal region of TFIIS was generated by RT-PCR of total RNA from
Xenopus
kidney. Reverse transcription was primed with random hexanucleotides and PCR
carried out with upstream primer 5'-AACACGGATATGAAGTACAA-3' and downstream primer 5'-TTACCGCACTCATTGCA-3'. The gel-isolated fragment was
cloned into the pGEM-T vector (Promega Ltd) and the insert resected using flanking vector sites
prior to labelling. A second
Xenopus
TFIIS probe was obtained by digesting cDNA clone po2 (see below) with
Xba
I and
Xho
I to release the entire 1.3 kb cDNA insert. All probes were labelled with [α-32P]dCTP by random primed labelling.
The following libraries were utilized in this work: (i) a X.laevis
cDNA library constructed by Dr J.Sommerville (University of St Andrews) in λZAP (Stratagene Ltd) from RNA of stage I
ovary enriched for polysomes; (ii) a X.laevis
ovary cDNA library constructed by Dr R.Harland (University of California,
Berkeley) in λgt10; (iii) a stage 12 X.laevis
embryo cDNA library constructed in λgt10 by Dr R.Harland; (iv) a X.laevis
genomic library constructed in λEMBL4 by Dr E.Jonas (supplied by Dr T.Sargent, NIH) using DNA from red
blood cells of a homozygous diploid female. Libraries were hybridized with
random primed probes and washed at reduced stringency (65°C in 2× SSC) using standard techniques (32).
Inserts were re-cloned into plasmid vectors, either via excision from λZAP recombinants into pBluescript SK- (Stratagene) or via sub-cloning of complete or partial restriction fragments
from λgt10 recombinants into pGEM7Zf (Promega) or pT7T318U (Pharmacia).
The insert of cDNA clone pcr59 was produced by PCR (30 cycles, annealing at 55°C) from the λZAP ovary library using as primers kp9 (5'-CCACCCGAATTGGAATGTC-3') and kp5 (5'-GCAATTCTGCTCCGGATTCTG-3') and as the
template an amount of phage suspension representing ~100 000 p.f.u. PCR products were digested with
EcoRI and PvuII (which should cut within PCR products of xTFIIS.oA but not xTFIIS.oB cDNAs)
and cloned into pGEM7Zf. Similarly, PCR amplification of purified λ recombinants and whole libraries to ascertain the presence or absence
of a second EcoRI site in the xTFIIS.oB coding region was carried out using phage suspensions
as templates but utilizing primers kp4 (5'-GATCAACCAGCTCCTGCAC-3') and kp5. The extent of amplification was checked by
agarose gel electrophoresis of an aliquot of the reaction, the remainder being
purified by phenol/chloroform extraction and then digested with
EcoRI. The digested PCR fragments were visualised on 2.5% agarose gels.
PCR amplification of genomic DNA utilized 50 ng
X.laevis
liver DNA and primers kp11 (5'-CCACATTGCCATTGGTGC-3') and kp5. After phenol/chloroform purification of PCR
products the fragments were ligated into pGEM-T to generate clone pcrX13.
Sequences were obtained from both strands of double-stranded plasmid templates using vector or internal primers. Sequencing was carried out either manually using a T7 polymerase sequencing kit (Pharmacia) or by the University of Nottingham Sequencing Laboratory
with an automated sequencing instrument (Applied Biosystems Inc.). Sequence
manipulation and the alignments shown utilized applications from the DNASTAR
(Chiswick, London) Lasergene package. DNA sequences have been deposited with the EMBL Nucleotide Sequence Database (accession nos X97658-X97666).
RESULTS
We screened X.laevis
cDNA libraries using partial TFIIS probes derived by RT-PCR (see Materials and Methods). Two phage inserts derived from a λgt10 embryo library were recloned as plasmids pe1 and pe2 and two
from an ovary library constructed in λZAP as po1 and po2 (Fig. 1).
(We shall not attach biological significance to the isolation of cDNAs from
libraries of two different developmental stages, since we cannot be sure that
some transcripts represented in this early embryo library are not of maternal
origin.) Both pe1 and po2 appeared large enough to encode complete TFIIS
polypeptides and were characterized in detail. The coding regions of po2
(accession no. X97666) and pe1 were almost identical, with the differences
including a single synonymous substitution and a single non-synonymous substitution that presumably reflect allelic variation between
the frogs giving rise to the two libraries. However, the final, more puzzling
difference was that the three nucleotides (CAG) at positions 443-445 of the po2 coding region were absent from pe1, generating a second
EcoRI site in the latter. Targeted sequencing of pe2 showed that it resembled po2
in containing the CAG; the reason for the absence of the trinucleotide from pe1
is considered below. Conceptual translation of po2 and pe1 (Fig.
2
) revealed that they encode a polypeptide of 303/302 amino acids closely related
to the human and mouse general TFIIS isoforms. The mammalian and amphibian
polypeptides share ~80% amino acid identity.
We used the human TFIIS fragment and the entire po2 cDNA insert as probes to
screen a X.laevis
genomic library. Two positive clones, λG1 and λG9, with different restriction patterns were studied in detail;
plasmid subclones derived from both were sequenced (accession nos X97658-X97665) using vector primers and internal primers predicted from the po2
cDNA and from the genomic sequence itself. Although λG1 and λG9 do not quite overlap, together they contain the two halves of
a single xTFIIS gene. The coordinates of the exon sequences needed to assemble
the complete coding region of 909 bp are collated in X97658. Unexpectedly, this
vertebrate TFIIS gene spans at least 13 kb as a result of the presence of nine
introns. An overview of the organization of the xTFIIS gene derived by complete
sequencing through introns or by locating exons on the restriction maps of the λ inserts is presented in Figure 1.
Comparison of the xTFIIS coding sequences present in λG1 and λG9 with those of cDNA clone po2 showed 6.2 and 7.7% nucleotide
mismatch respectively and even greater divergence between the untranslated
regions. However, the C-terminal coding region contained in po1 matched exactly the sequence of
exons 7-10 present in λG1. We then isolated by PCR of the same ovary library a partial
cDNA designed to overlap λG1 and λG9 and to contain exons 4-7 of this gene (Fig. 1).
The sequence of clone pcr59 was identical to the corresponding exons of both
genomic clones, and we have called the ovary-expressed gene producing this cDNA and po1 xTFIIS.oA. Similarly, we refer
to the gene giving rise to the transcripts represented in po2, pe1 and pe2 as
xTFIIS.oB.
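The percentage nucleotide mismatch quoted above is a simple count over aligned positions. As an illustration only, a minimal sketch of such a calculation is given here; it assumes the two coding sequences are already aligned to equal length, and the function name and the short example sequences are hypothetical rather than taken from the deposited data.

```python
def percent_mismatch(seq_a, seq_b):
    """Percentage of aligned, ungapped positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * mismatches / compared


# Toy sequences (hypothetical): 2 differences in 12 aligned positions.
print(round(percent_mismatch("ATGGCTAAGAAC", "ATGGCAAAGAAT"), 1))
```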
Xenopus laevis
is an ancient allotetraploid species (
33
) possessing two homeologous genomes that last shared a common ancestor ~30-40 million years ago (
34
). The overall nucleotide divergence in the coding regions within various
homeologous gene pairs ranges from 4.5 to 24.9% (
33
). In addition to falling within this range of overall divergence (see above),
strong evidence that xTFIIS.oA and xTFIIS.oB are homeologues comes from
considering just synonymous substitutions, which provides a less variable
molecular clock for these comparisons. It has been shown (34) that the number of synonymous nucleotide substitutions per synonymous site (dS)
calculated by the method of Nei and Gojobori (35) for 18 X.laevis
homeologous gene pairs has an overall mean of 0.192 ± 0.017 (range 0.118-0.328); in codon-by-codon comparisons of xTFIIS.oA and xTFIIS.oB (po2
sequence) the value obtained for dS
is 0.179. Non-synonymous codon substitutions result in 25 amino acid differences between
the two predicted xTFIIS.o polypeptides (Fig.
2
). Most of the variation occurs in the central third of the polypeptide, as in
cross-mammalian comparisons, and therefore corresponds mainly to changes in
exons 4-6. Both xTFIIS.oA and xTFIIS.oB are more closely related to the general
form of mammalian TFIIS than to the TFIIS genes specifically expressed in
mammalian testis. Indeed, the general mammalian TFIIS isoforms have a greater
amino acid similarity to xTFIIS.o than to the mammalian testis isoforms. As
shown in Figure
2
, the pattern of variation observed in comparisons of amphibian and mammalian
TFIIS sequences allows the identification of putative diagnostic residues that
distinguish these two major classes of vertebrate TFIIS. In particular, within
the highly conserved C-terminal third of TFIIS, it is especially clear that the sequence of a
region preceding the zinc-ribbon motif (residues 250-300;
36
) has been conserved in mammalian and amphibian general isoforms but is
distinctive in the testis isoforms (Fig.
2
).
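For readers unfamiliar with the dS calculation referred to above, the following is a simplified sketch of the Nei and Gojobori approach: synonymous sites are counted per codon, synonymous differences are averaged over the possible mutational pathways, and the resulting proportion is corrected with the Jukes-Cantor formula. It is not the software used in this work. As simplifying assumptions, changes to stop codons are counted as non-synonymous when counting sites, and pathways passing through a stop codon are discarded; other implementations differ in these details, so the sketch should not be expected to reproduce the value of 0.179 exactly.

```python
from itertools import permutations, product
from math import log

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Number of synonymous sites in one codon (0 to 3)."""
    s = 0.0
    for i in range(3):
        for b in BASES:
            if b != codon[i]:
                mutant = codon[:i] + b + codon[i + 1:]
                if CODE[mutant] == CODE[codon]:
                    s += 1.0 / 3.0
    return s


def syn_diffs(c1, c2):
    """Synonymous differences between two codons, averaged over pathways."""
    positions = [i for i in range(3) if c1[i] != c2[i]]
    if not positions:
        return 0.0
    per_path = []
    for order in permutations(positions):
        current, syn, valid = c1, 0, True
        for i in order:
            step = current[:i] + c2[i] + current[i + 1:]
            if CODE[step] == "*":
                valid = False  # discard pathways through a stop codon
                break
            if CODE[step] == CODE[current]:
                syn += 1
            current = step
        if valid:
            per_path.append(syn)
    return sum(per_path) / len(per_path) if per_path else 0.0


def d_s(seq1, seq2):
    """Jukes-Cantor corrected synonymous distance for two aligned coding sequences."""
    sites = diffs = 0.0
    n = min(len(seq1), len(seq2)) // 3 * 3
    for i in range(0, n, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        if c1 not in CODE or c2 not in CODE or "*" in (CODE[c1], CODE[c2]):
            continue  # skip gaps, ambiguous bases and stop codons
        sites += (syn_sites(c1) + syn_sites(c2)) / 2.0
        diffs += syn_diffs(c1, c2)
    p_s = diffs / sites
    return -0.75 * log(1.0 - 4.0 * p_s / 3.0)


# Toy example (hypothetical sequences): one synonymous change in four Ala codons.
print(round(d_s("GCTGCTGCTGCT", "GCTGCCGCTGCT"), 3))
```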
Our discovery that in
Xenopus
TFIIS is encoded by a gene organized into multiple exons and the mapping of
exon/intron boundaries onto xTFIIS cDNAs (Fig.
2
) explains the absence from the pe1 cDNA of a CAG trinucleotide that is present
in po2, pe2 and pcr59 (see above). This triplet in fact represents the first
three nucleotides of exon 5 and it is preceded by another CAG triplet that
comprises the consensus 3' splice site of intron 4 (Fig.
3
a). Competition during step 2 of splicing between the two closely spaced AG
dinucleotides of the CAG/CAG motif at the intron 4/exon 5 boundary will result
in two alternatively spliced products. Although in a less favourable position and non-consensus sequence context (i.e. preceded by two purines), occasional use of this
unusual downstream splice site accounts for the specific omission of CAG from
the transcript represented by pe1.
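The consequence of choosing between the two AG dinucleotides can be illustrated with a toy model. The sequences below are hypothetical and were invented only to satisfy the constraints stated in the text (exon 5 begins with CAG, use of the downstream site removes one of three adjacent serines, and the shortened junction reads GAATTC); they are not the xTFIIS.oB sequence, and the function names are illustrative.

```python
from itertools import product

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical toy exons, NOT the real gene sequence.
EXON4 = "ATGGCTAAGAA"        # toy exon 4, ends ...GAA
EXON5 = "CAGTTCTTCAGAGTAA"   # toy exon 5, first three nt are the exonic CAG


def splice(skip):
    """Join exon 4 to exon 5; skip=0 uses the upstream AG, skip=3 the downstream AG."""
    return EXON4 + EXON5[skip:]


def translate(mrna):
    """Translate from the first nucleotide up to the first stop codon."""
    protein = ""
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein += aa
    return protein


for label, skip in (("upstream AG", 0), ("downstream AG", 3)):
    mrna = splice(skip)
    print(label, translate(mrna), "EcoRI site present:", "GAATTC" in mrna)
```

In this toy case the two products differ by a single serine, and only the shorter product carries the EcoRI recognition sequence across the exon 4/5 junction.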
We wanted to check that the downstream splice was not simply an event unique to
pe1 but that it occurs in other transcripts. To do this we made use of the fact
that absence of the CAG triplet generates a second
Eco
RI restriction site in xTFIIS.oB cDNAs (Fig.
3
a), although not in xTFIIS.oA cDNAs due to a base substitution in exon 4. We
used oligonucleotides kp4 and kp5 (Fig.
3
b) to prime PCR across exons 4-7, using as templates the xTFIIS cDNAs present in three different phage
libraries. The first was the embryo library from which pe1 and pe2 were
isolated, the second the ovary library containing po2 and pcr59 cDNAs and the
third a different ovary library constructed in λgt10. Digestion of the PCR products with
Eco
RI should produce a diagnostic fragment of 220 bp only if the more downstream
splice site in intron 4 is utilized (Fig.
3
b) and this was verified using the parental phages of pe1 and po2 as controls
(Fig.
3
c, lanes 2 and 3). Digestion products indicative of the presence of the splice-dependent
Eco
RI site were obtained from all three libraries (Fig.
3
c, lanes 4-6) and this suggests that the alternative splice pathway has been
utilized in xTFIIS.oB transcripts in the ovary as well as embryo libraries. The
polypeptides produced from alternatively spliced forms of xTFIIS.oB mRNA will
differ by the presence or absence of a single serine preceding Ser110 and
Ser111.
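The logic of the diagnostic digest can be sketched as a simple in silico restriction analysis. The amplicons below are hypothetical: filler sequence (runs of C) has been sized so that the splice-dependent EcoRI site lies 220 bp from a second, constitutive site, mirroring the fragment size quoted above; the real kp4/kp5 product and the relative order of the two sites are not reproduced here.

```python
ECORI = "GAATTC"  # EcoRI cuts between G and AATTC


def ecori_fragments(seq):
    """Fragment lengths after complete EcoRI digestion of a linear molecule."""
    cuts = []
    pos = seq.find(ECORI)
    while pos != -1:
        cuts.append(pos + 1)  # cut falls after the first G of the site
        pos = seq.find(ECORI, pos + 1)
    edges = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(edges, edges[1:])]


# Hypothetical amplicons built from filler sequence.
LEFT, MIDDLE, RIGHT = "C" * 100, "C" * 214, "C" * 50
LONG_FORM = LEFT + "GAACAGTTC" + MIDDLE + ECORI + RIGHT   # CAG retained
SHORT_FORM = LEFT + "GAATTC" + MIDDLE + ECORI + RIGHT     # CAG absent

print("CAG present:", ecori_fragments(LONG_FORM))
print("CAG absent: ", ecori_fragments(SHORT_FORM))
```

Only the amplicon lacking the CAG yields the 220 bp fragment, which is the basis for using that band as a marker of the downstream splice site.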
In defining the boundaries of exon 6 and exon 7 we compared the xTFIIS.oA
genomic sequence with its cDNA in the form of pcr59. This comparison predicted
unambiguously that the first and last dinucleotides of intron 6 were AT and AC
(Fig.
4
a), seemingly at odds with the GT-AG consensus splice sites found in the overwhelming majority of pre-mRNA introns. However, recently a minor class of introns that
possess AT and AC at their 5'- and 3'-ends respectively has been identified in vertebrate
and invertebrate genes (
37
,
38
). Despite only four such introns having previously been described, it is
already clear that this class is also characterized by the possession of distinctive, extended consensus sequences at their 5' and 3' splice sites and an almost invariant sequence at a fixed distance
from the 3' splice site, thought to represent the branch site sequence (
39
,
40
). The relevant regions of xTFIIS.oA intron 6 conform well to these consensus
sequences (Fig.
4
b), demonstrating that this intron is indeed a further example of the minor (AT-AC) class.
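The first step in the classification described above, inspection of the terminal dinucleotides of an intron, is easily automated. The sketch below checks only the intron termini; it does not test the extended splice site or branch site consensus sequences that were also used here to assign intron 6 to the minor class. The example introns are hypothetical toy sequences.

```python
def classify_intron(intron):
    """Classify an intron by its first and last dinucleotides only."""
    seq = intron.upper().replace("U", "T")
    termini = (seq[:2], seq[-2:])
    if termini == ("GT", "AG"):
        return "GT-AG (major class)"
    if termini == ("AT", "AC"):
        return "AT-AC (minor class candidate)"
    return "other ({}-{})".format(*termini)


# Hypothetical toy introns, for illustration only.
print(classify_intron("GTAAGTTTTTTTTTTCAG"))
print(classify_intron("ATATCCTTTTTTTTTTAC"))
```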
Due to the rarity of AT-AC introns and their important evolutionary and functional implications,
we wanted to establish whether a minor class intron existed at the same
position in the homeologous gene xTFIIS.oB. Assuming that the introns would be
of similar size in both genes (~600 bp), we used primers kp11 and kp5, which are derived from exons 6 and 7
of the xTFIIS.oB cDNAs, to amplify genomic DNA and then cloned PCR fragments of
the predicted length. One of the cloned fragments, pcrX13, matched the
xTFIIS.oB cDNAs exactly over its exon 6/7 sequence and contained a presumed
intron 6 interrupting these exons at the same position as in xTFIIS.oA.
Comparison of the intron 6 sequence from pcrX13 with that of xTFIIS.oA (Fig.
4
b) shows that despite base changes and deletions/additions in the body of the
intron, both possess similar AT-AC intron consensus sequences, although they differ at one nucleotide in
the branch site consensus. Presumably the splicing signals of both introns 6
are active in oocytes because fully processed cDNAs derived from both genes are
present in ovary libraries (see above).
Examination of the remaining exon/intron boundaries of the xTFIIS.oA gene (Fig.
4
a) reveals that the 5' splice site of intron 8 is non-consensus, comprising a GC rather than GT dinucleotide. This base
change was confirmed unequivocally by sequencing on both strands with several
different primers. More than 20 other naturally occurring introns possess this
5' splice site variant (
37
). In these examples, the rest of the 5' splice consensus has a particularly good match to the prototype sequence
defined by complementarity to U1 snRNA; likewise, the rest of the 5' splice site of xTFIIS.oA intron 8 conforms exactly to the prototype and
so, apart from the T→C change, it should be perfectly complementary to U1. The non-consensus splice site in xTFIIS.oA is clearly active, because clone
po1 contains a fully processed cDNA derived from this gene.
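The comparison with the prototype 5' splice site can be made explicit. The sketch below takes the prototype as CAG/GTAAGT (the last three exon nucleotides plus the first six intron nucleotides), which is the sequence fully complementary to the pairing region at the 5' end of U1 snRNA. The intron 8 site used in the example is an assumption reconstructed from the description in the text (prototype apart from the T→C change at the second intron position), not a sequence copied from Figure 4.

```python
PROTOTYPE = "CAGGTAAGT"   # positions -3..-1 (exon) and +1..+6 (intron)
U1_PAIRING = "ACTTACCTG"  # U1 snRNA pairing region, 5'->3', written as DNA


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def mismatches(site):
    """Return (position, prototype base, observed base) for each difference.

    Positions follow splice site convention: -3..-1, then +1..+6 (no zero).
    """
    found = []
    for i, (proto, obs) in enumerate(zip(PROTOTYPE, site.upper())):
        if proto != obs:
            position = i - 3 if i < 3 else i - 2
            found.append((position, proto, obs))
    return found


# Assumed intron 8 site, reconstructed from the text.
INTRON8_SITE = "CAGGCAAGT"
print(revcomp(PROTOTYPE) == U1_PAIRING)
print(mismatches(INTRON8_SITE))
```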
It also appears that the 3' RNA processing signals of xTFIIS.o genes contain variants of the
consensus polyadenylation hexanucleotide AATAAA. Alignment of the po1 3' untranslated region with the genomic sequence of xTFIIS.oA shows that
the site of polyadenylation is 33 bp downstream of the hexanucleotide GATAAA
(Fig.
4
c). Similarly, the polyadenylation signal used in transcripts of xTFIIS.oB
appears to be ATTAAA (Fig.
4
c). Both of these variants occurred in a survey (
41
) of naturally occurring 3' processing signals (1.1 and 12% of those surveyed respectively). We
think the use of multiple rare RNA processing signals in the production of
xTFIIS.o mRNAs has implications for the regulation of TFIIS gene expression,
which are considered below.
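Variant polyadenylation hexanucleotides of the kind described above can be located by scanning a 3' untranslated region for hexamers that differ from AATAAA at no more than one position. The following is a minimal sketch; the function name and the test sequences are hypothetical, with the first arranged so that GATAAA is followed by 33 nt of filler.

```python
def find_signals(utr, canonical="AATAAA", max_mismatch=1):
    """Return (0-based position, hexamer, mismatch count) for each candidate signal."""
    utr = utr.upper().replace("U", "T")
    hits = []
    for i in range(len(utr) - len(canonical) + 1):
        hexamer = utr[i:i + len(canonical)]
        n_mismatch = sum(a != b for a, b in zip(hexamer, canonical))
        if n_mismatch <= max_mismatch:
            hits.append((i, hexamer, n_mismatch))
    return hits


# Hypothetical toy 3' ends, for illustration only.
print(find_signals("CCTGGATAAA" + "C" * 33))
print(find_signals("GGATTAAAGG"))
```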
DISCUSSION
We have shown that two homeologous genes encoding isoforms of transcription
elongation factor TFIIS occur in the pseudotetraploid genome of
X.laevis
and that mRNAs of both are expressed in the ovary. Both the encoded
polypeptides show similar divergence relative to the general isoform of
mammalian TFIIS and so presumably both are equally likely to be functional.
Most of the amino acid sequence differences between the
Xenopus
gene products are in the N-terminal half of the polypeptide, which is dispensable for transcriptional
activity
in vitro
(summarized in 20). Hence, it will require the development of
in vivo
assay systems to answer the interesting question of whether the diverging
Xenopus
isoforms have acquired different functional specializations. However, our
comparisons of amphibian and mammalian general TFIIS isoforms with mammalian
testis isoforms have revealed a number of diagnostic amino acids in the C-terminal region that could potentially identify functional distinctions
between the two classes of TFIIS in existing
in vitro
assays.
We have also shown that nine introns and a number of unusual RNA processing
signals are present in xTFIIS.o genes. Although genomic sequences containing
several non-vertebrate TFIIS genes have been published or deposited in nucleotide
sequence databases (
21
,
42
; GenBank accession no. U20526, EMBL accession no. Z54216), those for vertebrate
TFIIS genes were previously limited to a single human example that is
apparently intronless (
26
). However, consideration of the locations of the TFIIS introns we have
described in
Xenopus
(Fig.
2
) strongly predicts that introns will be found in at least some of the TFIIS
genes expressed in mammals; among the three non-vertebrate TFIIS genes known to contain introns, the positions of both the
second intron in the
Schizosaccharomyces pombe
example (GenBank accession no. U20526) and the first intron annotated for a
Caenorhabditis elegans
TFIIS CDS (EMBL accession no. Z54216) exactly match that of the second intron
of xTFIIS.oA. Furthermore, two classes of TFIIS cDNAs isolated from human
libraries exhibit unusual structures relative to the apparently
bona fide
cDNAs that were obtained later and they have subsequently been assumed to be
cloning artefacts (
26
,
29
). Interestingly, the 63 bp in-frame deletion exhibited by one class (pHIIS44) corresponds exactly to
exon 2 of xTFIIS.oA, while the 49 bp insertion found in the other class
(pHIIS75) falls precisely between the boundaries of exons 2/3 defined in
xTFIIS.oA and is flanked by potential splice sites. Therefore, we think that
one human TFIIS gene must contain at least the first two introns defined in
Xenopus
and also that both classes of unusual cDNA represent types of misspliced mRNAs,
one in which exon 2 is skipped and one in which all or part of intron 2 is
retained.
Our analysis of xTFIIS.o genes has shown that unusual RNA processing signals are
involved in the splicing of exons 4/5, 6/7 and 8/9. In the first case a CAG/CAG
motif at the 3' boundary of intron 4 allows alternative splicing in which either the
first or second AG dinucleotide is utilized as the 3' splice site, hence, a CAG trinucleotide is provided or denied the mRNA
and this results in the presence or absence of a single serine in the mature
polypeptide. Similar motifs and alternative splicing scenarios have been noted
in several other genes, such as chicken apolipoprotein II (
43
) and human prothymosin α (
44
), although mature polypeptides are usually unaffected. We do not know whether
the reduction to just two adjacent serines in this region of the alternatively
spliced xTFIIS variant has functional consequences, but three or more adjacent
serines are conserved at this location in other examples of this vertebrate
TFIIS isoform and the N-terminal half of TFIIS is known to be subject to phosphorylation (
45
).
Other unusual RNA processing events are predicted to take place in introns 6 of
xTFIIS.oA and xTFIIS.oB. These introns contain sequences matching the 5' splice site, 3' splice site and branch site consensus sequences recently defined
for the class of rare AT-AC introns (
38
). The three other examples of these introns detailed in vertebrates at present
are intron 7 of the cartilage matrix protein gene, intron 6 of the gene
encoding proliferating cell nucleolar protein P120 and intron 6 of REP-3, a gene thought to encode a DNA repair protein. Taken together with the
xTFIIS examples it is intriguing that such introns appear to occupy positions 6
or 7 within their respective transcription units; this may be a coincidence
resulting from the low numbers so far recognized and, even if not, presumably
reflects a vertebrate-specific feature because a further example of these introns has been
described as the second of three in the
prospero
gene of
Drosophila
. It is also not readily apparent from this list whether the occurrence of AT-AC introns in a particular set of genes has an underlying logic and, of
course, it may simply be explained as a historical accident of intron
insertion. However, a possible common factor could be the generation of a novel
means for regulating expression of these genes. It has recently been
established that splicing of AT-AC introns involves distinct and correspondingly rare snRNAs, U11 and U12
(
39
,
40
), as well as some of the components of the major splicing machinery. The low
abundance of U11 and U12 snRNPs is reflected in slow assembly of these splicing
complexes
in vitro
and has raised the question of how they are able to compete
in vivo
with the major splicing apparatus for the shared components required for
efficient removal of AT-AC introns (
39
). This question is of particular relevance for considerations of xTFIIS.o gene expression because of the other non-consensus processing signals it possesses.
We have found that the 5' splice site of xTFIIS.oA intron 8 is GC rather than the consensus GT.
Although some GC splice sites have been shown experimentally to be functional,
they are cleaved more slowly than the GT sequence
in vitro
(
46
) and rank poorly in efficiency of usage
in vivo
(
47
), even in the context of a good match to U1 snRNA at other positions of the
splice site. Leaving aside the alternative splicing exhibited by exon 5, it
would appear that the presence of rare splice sites in two introns in a single
gene would conspire to make the production of mature xTFIIS.o mRNAs an
inefficient process. This tendency will be compounded by the presence of
variant 3' processing signals (Fig.
4
c). The GATAAA and ATTAAA variants of the consensus polyadenylation
hexanucleotide found for xTFIIS.oA and xTFIIS.oB respectively have been shown
to be functional but less efficient signals for both 3' cleavage and polyadenylation
in vitro
(
41
). However, there are circumstances in which apparently weak 3' RNA processing signals can be utilized efficiently and which suggest a
possible scenario for the effective use of the multiple weak processing signals
of xTFIIS.o
in vivo
. Potentiation of inefficient 3' processing signals has been achieved in experimental assays by the
inclusion of transcriptional pause signals immediately downstream of the
processing site (
48
,
49
). This effect is demonstrable when a weak processing signal has to compete
either against a stronger, downstream 3' processing signal or against a tendency to splice when the weak signal
is included in an intron. A simple rationale to explain these effects is that
transcriptional pausing allows more time for the weaker signal to operate in
the nascent transcript before competing RNA processing signals are transcribed
by the advancing transcription complex. We have argued that a natural example
of the potentiation of 3' processing by transcriptional pausing accounts for the efficient use of
a weak 3' processing signal by a
Xenopus
tubulin gene (
50
), although it is not clear whether direct competition between signals necessarily underlies the effect
in vivo
.
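The rationale given above, that pausing lengthens the time a weak signal has to act before a competing signal is transcribed, can be expressed as a simple kinetic sketch. The model is an assumption introduced here for illustration and is not derived from the cited experiments: the weak signal is taken to be processed with first-order kinetics (rate constant `k_weak`) during a window equal to the time needed to transcribe the distance to the competing signal plus any pause. All parameter values are hypothetical.

```python
from math import exp


def p_weak_signal_used(k_weak, distance_nt, rate_nt_per_s, pause_s=0.0):
    """Probability that a weak signal is used before a competitor is transcribed.

    Assumes first-order processing at rate k_weak (per second) during a window
    of distance_nt / rate_nt_per_s + pause_s seconds.
    """
    window_s = distance_nt / rate_nt_per_s + pause_s
    return 1.0 - exp(-k_weak * window_s)


# Hypothetical values: competitor 500 nt downstream, 25 nt/s elongation.
no_pause = p_weak_signal_used(0.01, 500, 25)
with_pause = p_weak_signal_used(0.01, 500, 25, pause_s=60)
print(round(no_pause, 3), round(with_pause, 3))
```

Under these assumptions a pause immediately downstream of the weak signal raises the probability that it is used, and a faster overall elongation rate lowers it.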
Although to our knowledge there is no analogous evidence that the rate of
transcription elongation affects the efficiency of splicing, it is clear that
at least in amphibian oocytes the splicing machinery does assemble on nascent
transcripts (
51
) and for some genes a variety of evidence suggests the occurrence of co-transcriptional splicing (
52
-
54
). It seems possible then that weak or unusual splicing signals may be used more
efficiently when the elongation complex is arrested, perhaps because this
retards the production of competing downstream splice signals, some of which
might otherwise be cryptic. According to this scenario, in situations where the
overall elongation rate is high, weak splice sites would be used inefficiently,
and this could lead, for instance, to the production of aberrant transcripts of
the type that have been described for mammalian TFIIS cDNAs. It would be of
interest to determine whether in general the rates of production of mRNAs that
require the utilization of rare or weak RNA processing signals can be
influenced by, or linked to, aspects of transcription elongation.
Interestingly, the occurrence in TFIIS genes of the TFIIS-sensitive arrest sites described in many eukaryotic genes (
19
) could provide a novel mechanism for TFIIS autoregulation
in vivo
, i.e. efficient read through of arrest sites at high cellular levels of TFIIS,
although stimulating overall rates of transcription elongation on TFIIS genes,
could actually result in decreased production of its mRNA because of the
inefficient use of multiple unusual RNA processing signals.
ACKNOWLEDGEMENTS
We are grateful to John Sommerville, Richard Harland and Tom Sargent for gifts
of libraries and to Marion Hamshere for fibroblast RNA. Many thanks also to
Paul Sharp for help with sequence analysis and to Evelyn Gurd for excellent
technical assistance. This work was supported by the Wellcome Trust (Prize
Studentship to KEP) and by an MRC project grant.


REFERENCES
