ABSTRACT
The major surface antigen of procyclic and epimastigote forms of Trypanosoma congolense in the tsetse fly is GARP (glutamic acid/alanine-rich protein), which is thought to be the analogue of procyclin/PARP in Trypanosoma brucei. We have studied two T.congolense GARP loci (the 4.3 and 4.4 loci) whose transcription is α-amanitin sensitive. Whilst a transcriptional gap 5' of the first GARP gene in the cloned region of the 4.4 locus could not be detected, such a gap was present in the 5' flank of the first GARP gene in the 4.3 locus. We have located a GARP transcription start site and, using reporter gene constructs containing a putative GARP promoter region in transient transfection studies, we have demonstrated promoter activity for the test region in T.congolense. There are species-specific differences in sequences regulating expression of the two major surface antigens, GARP and procyclin/PARP: the GARP promoter is inactive in T.brucei while the procyclin/PARP promoter is inactive in T.congolense. We have defined the splice acceptor site for the 4.3 GARP gene by sequencing and by 5' RT-PCR and demonstrated microheterogeneity in GARP polyadenylation by 3' RT-PCR. It appears that some GARP and procyclin/PARP RNA processing signals, although similar, are also species-specific.
Gene organisation in parasitic protozoa of the order Kinetoplastida is unusual
among eukaryotes in that genes are usually found grouped together in
polycistronic transcription units (
1
). Each polycistronic transcription unit has a single 5' promoter, and polycistronic pre-mRNAs are processed by the functionally-linked mechanisms of 5'
trans
-splicing and 3' polyadenylation (
2
-
5
). In many cases genes within a polycistronic transcription unit may be
differentially expressed, and therefore regulation of gene expression must take
place largely at the post-transcriptional level (
6
). Investigation of the mechanisms regulating gene expression in trypanosomes
has focused on the African trypanosome
Trypanosoma brucei
; very little is known about gene expression in other trypanosomatids and
whether this is similar to or different from that in
T.brucei
.
Another African trypanosome,
Trypanosoma congolense
(subgenus
Nannomonas),
has a life cycle similar to that of
T.brucei
(subgenus
Trypanozoon
), with two main phases, one in a mammalian host and the other in the insect
vector, the tsetse fly. When
T.brucei
bloodstream forms enter the tsetse fly midgut, differentiation to the procyclic
stage occurs (
7
). Concomitant with this transformation, parasites lose the variant surface
glycoprotein (VSG) coat, continual switching of which allows antigenic
variation in the bloodstream (
8
), replacing it with a new, densely packed surface coat composed of
procyclin/PARP (
9
-
12
). Procyclin/PARP is retained on the surface of insect stage parasites
(procyclic and epimastigote) until differentiation to the metacyclic stage
occurs in the salivary glands of the fly (
12
,
13
). It has been suggested that acquisition of procyclin/PARP might serve to
protect the procyclic form from the hostile environment of the tsetse fly
midgut and may also have a role in directing differentiating parasites from the
midgut to the salivary glands (
14
,
15
). The analogue of procyclin/PARP in
T.congolense
is GARP (glutamic acid/alanine-rich protein) (
13
,
15
). GARP displays several features similar to those of procyclin/PARP: the
protein is acquired during differentiation from bloodstream to procyclic
trypanosomes; it is abundant on the procyclic and epimastigote cell surface (
13
); it is acidic and glycosylated (
13
,
15
); and is attached to the cell membrane by a GPI anchor (Hulsmeier
et al.
, unpublished). However, GARP has almost twice the predicted polypeptide
molecular mass for procyclin/PARP and has a very different amino acid sequence.
DNA fragments homologous to two genomic loci containing at least one GARP gene
each have been cloned and characterised (
15
) (Fig.
1
). Sequence analysis has shown that GARP genes display no homology to
procyclin/PARP genes except for a 16 nt motif found in the 3' untranslated region (UTR) (
15
).
Cloned lines of
T.congolense
TREU1457 and 1/148 were used in these studies as described previously (
15
). Procyclic populations were grown in Eagle's MEM supplemented with 2 mM
Glutamax (Gibco-BRL) and 20% foetal calf serum (JRH Biosciences) at 27°C, in 5% CO2.
The plasmid subclones generated to perform this work are illustrated in Figure
1
, Figure
3
A and B and Figure
4
A. The routes of construction of each subclone and the specific oligonucleotides
used to amplify the inserted DNA in p4.35'flank+SA (Fig.
1
) and the inserted DNA in p4.35'flank (Fig.
4
A) are available upon request. All constructs are in pBluescript (KS
-
) except p4.3garp3'flank which is in pGEM3. The GARP 3'UTR included in pgarp3'utr was derived from the P4 cDNA clone which was isolated
from a different stock (1/148) of
T.congolense
than the genomic DNA-derived subclones (TREU 1457) used in transcriptional analysis. However,
the 3'UTR and 620 bp of GARP downstream intergenic region were subsequently
subcloned from λ4.3 to give p4.3garp3'flank, and sequencing showed a 96% identity between the sequence
of the two UTRs derived from the different stocks.
The recombinant plasmids for transient transfection studies were all derivatives
of pJP44, a
T.brucei
expression construct, which contains, in a 5' to 3' direction, the PARP B promoter, a PARP splice acceptor site, a
chloramphenicol acetyl transferase (CAT) reporter gene and the 3' end of the PARP B α gene to provide polyadenylation signals (
16
). The PARP promoter region was the 278 bp
Kpn
I-
Sma
I fragment; the splice acceptor region was the 90 bp
Sma
I-
Hin
dIII fragment and the 3'UTR was the 360 bp
Bam
HI-
Pst
I fragment (
16
). For the GARP constructs the 1095 nt region containing the putative GARP
promoter and splice acceptor region was amplified by PCR using the 5'E/Pv
Bam
HI oligonucleotide (5'-CGCGGATCCACTATCCTCCAACATGTG-3') (Fig.
4
A) and 3'5'congoprom oligonucleotide (5'-AGCTTCGTTGCACAATGTGTG-3') (Fig.
4
A) with
Pfu
DNA polymerase (Stratagene). The 3'UTR was the 465 bp
Bam
HI-
Kpn
I fragment at the 3' end of the GARP cDNA clone P4 (the
Kpn
I site is in the plasmid polylinker downstream of the 3' insertion site).
Other plasmid clones used were L29, a T.congolense ribosomal protein cDNA clone (R. Bayne, unpublished); pPRO2001, a T.brucei procyclin/PARP cDNA clone (9); pTbαβ-T1, a T.brucei plasmid clone containing an α- and β-tubulin repeat unit (24); pR4, a T.brucei ribosomal DNA repeat unit (25); and pActine, containing a T.brucei actin gene (26).
Sequencing was carried out on denatured double-stranded plasmid DNA using the dideoxy chain termination method either
conventionally (Sequenase kit: Amersham International) or by polymerase chain
reaction cycle sequencing on an Applied Biosystems automated sequencer.
Sequence for both strands of recombinant plasmids was obtained using the
recommended primers for pBluescript or specific primers synthesised on an
Applied Biosystems PCR-mate oligonucleotide synthesiser. Computer analysis was carried out using
the GCG sequence analysis software package.
Preparation and storage of nuclei and run-on reactions were carried out exactly as described (
25
,
27
). Procyclic run-on reactions were at 27°C, using α-amanitin at a concentration of 500 µg/ml in methanol. Nuclei (10^9/reaction) were pre-incubated with the drug in nuclei storage buffer for 10 min on ice (27). Hybridisations were carried out at 55°C in 3× SSC for 48 h and washes were to 0.1× SSC, 0.1% SDS at 65°C.
Standard procedures were used for DNA preparation, gel electrophoresis and
Southern blotting onto Hybond N membrane (Amersham International plc).
Immobilisation of nucleic acids onto filters was by UV irradiation. RNA was
prepared by lithium chloride/urea lysis of trypanosomes followed by phenol
extraction (
28
). Following DNase I treatment for 1 h in the presence of 100 mM NaCl, 6 mM MgCl2 and removal of the enzyme by phenol extraction, RNA was fractionated by electrophoresis on denaturing formaldehyde gels following denaturation of 5 µg total RNA by incubation for 10 min in the presence of 50% formamide, 2.2 M
formaldehyde (
29
). RNA was Northern blotted directly onto Hybond-N membrane and immobilised on the filter by UV irradiation. Radiolabelled
probes were prepared by either random hexanucleotide priming of restriction
fragments separated by electrophoresis in low melting point gels (
30
) or by
in vitro
transcription of the CAT gene cloned into the vector pBluescript (Stratagene
protocol handbook). Hybridisation with the random primed probes was carried out
at 42°C in 50% formamide, 5× SSC (1× SSC is 150 mM NaCl, 15 mM Na citrate), 5× Denhardt's solution, 0.1% SDS, 100 µg/ml herring sperm DNA and blots were washed to 3× SSC or 0.1× SSC, 0.1% SDS at 65°C. Hybridisation with the in vitro transcribed probes was carried out at 55°C in 50% formamide, 5× SET (1× SET is 150 mM NaCl, 10 mM Tris-HCl pH 7.5 and 1 mM EDTA), 5× Denhardt's solution, 50 µg/ml tRNA, 0.5% SDS and blots were washed at 65°C in 0.1× SET, 0.1% SDS. Removal of
hybridised probes was carried out as detailed in the Hybond protocol. Following
removal of probes filters were autoradiographed to check that no residual
hybridisation remained.
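The hybridisation and wash buffers above are all multiples of a 1× stock, so their component concentrations follow by simple scaling. The following is a minimal sketch of that arithmetic. It assumes the standard 1× SSC recipe (150 mM NaCl, 15 mM Na citrate) and takes the 1× SET composition as quoted in the text; the function and dictionary names are illustrative only.

```python
# Component concentrations (mM) of the 1x stocks.
# SSC: standard recipe (assumed); SET: as quoted in the text.
STOCKS_MM = {
    "SSC": {"NaCl": 150.0, "Na citrate": 15.0},
    "SET": {"NaCl": 150.0, "Tris-HCl pH 7.5": 10.0, "EDTA": 1.0},
}


def dilute(buffer: str, strength: float) -> dict:
    """Return the component concentrations (mM) of `buffer` at `strength`x."""
    return {
        name: round(millimolar * strength, 3)
        for name, millimolar in STOCKS_MM[buffer].items()
    }


if __name__ == "__main__":
    # Strengths used for hybridisation and washing in the protocols above.
    for buffer, strength in [("SSC", 5), ("SSC", 3), ("SSC", 0.1),
                             ("SET", 5), ("SET", 0.1)]:
        print(f"{strength}x {buffer}: {dilute(buffer, strength)}")
```

Scaling in this way makes the stringency difference between the 3× SSC and 0.1× SSC washes explicit as a thirty-fold drop in salt concentration.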
In vitro
transcription was carried out using T3 and T7 RNA polymerases, and the
pBluescript recombinant clone, p4.35'flank as described in the Stratagene pBluescript protocol handbook. RNase
protection and fractionation of protected fragments was carried out by standard
means (
29
). Total RNA (10 µg) was hybridised with radiolabelled RNA probes in 80% formamide, 40 mM PIPES, pH 6.4, 400 mM NaCl, 1 mM EDTA at 50°C overnight. Hybrids were digested with RNase A (10 µg) and RNase T1 (3 U) for 30 min at 30°C. Protected fragments were fractionated by electrophoresis in a 6%
acrylamide, 7 M urea gel, followed by autoradiography.
Supercoiled, CsCl-purified plasmid DNA (5 µg for CAT assays or 50 µg for RNA extractions per transfection cuvette) was electroporated
into procyclic culture cells exactly as described (
16
,
31
,
32
) with a single pulse of 1500 V, 25 µF capacitance from a BioRad Gene Pulser. Following electroporation parasites were transferred to 5 ml Eagle's MEM, 20% foetal calf serum per cuvette and cultured overnight (CAT assays) or for 5 h (RNA extractions) at 27°C in 5% CO2. CAT reactions were for 2 h at 37°C and assays were by xylene extraction (
32
,
33
). Transfections were performed in replicate. RNA was prepared from transiently
transfected cells by lysis in 3 M LiCl/6 M urea followed by phenol extraction (
28
). Prior to Northern blot analysis RNA was DNase I-treated by incubating up to 50 µg RNA in 100 mM NaCl, 50 mM Tris-HCl pH 8.0, 10 mM MgCl2, 40 U RNasin (Promega) and 100 µg/ml DNase I (RNase-free, Life Technologies) in a final volume of 100 µl for 1 h at 37°C followed by phenol/chloroform extraction and ethanol
precipitation.
Using a 5' RACE kit purchased from Life Technologies, 5' RT-PCR was carried out exactly as described in the protocol.
The primer for first strand synthesis was GARPgsp1 (5'-GCAGTGTGACCGCCATTAAGTGTAG-3') (Fig.
4
A) which is homologous to sequences 52-27 bp downstream of the start codon for the GARP gene. The cDNAs were
purified from primer and unincorporated nucleotides then tailed with an oligo-dC anchor. The first round of amplification was carried out with
oligonucleotide GARPgsp2 (5'-CGTTGCACAATGTGTGAAGAGGAGC-3') (Fig.
4
A) which is homologous to sequences 62-88 upstream of the start codon for the GARP gene and the anchor primer
supplied with the kit, that contains an oligo-dG anchor region attached to a universal amplification primer region. A
second round of PCR was carried out using the oligonucleotide GARPgsp3 (5'-CAAGCAGCGAGCGTGGCG-3') (Fig.
4
A) which is homologous to sequences 103-120 upstream of the start codon for the GARP gene, and the universal
amplification primer supplied with the kit. PCR amplification was performed for
35 cycles of 30 s at 94°C, 1 min at 55°C (first round of PCR) or 60°C (second round of PCR), and 1 min at 70°C in a final volume of 50 µl containing 50 mM KCl, 10 mM Tris-HCl (pH 8.3), 2.5 mM MgCl2, 100 µg/ml BSA and 100 pmol of each primer. PCR products were resolved by gel
electrophoresis in 1.5% agarose. PCR products were cloned into a T-vector system (Promega), and recombinant plasmids were sequenced using the
dideoxy chain termination method (Sequenase kit: Amersham International).
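The three gene-specific primers quoted above can be sanity-checked computationally before use. The sketch below takes the sequences as given in the text and computes their length, GC fraction and reverse complement (the sense-strand sequence each antisense primer anneals to). The helper names are illustrative and are not part of any kit or published protocol.

```python
# Gene-specific primers for 5' RT-PCR, copied from the text (5'->3').
PRIMERS = {
    "GARPgsp1": "GCAGTGTGACCGCCATTAAGTGTAG",
    "GARPgsp2": "CGTTGCACAATGTGTGAAGAGGAGC",
    "GARPgsp3": "CAAGCAGCGAGCGTGGCG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C residues in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.2f}, "
              f"target strand 5'-{reverse_complement(seq)}-3'")
```

A check of this kind confirms, for example, that GARPgsp1 is a 25mer, as stated in the description of the 5' RT-PCR results.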
First strand cDNA was synthesised from total RNA isolated from procyclic
T.congolense
using reverse transcriptase and the oligo [dT]-anchor primer PWM5ANC (5'-CGGTGGCAGCAGCCAACTTTTTTTTTTTT-3') (
3
). For determining the wild-type polyadenylation sites for GARP RNAs, cDNAs were PCR-amplified with oligonucleotides PWMEco (5'-CGAGAATTCGGTGGCAGCAGCCAACT-3'), an
Eco
RI-tailed anchor primer homologous to the oligo [dT]-anchor primer (
3
) and GARPSG1 (5'-CAGATGGTGCCCGTGCCGTGCTGAC-3') located 80 nt 5' of the GARP stop codon. One further round of
amplification was performed with PWMEco and GARPSG2 (5'-GAGGCGGGATCCCCCAGCTCA-3') located immediately 3' of the GARP stop codon (Fig.
4
B). For analysis of the polyadenylation site of CAT transcripts expressed from
transiently transfected
T.congolense,
first strand CAT cDNAs were synthesised from total RNA isolated from
T.congolense
cells transiently transfected with p5'garpCAT3'garp, or p-CAT3'garp as a negative control, and were hybrid-selected prior to amplification, using CAT DNA
fragments bound to nylon membrane, to exclude recombination with endogenous
GARP transcripts. CAT/GARP chimaeric cDNAs were amplified using
oligonucleotides PWMEco (5'-CGAGAATTCGGTGGCAGCAGCCAACT-3') (
3
) and CATSG4 (5'-GCCCGCCTGATGAATGCTCATCCGG-3'), 470 nt upstream from the CAT stop codon. A second
round of amplification was carried out by nested PCR with oligonucleotide
CATSG1 (5'-TGGCAGGGCGGGGGTAA-3') 18 nt upstream from the CAT stop codon and PWM5Eco.
PCR amplifications were performed for 35 cycles of 30 s at 94°C, 1 min at 60°C and 1 min at 70°C in a final volume of 50 µl containing 50 mM KCl, 10 mM Tris-HCl (pH 8.3), 2.5 mM MgCl2, 100 µg/ml BSA and 100 pmol of each primer. PCR products were resolved by gel
electrophoresis in 1.5% agarose.
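The anchored 3' RT-PCR strategy relies on the PCR anchor primer sharing sequence with the 5' portion of the oligo [dT]-anchor primer used for first strand synthesis. The sketch below illustrates that relationship using the PWM5ANC sequence and the PWMEco sequence as first quoted in the text. The function name is hypothetical, and the overlap search is simply one way of making the design explicit.

```python
# Primer sequences copied from the text (5'->3').
PWM5ANC = "CGGTGGCAGCAGCCAACTTTTTTTTTTTT"  # oligo(dT)-anchor, first strand synthesis
PWMECO = "CGAGAATTCGGTGGCAGCAGCCAACT"      # EcoRI-tailed anchor primer for PCR
ECORI_SITE = "GAATTC"


def shared_anchor(first_strand_primer: str, pcr_primer: str) -> str:
    """Return the longest 3' suffix of pcr_primer that is also a 5' prefix
    of first_strand_primer, i.e. the part of the PCR primer that can anneal
    to the complement of the anchor carried by every cDNA."""
    longest = min(len(first_strand_primer), len(pcr_primer))
    for size in range(longest, 0, -1):
        suffix = pcr_primer[-size:]
        if first_strand_primer.startswith(suffix):
            return suffix
    return ""


if __name__ == "__main__":
    anchor = shared_anchor(PWM5ANC, PWMECO)
    print(f"shared anchor: {anchor} ({len(anchor)} nt)")
    print(f"EcoRI site present in PWMEco: {ECORI_SITE in PWMECO}")
```

The 5' tail of PWMEco outside the shared region carries the EcoRI site and does not need to match the cDNA, since it is incorporated only from the second PCR cycle onwards.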
Previously we had isolated three lambda clones (λ4.3, λ4.4, λ4.5) containing sequences homologous to cDNA P4 which
encodes the GARP protein (
15
). Two cDNA P4-homologous regions are contained in λ4.3, while λ4.4 and λ4.5 contain one such region each. Partial
sequencing had shown that sequence identity was very high between the genes
from the 4.3 and 4.4 loci, at least in the 3' region of the open reading frame (
15
). Further sequence analysis has now demonstrated a high degree of sequence
similarity between the 5' ends of the GARP genes from each locus (Fig.
5
). Subsequent analysis of the genomic clones showed that λ4.5 was derived from the same locus as λ4.3 (data not shown). Partial maps of the 4.3 and 4.4 GARP loci
are shown in Figure
1
.
We synthesised 32P-labelled nascent transcript probes by in vitro run-on using nuclei pre-incubated with α-amanitin for 10 min on ice (Fig. 2, + α-amanitin) or methanol, the solvent for the drug (Fig. 2, - α-amanitin). The nascent transcript probes were hybridised to identical Southern blots of restriction digests of L29, a T.congolense cDNA clone encoding a ribosomal protein (lanes 1 and 5) (R. Bayne, unpublished), the P4 GARP cDNA clone (15) (lanes 2 and 6), a T.brucei α- and β-tubulin DNA repeat unit (24) (lanes 3 and 7) and a T.brucei ribosomal DNA repeat unit (27) (lanes 4 and 8). The P4 GARP cDNA insert hybridised only with the probe synthesised in nuclei which had not been pre-incubated with α-amanitin (Fig. 2, lane 6). There was no detectable hybridisation when the cDNA insert was hybridised with transcripts from α-amanitin-treated nuclei (Fig. 2, lane 2). Hybridisation to T.brucei tubulin sequences was also decreased in the presence of the drug (Fig. 2, compare lanes 3 and 7), showing that α-amanitin was inhibiting RNA polymerase II. As expected, there was no decrease in RNA polymerase I transcription of ribosomal DNA (lanes 4 and 8). Transcription of the T.congolense L29 ribosomal protein gene(s), however, may be insensitive to the drug, since we consistently found that transcription of L29 sequences was not completely inhibited by α-amanitin (lanes 1 and 5).
We performed nuclear run-on analysis to determine whether there was a transcriptional gap, and
therefore a putative promoter, located 5' of the 4.3 or 4.4 GARP loci. The
32
P-labelled nascent transcripts run-on in nuclei isolated from procyclic
T.congolense
were hybridised to Southern blots of a
Hin
dIII digest of pE/Pv4.4 (the inserted DNA in this subclone is a
Pvu
II fragment containing the
Eco
RI-
Pvu
II fragment shown in Figure
1
but flanked 5' with a
Pvu
II-
Eco
RI fragment from the very 3' end of the left hand arm of the lambda clone. The DNA is inserted in the
Hin
cII site in pBluescript, thus the second
Hin
dIII site is in the 5' polylinker of the plasmid) (Fig.
3
B and E) or to a
Bst
XI digest (there is a
Bst
XI site in the 5' polylinker of the plasmid) of pE/Pv4.3 (Fig.
3
A and C). For the 4.4 locus, hybridisation was detected to both
Hin
dIII fragments (Fig.
3
F) indicating that transcription occurs across this region. The larger
Hin
dIII fragment contains 2960 bp of plasmid sequence plus 1070 bp of GARP-related sequence, and we cannot discount the possibility that there may be
another gene 5' of the GARP gene in the 4.4 locus but part of another transcription
unit. In this case a small gap in transcription may be present within this
fragment which would not be detected in this crude assay. For the 4.3 locus we
detected hybridisation only to the larger
Bst
XI fragment (2.96 kb of plasmid DNA and 728 bp of GARP-related sequence) (Fig.
3
D). The run-on probe did not hybridise to plasmid DNA alone (data not shown). Since we
did not detect hybridisation to the smaller, upstream fragment (572 bp), a
transcriptional gap must exist upstream of the 4.3 GARP locus, and it is
reasonable to assume that transcription of the 4.3 GARP locus initiates, and a
promoter for GARP gene transcription may be present, within the cloned region 5' of the 4.3 GARP genes around or downstream of the
Bst
XI site.
Having located a transcriptional gap 5' of the 5'-most GARP gene in the 4.3 locus, we localised the transcription initiation
site by RNase protection (Fig.
4
) and 5' RT-PCR. For RNase protection analysis we used the recombinant
pBluescript plasmid template, p4.35'flank. The inserted DNA in this subclone is homologous to the 1.1 kb of
sequence flanking the splice acceptor site in λ4.3 (Fig.
5
A) and was amplified by PCR using primers 5'E/Pv
Bam
HI and 3'E/Pv
Eco
RI (Fig.
5
A) to generate a
Bam
HI site at the 5' end and an
Eco
RI site at the 3' end. This subclone was used to synthesise
in vitro,
labelled sense transcripts (T7 RNA polymerase), or antisense transcripts (T3
RNA polymerase) homologous to the putative 5' end of the GARP transcription unit. The sense and antisense
radiolabelled transcripts were hybridised to either total
T.congolense
RNA or to
Escherichia coli
tRNA. Following RNase digestion of hybrids and fractionation of digestion
products on a sequencing gel, a major protected fragment of around 440 bases
and a minor product at around 330 bases were observed when the antisense T3 RNA
polymerase-synthesised transcripts from p4.35'flank were hybridised with total
T.congolense
RNA (Fig.
4
B, lane 1) but not when the same transcripts were hybridised with
E.coli
tRNA (Fig.
4
B, lane 2). Hybridisation of T7 RNA polymerase-synthesised sense transcripts with either total
T.congolense
RNA (Fig.
4
B, lane 3) or
E.coli
tRNA (Fig.
4
B, lane 4) gave no protected product, as expected. If the large protected
fragment represents a primary transcript for the 5' end of the 4.3 GARP locus then this places the transcription initiation
site around 460 bp upstream of the start of the cDNA. The minor protected
fragment may be a degradation product of the major fragment; it may represent
transcription from a secondary promoter; or it may be a fragment from another
GARP locus which we have not yet identified, that has a shorter region of
homology with 5' sequences of the 4.3 GARP locus.
Next we used 5'RT-PCR to confirm this result and to locate the specific
nucleotide(s) where transcription initiated. The oligonucleotide used to direct
first strand cDNA synthesis was a 25mer homologous to nucleotides 50-25 3' of the ATG start codon of the 4.3 GARP gene (GARPgsp1; Fig.
5
A). Two major products were obtained, one at around 130 nt corresponding to the
spliced mature RNA and one at around 550 nt (data not shown). Two further
oligonucleotides (GARPgsp2, gsp3) were used to prime two subsequent rounds of
PCR, and these are indicated by long arrows below the sequence in Figure
5
A. A single major PCR product was obtained (around 450 bp) and amplified DNAs
were cloned and sequenced. All five clones sequenced gave the same initiation
site corresponding to a G residue 466 nt upstream of the cDNA start site
indicated by the small arrowhead in Figure
5
A. There are no significant open reading frames within the entire 1.1 kb
fragment 5' of the cDNA start, and stop codons are present in all frames.
The site for addition of the spliced leader sequence was predicted to be about
20 bp upstream of the start of the cDNA homologous to the GARP gene in the 4.4
locus (
15
). RNase protection studies and 5'RT-PCR indicated that a spliced leader addition sequence was located
at the same distance upstream of the first GARP gene in the 4.3 locus (AG in
bold type at position 1088 in Fig.
5
A) (data not shown). Sequencing upstream of the AG dinucleotide splice acceptor
site for both loci revealed that there was a 258 bp region with 96% sequence
identity between the two GARP gene loci (running 3' from the square bracket in Fig.
5
A) 5' of which the homology dropped to 48% identity. Since these homologous
sequences contain the splice acceptor sites for the 5' GARP genes in both loci it is probable that these regions contain the
sequences necessary to direct
trans
-splicing of the GARP genes in
T.congolense
. We used 3'RT-PCR to determine where GARP transcripts were polyadenylated.
Figure
5
B shows the 3' UTR and some 350 bp of intergenic region downstream. First strand cDNA
synthesis was primed with oligo [dT] then two nested oligonucleotides, the
first homologous to a sequence at the 3' end of the GARP coding region and the second homologous to a sequence at
the 5' end of the 3' UTR of GARP (Fig.
5
B) were used to prime two subsequent rounds of PCR. A single major PCR product
was obtained and amplified DNAs were cloned and sequenced. We found a range of
sites of polyadenylation within a 25 nt region in the 3' UTR of GARP mRNAs which was located 468-493 nt downstream of the translation stop codon (arrowheads in
Fig.
5
B). Sequencing of part of the intergenic region downstream revealed a
distribution of sequence motifs (underlined and circled in Fig.
5
B) similar to those found for intergenic regions in
T.brucei
(
4
).
In order to test whether the GARP putative promoter region was functional and to
determine whether sequences involved in regulating GARP gene expression in
T.congolense
could be recognised in
T.brucei,
we carried out a series of transient transfection assays (Fig.
6
). We compared the ability of the procyclin/PARP promoter and the putative GARP
promoter to drive expression of a CAT reporter gene in the constructs shown in
Figure
6
A and B. When we transiently transfected
T.congolense
cells with the construct p5'garpCAT3'garp (644 bp of sequence 5' of the transcription start site and the entire putative
splice acceptor site with the insert of pgarp3'utr as the 3' UTR) we could not detect CAT activity, even with a range of
protease inhibitors included in the transfection buffer and the cell lysis
buffer, but CAT RNA was readily detectable (Fig.
6
A, lane 4). This indicated that we had indeed identified a region of sequence 5' of the first GARP gene in the 4.3 locus which could act as a promoter in
this assay. Either CAT enzyme is highly unstable in
T.congolense
cells or the CAT RNA is not able to be translated. Thus, rather than assaying
CAT enzyme activity we analysed the abundance of CAT transcripts in the
transiently transfected cells. The GARP constructs numbered 1, 2 and 4 in
Figure
6
A contained a GARP 3' UTR isolated from cDNA P4. In
T.brucei
it has been shown that, in transient transfection experiments, inclusion of
only the 3' UTR of a procyclin/PARP cDNA downstream of the CAT gene is not
sufficient to specify positionally accurate polyadenylation of CAT transcripts
(
4
,
5
,
20
). However, it does allow polyadenylation of PARP transcripts but at a site
around 100 bases 5' of the site used
in vivo
(
20
). For
T.congolense
we found by 3' RT-PCR that the 3' UTR used in the transient transfection studies did direct
polyadenylation but at a novel site 120 bases 5' of the polyadenylation site used in the cDNA clone P4. Inclusion of a
further 620 bp 3' of the endogenous polyadenylation site which includes part of the
intergenic region flanking the next GARP gene downstream had no effect on CAT
RNA abundance in transiently transfected cells (data not shown), but still did
not allow us to detect CAT enzyme activity. This intergenic region includes
several sequences similar to the types of signal (an extensive polypyrimidine
tract, an AG dinucleotide putative splice site flanked 3' by a short polypyrimidine tract, Fig.
5
B) which have been shown to be important for regulation of polyadenylation in
T.brucei
(
4
).
As negative controls for promoter activity we used the same plasmids from which
we removed the promoter regions but retained the splice acceptor regions: for
GARP this was the region downstream of the square bracket in Figure
5
A (p-CAT3'congo; Figure
6
A lane 2). For PARP the splice acceptor region was the
Sma
I-
Hin
dIII fragment from pJP44 (p-CAT3'parp; Fig.
6
B lane 6). No CAT transcripts were detected in either
T.congolense
or
T.brucei
cells transiently transfected with these constructs (Fig.
6
C, lane 2, Fig.
6
D, lane 6). Next we asked if the procyclin/PARP promoter could direct CAT
expression in transient transfection of
T.congolense
and conversely, if the GARP putative promoter region was operative in driving
CAT gene expression in transient transfection of
T.brucei
(Fig.
6
A construct 1, Fig.
6
B construct 5). Initial studies using p5'parpCAT3'parp to transiently transfect
T.congolense
and p5'garpCAT3'garp to transiently transfect
T.brucei
yielded no CAT RNA (data not shown). Similarly, Figure
6
C lane 1 shows that steady-state levels of CAT transcripts were not produced using the PARP
promoter/splice acceptor site in p5'parpCAT3'garp in
T.congolense
transient transfection. Figure
6
D lane 5 shows that no CAT transcripts were detected in
T.brucei
cells transiently transfected with the GARP promoter construct p5'garpCAT3'parp. Finally, we had observed that the 3' UTRs of GARP and procyclin/PARP transcripts shared a 16mer
motif at approximately the same distance upstream of the poly(A) addition site
(
15
). This suggested that in both species, a similar mechanism might operate to
regulate procyclin/PARP and GARP gene expression mediated through the 3' end of the mRNAs. It was therefore possible that sequences at the 3' end of the gene were not entirely species-specific. To test this possibility we exchanged the 3' UTRs of the GARP and procyclin/PARP cDNAs in the
constructs p5'parpCAT3'parp and p5'garpCAT3'garp to give p5'parpCAT3'garp and p5'garpCAT3'parp. The result shown
in Figure
6
C, lane 3 indicates that replacement of the GARP gene 3' UTR with the corresponding region of the procyclin/PARP gene does not
allow efficient expression of CAT in transient transfection experiments in
T.congolense
. Similarly, Figure
6
D, lane 7 indicates that replacement of the procyclin/PARP 3' UTR by the GARP gene 3' end does not allow efficient expression of CAT in transient
transfection of
T.brucei
. In all these experiments, replicate transient transfections gave reproducible results. We also assayed CAT activity in
T.brucei
, transiently transfected with the constructs shown in Figure
6
B and obtained results consistent with those obtained by measuring CAT RNA
abundance in Figure
6
D. Rehybridisation of the Northern blots in Figure
6
C and D with a
T.brucei
actin probe showed that failure to detect CAT transcripts in tracks 1-3 and 5-7 was not due to lack of RNA in each track (Fig.
6
E and F). These experiments demonstrate that in both flanks there are
significant species-specific differences in sequences regulating gene expression in
T.brucei
and
T.congolense
.
We have shown that transcription of GARP genes in
T.congolense
is sensitive to α-amanitin. We have identified a gap in transcription upstream of the
5'-most GARP gene in the 4.3 locus and localised a transcription
initiation site for this gene. The putative promoter thus defined appears to be
able to drive transcription of a CAT reporter gene when the gene is flanked 5' by a GARP gene splice acceptor site and 3' by a GARP gene 3' UTR. This is the first report of the cloning and
characterisation of a promoter for a gene in
T.congolense
and the first identified promoter in trypanosomes which directs RNA polymerase
II-like transcription.
The GARP putative promoter region has no significant homology with any
T.brucei
promoter sequence, including the procyclin/PARP promoter. It is
located much further upstream (504 bp) of the GARP start codon in
T.congolense
than the promoter reported for procyclin/PARP genes in
T.brucei,
which is around 86 bp upstream of the start codon of the first gene in the PARP
A locus (
16
). The splice acceptor AG dinucleotide is located 60 nt upstream of the translation start codon
for both the 4.3 and 4.4 GARP loci while for the 5' procyclin/PARP gene in the PARP A locus the distance is only 30 nt (
34
). In many eukaryotes the splice acceptor site at an AG dinucleotide is preceded
5' with a pyrimidine-rich tract and it has been shown that this is also the case for
some kinetoplastid genes (
2
,
3
,
35
,
36
) including procyclin/PARP genes where there is a 26/29 pyrimidine tract very
close to the splice acceptor site (
34
). We have not observed extensive polypyrimidine tracts within the putative
splice acceptor regions for the GARP loci we have studied, although for the 4.3
locus the region 5' of the splice acceptor site is 66% TC-rich over the first 100 nt. There is a 9-pyrimidine tract 17 nt upstream of the splice acceptor site
and two further short pyrimidine tracts (>5 nt) 31 and 107 nt upstream
(overlined, Fig.
5
A). Mutational analysis will be necessary to determine the importance of these
sequences in
trans
-splicing in
T.congolense
. However, polypyrimidine motifs may not be entirely necessary since experiments
with deletion mutations in the dihydrofolate reductase-thymidylate synthase/DST intergenic region of
Leishmania major
have shown that splice acceptor sites lacking a strong polypyrimidine tract
immediately upstream can still be used efficiently (
2
).
We have also mapped the polyadenylation sites for the GARP genes in
T.congolense
. While procyclin/PARP transcripts have a single major site of polyadenylation (
4
,
5
) GARP transcripts appear to be polyadenylated differentially over a region of
25 bases. We cannot rule out the possibility that the different sites are
specific to different individual GARP genes. Such microheterogeneity has also
been observed for genes in
Leishmania
(
2
,
37
-
40
) and
T.brucei
(
3
,
41
) but in the absence of information on other genes in
T.congolense
it is not possible to determine whether this is a feature of polyadenylation in
this species or whether it is peculiar to GARP transcripts. Recent studies have
indicated that accurate polyadenylation of transcripts from polycistronic
transcription units in Kinetoplastida is dependent on sequence motifs, located
downstream of the gene sequence, in the intergenic region (
2
-
5
). One study, in which the nt sequences of several intergenic regions in
T.brucei
were compared, revealed a similar organisation of related motifs at a fixed
distance downstream of the polyadenylation sites for each gene (
4
). Of the four elements identified and proposed as potential contributors
to the specification of accurate polyadenylation, three are also present in the
intergenic region downstream of the 5'-most GARP gene in the 4.3 locus in
T.congolense
. These are (i) an intervening sequence of 80 nt between the poly(A) addition
site and the first polypyrimidine tract, (ii) a polypyrimidine tract followed by the trinucleotide YAG, and
(iii) a further polypyrimidine tract a short distance downstream followed by
another YAG sequence (Fig.
4
B). Thus although our transient transfection experiments suggest that there are
significant cross-species differences in sequences regulating gene expression in the two
African trypanosomes, there may be conservation of intergenic region signals
directing polyadenylation between the two species.
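A search for motifs of this kind downstream of a poly(A) addition site can be sketched as follows. This is an illustrative sketch only: the function name, the minimum tract length, the permitted gap between tract and YAG, and the toy sequence are all assumptions of ours and are not values taken from the studies cited.

```python
import re


def tracts_followed_by_yag(downstream: str, min_tract: int = 8, max_gap: int = 10):
    """Find polypyrimidine tracts that are followed by a YAG trinucleotide.

    `downstream` is the sequence 3' of the poly(A) addition site, so index 0
    is the first nt after the site. Returns a list of
    (tract_start, tract_length, yag_start) tuples, all 0-based.
    `min_tract` and `max_gap` are hypothetical thresholds.
    """
    downstream = downstream.upper()
    hits = []
    for tract in re.finditer(r"[CT]{%d,}" % min_tract, downstream):
        # Start at the last tract base so that a YAG whose Y is that base counts.
        window_start = tract.end() - 1
        window = downstream[window_start:tract.end() + max_gap + 3]
        yag = re.search(r"[CT]AG", window)
        if yag:
            hits.append((tract.start(),
                         tract.end() - tract.start(),
                         window_start + yag.start()))
    return hits


# Hypothetical toy sequence (NOT the GARP intergenic region): a short spacer,
# a tract ending in YAG, and a second tract with a YAG a few nt downstream.
toy_downstream = ("GAAGGAGAAG" + "TTCTTCCTCT" + "AGGAAGGA"
                  + "CCTTTCTCC" + "GATAGGA")

hits = tracts_followed_by_yag(toy_downstream)
```

Because index 0 is taken to be the first nt after the poly(A) addition site, `tract_start` for the first hit is also the length of the intervening sequence between the site and the first tract.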
Although we were unable to obtain and assay CAT activity from transiently
transfected
T.congolense
, CAT transcripts were readily detectable. Either the CAT enzyme is highly unstable
in
T.congolense
cells or inactive in extracts, or the CAT RNA cannot be translated. In
the constructs we used initially, only the GARP 3' UTR was used to specify
polyadenylation. CAT transcripts produced were polyadenylated but not at the
wild-type site. It is possible that aberrantly polyadenylated transcripts could
be inefficiently translated leading to undetectable levels of CAT enzyme
activity.
We did not know whether
T.congolense
followed the pattern for
T.brucei
and
Leishmania,
where a downstream splice acceptor site and pyrimidine-rich sequences downstream of the polyadenylation site are required for
correct polyadenylation of transcripts (although our sequence analysis
suggested this may be true). Therefore, we tested whether inclusion of a
portion of the first intergenic region in the 4.3 GARP locus (Fig.
4
B), including the sequence motifs similar to those necessary for accurate
polyadenylation of procyclin/PARP transcripts in
T.brucei,
would result in our being able to detect CAT activity. However, when we
included in our constructs the GARP 3' UTR and a further 650 nt downstream (the insert in the construct
pCG4.33garp, Fig.
1
) we still obtained no CAT activity (data not shown). CAT expression must be
blocked at translation or downstream in these cells.
Our results indicate that, despite the assumed relatively close species
relationship between
T.congolense
and
T.brucei
, sequences important in regulating expression of the major surface antigen of
the procyclic form in these organisms are rather different. We found that the
T.congolense
GARP promoter was inactive in driving CAT expression in transient transfection
assays in
T.brucei
where CAT constructs contained a GARP splice acceptor site but a PARP 3' UTR. It is possible that in
T.brucei
the GARP promoter is active, but the GARP splice acceptor site inactive,
leading to the production of unstable primary transcripts for CAT. Similarly,
the apparent inactivity of the procyclin/PARP promoter in
T.congolense
could be due to the PARP splice acceptor region not being recognised. However,
a construct with a GARP putative promoter region and a PARP splice acceptor
site gave no CAT RNA or CAT activity in
T.brucei
(data not shown), suggesting that the source of the splice acceptor site used
has no effect in these constructs. The fact that GARP and procyclin/PARP genes
are transcribed by different RNA polymerases may be a more likely cause of
promoter inactivity across species.
CAT transcripts produced in our constructs were not polyadenylated at the
correct site in either the GARP or PARP 3' UTRs, but this is irrelevant in these experiments since the GARP and
PARP promoters gave high levels of CAT RNA with these 3' UTRs in
T.congolense
and
T.brucei
respectively. We observed that the 3' UTRs of procyclin/PARP and GARP transcripts are not interchangeable
between species in transient transfection assays. We had noted previously that
there existed a conserved 16 nt sequence motif situated at approximately the
same distance with respect to the polyadenylation site for the GARP and
procyclin/PARP genes (
15
). In
T.brucei
, one study showed that the 16mer was necessary for efficient translation
of PARP/procyclin mRNAs (
18
), whereas another study, using a transient transfection approach, found that CAT
transcripts whose truncated 3' UTR lacked the 16mer seemed to be translated efficiently, at least in
procyclic cells (
20
). The 16mer may have a role in GARP gene expression, but since the 3' UTRs of both procyclin/PARP and GARP genes are not recognised in the
heterologous system, the conserved motif cannot be sufficient for any
regulation of gene expression exerted by the 3' end of the mRNAs. Finally, we have observed that GARP mRNA is readily
detected in bloodstream form
T.congolense
although the protein is not produced (D. Jefferies, unpublished). This is a
very different situation from that for procyclin/PARP where transcripts are
barely detectable in bloodstream form
T.brucei
. Our observation indicates that, for GARP expression, life cycle stage-specific regulation must be achieved at the translational or post-translational level. This observation may help to explain why GARP
is transcribed by an α-amanitin sensitive RNA polymerase. RNA polymerase II appears to
transcribe genes whose expression is constitutive in the trypanosome life cycle,
as is the case for GARP in
T.congolense
. Procyclin/PARP transcription is regulated during the life cycle, at least
partly (
42
,
43
); metacyclic VSG genes are truly transcriptionally regulated (
44
), while bloodstream VSG gene expression sites are transcriptionally regulated,
especially during the bloodstream phase of the life cycle (
45
). All of these genes, which encode major surface antigens, are transcribed by RNA
polymerase I (
46
), and this may be a crucial factor in singling out these genes for at least
some degree of transcriptional regulation.
We thank Carole Ross (Centre for Tropical and Veterinary Medicine, Edinburgh)
for provision of trypanosome stocks and cultures and for advice on culturing.
This work was supported by the Wellcome Trust. J.D.B. is a Wellcome Trust
Senior Lecturer.
+
Present address: The Roslin Institute, Roslin, Midlothian EH25 9PT, UK

REFERENCES

