ABSTRACT
In Paramecium, as in other ciliates, the transcriptionally active macronucleus is derived from the germline micronucleus by programmed DNA rearrangements, which include the precise excision of thousands of germline-specific sequences (internal eliminated sequences, IESs). We report the characterization of micronuclear versions of genes encoding Paramecium secretory granule proteins (trichocyst matrix proteins, TMPs) and Paramecium centrins. TMP and centrin multigene families, previously studied in the macronuclear genome, consist of genes that are co-expressed to provide mixtures of related polypeptides that co-assemble to form, respectively, the crystalline trichocyst matrix and the infraciliary lattice, a contractile cytoskeletal network. We present evidence that TMP and centrin genes identified in the macronucleus are also present in the micronucleus, ruling out the possibility that these novel multigene families are generated by somatic rearrangements during macronuclear development. No IESs were found in TMP genes; however, four IESs in or near germline centrin genes were characterized. The only intragenic IES is 75 bp in size, interrupts a 29 bp intron and is absent from at least one other closely related centrin gene. This is the first report of an IES in an intron in Paramecium.
The crystalline contents of Paramecium secretory granules and a number of elements of the Paramecium cortical cytoskeleton have been studied at the protein level and are known to assemble from heterogeneous families of immunologically related polypeptides (1-4). For both the trichocyst matrix proteins (TMPs) and the major polypeptides of one of the cortical cytoskeletal arrays, the infraciliary lattice (ICL), we have recently shown that most or all of this heterogeneity is situated at the level of primary structure: both TMPs and ICL polypeptides are encoded by multigene families (5,6). TMPs are Paramecium-specific proteins (7), while the ICL polypeptides are centrins, EF-hand calcium binding proteins that have been highly conserved throughout eukaryotic evolution (8).
The TMP and centrin multigene families, estimated to contain ~100 and ~20 members respectively, share several characteristics. First, they are organized into subfamilies. Each subfamily codes for a distinct protein, but within each subfamily, several genes code for nearly identical proteins. All members of these families so far characterized (12 TMP genes or gene fragments from three different subfamilies; three ICL genes from the same subfamily) contain short introns (between 23 and 29 bp) (5-7), the small size being a characteristic of all known Paramecium introns (9-11). The most unusual feature of these multigene families, however, is that all members seem to be constitutively co-expressed, thus producing mixtures of related polypeptides that co-assemble. We have suggested that the use of multigene families to assure a microheterogeneity of structural proteins may be a general morphogenetic strategy in Paramecium (5,6).
Paramecium, like all ciliated protozoa, presents nuclear dimorphism; each cell possesses a somatic nucleus (macronucleus) that is responsible for gene expression during vegetative growth and a transcriptionally inactive germline nucleus (micronucleus) that is involved in sexual processes (conjugation and autofertilization). The germline micronucleus is diploid and contains conventional chromosomes of an average size of 2000 kb. The somatic macronucleus has a DNA content ~800 times greater than the micronuclear haploid value, divides amitotically during vegetative growth and consists of small acentric chromosomes (50-800 kb). During sexual processes, the micronucleus undergoes meiosis followed by fertilization to form the zygotic nucleus. The old macronucleus is degraded and a new one is formed by extensive, programmed rearrangements of the DNA of a mitotic copy of the zygotic nucleus (12), involving chromosome fragmentation, amplification and de novo telomere addition, as well as the precise excision of thousands of germline-specific elements known as internal eliminated sequences (IESs; 13-15).
Before tackling evolutionary questions raised by the TMP and ICL families of co-expressed genes, it is important to see whether all of them are present in the micronucleus or are somehow produced by somatic rearrangements during macronuclear development. As the only micronuclear sequences that have been characterized so far in Paramecium are the coding and flanking regions of surface antigen genes (16-19), it is also important to characterize other micronuclear genes and, in particular, genes that contain introns. By screening a Paramecium tetraurelia micronuclear library (16), we have obtained evidence that the previously characterized macronuclear TMP and ICL genes are indeed present in the micronucleus. No germline-specific sequences were found in the coding regions of TMP genes. However, an IES was found in an intron of a germline centrin gene. Furthermore, this IES is absent from at least one other closely related micronuclear centrin gene.
A library constructed in λGEM-11 with Paramecium tetraurelia (stock 51) micronuclear genomic DNA, partially digested with Sau3A, was kindly provided by John R. Preer (Institute of Molecular Biology, Indiana University, Bloomington, IN) (16); a library constructed in λEMBL3 with Sau3A-partially digested Paramecium tetraurelia (strain d4-2) macronuclear genomic DNA was a gift of Eric Meyer (Laboratoire de Génétique Moléculaire, Ecole Normale Supérieure, Paris, France). The libraries were screened using T1 and ICL1 subfamily-specific 32P-labelled probes (272 and 597 bp respectively), obtained by PCR amplification in the presence of [α-32P]dATP as described previously (5). Selected clones were isolated according to standard techniques (20). The positive phage plaques were further characterized by using the first round phage stocks as substrate for PCR amplification of T1- or ICL1-selected gene regions. For each phage plaque stock, a small aliquot of the storage medium (SM; 0.1 M NaCl, 10 mM MgSO4, 0.01% gelatin, 20 mM Tris-HCl, pH 7.5) was heated at 75°C for 20 min. Aliquots of 0.5 µl (about 1/1000 of the total volume of the stock) were then used for each PCR sample. The oligonucleotides used to prime these reactions and their positions with respect to the gene maps, as well as the templates and primers used to generate the 32P-labeled probes, are given in Figures 1A and 2A and their legends. The amplification products were then analyzed by Southern blot hybridization, using as 32P-labeled probes either the subfamily-specific gene fragments used for library screening or (for the T1 products) the gene-specific oligonucleotide probes given below. Selected PCR products were recovered using the QIAquick spin kit (Qiagen), then cloned into the SmaI site of plasmid pUC18. Macronuclear and micronuclear inserts containing the ICL1-d gene were identified and characterized by restriction digestion and Southern blot analysis using the ICL1 subfamily-specific probe; EcoRI phage restriction fragments (~4.5, 1.5 and 0.28 kb) were subcloned into plasmid pUC18. DNA sequences were determined using the T7 sequencing kit (Pharmacia).
PCR reactions (50 µl) contained 200 pmol each primer, 0.1 mM dNTPs and 2 U Taq DNA polymerase (Boehringer). Reactions were overlaid with 50 µl mineral oil and carried out for 30 cycles of denaturation at 90°C for 30 s, annealing at 48°C for 45 s and extension at 72°C for 90 s, using an OmniGene Temperature Cycler (Hybaid).
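The reaction conditions above translate into a few derived quantities that are convenient when setting up or comparing protocols. The following is a minimal sketch in Python that uses only the numbers stated in the text; the function names are illustrative and not part of the original methods, and the time estimate assumes that ramping between temperatures is ignored.

```python
# Values stated in the text for each PCR reaction.
REACTION_VOLUME_UL = 50.0   # reaction volume in microlitres
PRIMER_PMOL = 200.0         # amount of each primer in pmol
CYCLES = 30
STEP_SECONDS = {"denaturation": 30, "annealing": 45, "extension": 90}


def primer_concentration_um(pmol, volume_ul):
    """Return primer concentration in micromolar.

    pmol per microlitre is numerically equal to micromol per litre.
    """
    return pmol / volume_ul


def programmed_hold_minutes(cycles, step_seconds):
    """Return the total programmed hold time in minutes.

    Assumption: time spent ramping between temperatures is not counted.
    """
    return cycles * sum(step_seconds.values()) / 60.0


print("Primer concentration (uM):",
      primer_concentration_um(PRIMER_PMOL, REACTION_VOLUME_UL))
print("Programmed hold time (min):",
      programmed_hold_minutes(CYCLES, STEP_SECONDS))
```

Under these assumptions each primer is present at 4 µM and the programmed steps add up to 82.5 min for the 30 cycles.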
DNA was fractionated by electrophoresis through agarose gels and transferred to Hybond-N+ filters (Amersham) in 0.4 M NaOH. Hybridizations were carried out according to Church and Gilbert (21) at 60°C (PCR-generated probes). The sequences of the T1 gene-specific probes, corresponding to the third intron of each of six members of the T1 subfamily, are as follows (5):
T1-a, GTATGTATCCCTTGTTAACCCTTTCATAG;
T1-b, GTAACCAATCCTTATTAACTCGCCCTAG;
T1-c, GTAATGTCTAATTGATATATCCTCTAG;
T1-d, GTAATTCTAACCTAATATCCTATAG;
T1-e, GTAATTCTAACAACTTAAATGATAG;
T1-f, GTATTCTATTTTCTCATTCCTAG.
For these oligonucleotide probes hybridization temperatures were between 43 and 52°C, according to the estimated Tm of the hybrids. The membranes were then washed at the same temperatures chosen for hybridization with decreasing concentrations of SSC (1× SSC = 150 mM NaCl, 15 mM sodium citrate, pH 7.2) (20) in the presence of 0.1% SDS: 2× SSC for 30 min, followed by 0.2× SSC for 20 min. Autoradiograms were obtained by exposing the filters to Hyperfilm-MP films (Amersham).
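To illustrate how probe composition bears on the choice of hybridization temperature, the sketch below tabulates the length and G+C content of the six T1 probes listed above and applies a common rule of thumb for short oligonucleotides (2°C per A/T and 4°C per G/C, often called the Wallace rule). The text does not say which Tm estimate was used, so this formula is an assumption made for illustration only; hybridization is usually carried out some degrees below the estimated Tm, and the offset used here is not stated either.

```python
# Third-intron probe sequences for six T1 genes, copied from the text.
T1_PROBES = {
    "T1-a": "GTATGTATCCCTTGTTAACCCTTTCATAG",
    "T1-b": "GTAACCAATCCTTATTAACTCGCCCTAG",
    "T1-c": "GTAATGTCTAATTGATATATCCTCTAG",
    "T1-d": "GTAATTCTAACCTAATATCCTATAG",
    "T1-e": "GTAATTCTAACAACTTAAATGATAG",
    "T1-f": "GTATTCTATTTTCTCATTCCTAG",
}


def gc_fraction(seq):
    """Return the fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def rule_of_thumb_tm(seq):
    """Return a rough Tm in degrees C: 2 per A/T plus 4 per G/C.

    Assumption: this rule of thumb is NOT taken from the source, which
    does not specify how the Tm of the hybrids was estimated.
    """
    at = sum(base in "AT" for base in seq)
    gc = sum(base in "GC" for base in seq)
    return 2 * at + 4 * gc


for name, seq in T1_PROBES.items():
    print(f"{name}: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}, "
          f"rule-of-thumb Tm {rule_of_thumb_tm(seq)} C")
```

The probes range from 23 to 29 nt and differ in G+C content, which is why a single hybridization temperature would not suit all six.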
Our previous characterization of macronuclear TMP genes involved the determination of complete sequences for three genes (T1-b, T2-c and T4-a) cloned from a macronuclear library (7). These three genes, representative of three different subfamilies, code for proteins that present a common organization but share only ~25% sequence identity. For several other members of each subfamily, the sequences of gene fragments generated by PCR were established (5). Southern blot experiments using exon sequences as probes allowed us to show that there are between four and eight different genes in each subfamily, consistent with the number of different sequences found among the PCR products. The introns in these genes, and in particular six paralogous introns characterized in the T1 genes, were shown to be unique elements in the Paramecium genome and to constitute gene-specific probes (5; L. Vayssié, unpublished data).
In order to characterize micronuclear TMP genes, a micronuclear library (16) was screened by hybridization to a T1 subfamily-specific exon probe (Fig. 1). Twenty-two positive phage plaques were chosen for further analysis. The first round stocks, containing the partially purified phage particles, served as substrates for PCR using partially degenerate primers that amplify ~570 bp of the T1 genes. The 22 PCR products were then hybridized with gene-specific probes consisting of synthetic oligonucleotides that exactly correspond to the third intron of each of six different T1 genes (5). Each gene-specific probe hybridized to at least one amplification product and each amplification product hybridized to no more than one gene-specific probe (Fig. 1B). This region of the T1 genes does not appear to contain germline-specific sequences, as the amplification products were of the same size as those obtained using macronuclear DNA templates.
The Paramecium ICL, a contractile network that constitutes the innermost element of the cortex, is composed of six immunologically related Ca2+-binding proteins (2). Distinct N-terminal microsequences were found for four of the ICL polypeptides and for one of them, ICL1, macronuclear DNA sequences were obtained (6; Fig. 2A). These sequences, along with data from Southern blot and reverse transcription-polymerase chain reaction (RT-PCR) experiments, revealed that at least three different co-expressed genes, ICL1-a, ICL1-b and ICL1-c, code for nearly identical polypeptides and that these ICL1 polypeptides are Paramecium centrins.
The four ICL1 genes that have been characterized share 85-95% nucleotide identity and most likely arose through duplication of an ancestral gene. This is all the more probable as the positions of two introns are strictly conserved among the ICL1 genes. One of two micronuclear ICL1 genes characterized contains an IES (in the second intron), while the other does not, indicating that an IES has been either gained or lost since duplication of the ancestral ICL1 gene. Another possibility is that the duplication(s) occurred after developmental excision of the IES, thus fixing a macronuclear version of the gene in the micronucleus, much as reverse transcription of spliced RNA molecules can create intron-free genes (23).
This situation is similar to that found for Paramecium A and B surface antigen genes, which share 70% nucleotide identity and belong to a family of ~10 genes characterized by mutually exclusive expression (24). The 8 kb A gene contains eight IESs within coding and immediate upstream sequences (16), while the B gene contains only four IESs, three and probably all four of which are in conserved positions with respect to the IESs of the A gene (18). It thus appears likely that some or all of these IESs were present in the ancestral surface antigen gene and were maintained, while the others were lost or gained subsequent to gene duplication. Although IES position seems to be conserved, the sequence and size of paralogous IESs are highly variable, which supports the hypothesis that only the inverted terminal repeats, which resemble those of Tc1-related transposons, and a minimal size (see below) are necessary for developmental excision of Paramecium IESs (22).
We report here the first example in Paramecium of an IES in an intron. In Tetrahymena, all known IESs are located in non-transcribed regions of the genome with one exception: a germline-specific sequence (named mse2.9) was found in an intron of a gene of unknown function (25). Since IES excision is imprecise and generates some junction heterogeneity in Tetrahymena (26), it was suggested that IESs would not be tolerated in Tetrahymena coding sequences (25). Subsequent analysis of caryonidal clones confirmed that excision of mse2.9 does generate considerable junction microheterogeneity (27).
In hypotrich ciliates and in Paramecium, IESs are found in both coding and non-coding regions (13,15). Even if Paramecium IES excision is not always precise (an example of boundary microheterogeneity in excision of a P. primaurelia IES from a non-coding region has been observed; A. Le Mouel, K. Dubrana and L. Amar, personal communication), it clearly can be when IESs are situated in coding sequences. More micronuclear genes need to be characterized in order to better evaluate IES distribution in coding and non-coding regions of the genome.
The finding of an IES in an intron may provide a clue as to the small size (19-33 bp) of Paramecium introns (9-11). Not only are these introns among the smallest characterized in any organism, but their size is remarkably homogeneous: no introns of larger size have yet been found. It is expected that these introns are spliced by a classic nuclear spliceosome machinery (28), since a preliminary statistical analysis of available intron sequences indicates that they harbour at least some of the signals found in yeast and vertebrate nuclear introns (C. Thermes and Y. Daubenton-Carafa, personal communication). The complexity of the spliceosome and the number of genes involved in RNA splicing rule out independent evolution of this process in Paramecium. It is, however, striking that the largest introns are ~30 bp in size (and most frequently bordered by GTA...TAG), while IESs (always bounded by TA...TA) tend towards a size of 28 bp, the IES size most frequently found so far in Paramecium, beyond which their developmental excision has been postulated to be inefficient (15,22). Although highly speculative, it seems worth considering the possibility that IESs which have become so small that they escape DNA excision can turn into introns and be removed by splicing. Such an adaptation to the removal of defectively small IESs could have driven Paramecium introns to their present small size, by exerting selective pressure for optimization and specialization of the splicing reaction on very small substrates. It will be interesting to perform somatic transformation experiments with the micronuclear ICL1-b gene in order to determine whether the presence of the 75 bp IES within the second intron will inhibit RNA splicing.
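The boundary conventions mentioned above (introns most frequently bordered by GTA...TAG, IESs bounded by TA...TA) can be checked at the string level on the six T1 third-intron sequences given in the probe list of the methods. The sketch below is only such a string-level illustration, with hypothetical helper names; it says nothing about excision or splicing mechanisms, and the 19-33 bp window is simply the intron size range quoted in the text.

```python
# Third-intron sequences of six T1 genes, copied from the probe list.
T1_THIRD_INTRONS = {
    "T1-a": "GTATGTATCCCTTGTTAACCCTTTCATAG",
    "T1-b": "GTAACCAATCCTTATTAACTCGCCCTAG",
    "T1-c": "GTAATGTCTAATTGATATATCCTCTAG",
    "T1-d": "GTAATTCTAACCTAATATCCTATAG",
    "T1-e": "GTAATTCTAACAACTTAAATGATAG",
    "T1-f": "GTATTCTATTTTCTCATTCCTAG",
}

# Intron size range quoted in the text (bp).
MIN_INTRON_BP, MAX_INTRON_BP = 19, 33


def has_intron_borders(seq):
    """True if the sequence starts with GTA and ends with TAG."""
    return seq.startswith("GTA") and seq.endswith("TAG")


def has_ies_borders(seq):
    """True if the sequence starts and ends with the dinucleotide TA."""
    return seq.startswith("TA") and seq.endswith("TA")


for name, seq in T1_THIRD_INTRONS.items():
    in_range = MIN_INTRON_BP <= len(seq) <= MAX_INTRON_BP
    # Dropping the outer G on each side of GTA...TAG leaves TA...TA.
    inner = seq[1:-1]
    print(name, len(seq), "bp",
          "| GTA...TAG:", has_intron_borders(seq),
          "| size in range:", in_range,
          "| inner TA...TA:", has_ies_borders(inner))
```

All six sequences carry the GTA...TAG borders and fall within the quoted size range; the last column merely makes explicit that a GTA...TAG-bordered sequence contains a TA...TA-bounded one once the flanking G residues are set aside.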
We are particularly grateful to Claire Bertrand for her participation in the present study during the preparation of the Magistère de Biotechnologie. We also wish to thank Laurence Amar, Mireille Betermier, Eric Meyer and Claude Thermes for useful discussions, Eric Meyer and John Preer for kindly providing us with phage libraries and Janine Beisson and Jean Cohen for critical reading of the manuscript. This work was financed by the GREG Genome Project (grant no. 94/70) and the CNRS. L.V. was supported by a pre-doctoral fellowship from the Ministère de l'Enseignement Supérieur et de la Recherche and L.M. by a 'poste rouge' from the CNRS.
*To whom correspondence should be addressed. Tel: +33 1 69 82 43 92; Fax: +33 1 69 82 31 50; Email: madeddu@cgm.cnrs-gif.fr

REFERENCES

