ABSTRACT
In recent years several telomere binding proteins from eukaryotic organisms have
been identified that are able to recognise specifically the duplex telomeric
DNA repeat or the G-rich 3'-ending single strand. In this paper we present experimental evidence that
HeLa nuclear extracts contain a protein that binds with high specificity to the
single-stranded complementary d(CCCTAA)n repeat. Electrophoretic mobility shift assays show that the oligonucleotide
d(CCCTAACCCTAACCCTAACCCT) forms a stable complex with this protein in the
presence of up to 1000-fold excesses of single-stranded DNA and RNA competitors, but is prevented from doing so in
the presence of its complementary strand. SDS-PAGE experiments after UV cross-linking of the complex provide an estimate of 50 kDa for the molecular
weight of this protein.
Interest in the functional role as well as the structural aspects of telomeric
DNA repeats in eukaryotic chromosomes has grown very rapidly in recent years.
From the functional point of view, the quest for a deeper understanding of the
role of telomeric DNA was boosted by the discovery of its peculiar replication mechanism through the telomerase enzyme (1,2) and of the tissue type dependent variation in telomerase activity in normal as well as tumour tissues (3-5). From the structural point of view, questions have been raised whether the
unusual secondary structures that each of the complementary telomeric strands is capable of adopting in vitro may also be involved in telomere functioning in vivo, in particular since the discovery of a number of inter- and intramolecular structures of short guanine runs, like those in the 3'-ending strand of telomeric DNA, which are based on planar guanine tetrads held in a cyclic arrangement by Hoogsteen hydrogen bonds (6-9). In addition, C-block repeat sequences, of which the other strand of the telomeric repeat is a representative, have been shown to be able to adopt the self-intercalated tetra-stranded structure i-DNA (10,11), although a slightly acidic pH value is required in this case.
Along with the intense experimental work to better characterise the telomerase
enzymes both in unicellular organisms, such as ciliates, and in higher
vertebrates, several research groups have directed their efforts to discover
and characterise other nuclear proteins capable of binding specifically to
telomeric DNA repeat sequences. Proteins have been identified that specifically recognise double-stranded telomeric repeats, such as Rap1p in yeast (12,13), PPT from Physarum (14) and TRF in mammalian cells (15,16); as well as proteins that bind to the single-stranded guanine-rich 3'-ending motif, such as αβ protein from Oxytricha (17), TBP from Euplotes (18), TEP and TGP from Tetrahymena (19,20), GBP from Chlamydomonas (21), XTEF from Xenopus (22), an avian factor (23), MyoD from mammalian cells (24) and A2/B1 from HeLa (25). Some of these have been shown to bind to strands arranged in the G-DNA tetradic structure (20,24) or even to promote its formation in vitro (12,13).
In recent years evidence has accumulated which indicates the presence in HeLa
cells of a specific binding activity toward telomeric-type DNA sequences. Ishikawa et al. (25) have found nuclear proteins able to bind to the single-stranded (TTAGGG)n motif and identified them as components of the hnRNP complex. Although a role of these components in telomere functioning can be envisaged, it appears that this specific binding stems primarily from the close similarity of the telomeric repeat sequence to the 3' splice site consensus of hnRNA. Similarly TRF, a protein binding specifically to the double-stranded telomeric repeats, has been discovered by Zhong et al. (15), and subsequently cloned and shown to be physically linked to the telomeres of metaphase chromosomes (16).
We report in this paper experimental evidence of the presence in HeLa nuclear
extracts of a specific binding activity directed to the complementary C-rich telomeric sequence when present as a single strand.
The HeLa nuclear protein extract was obtained according to the method of Dignam (26), and the concentration of the stock solution was determined by the Lowry assay.
The following oligonucleotides were synthesized on an Applied Biosystems apparatus by phosphoramidite chemistry and purified according to
standard methods: d(CCCTAACCCTAACCCTAACCCT) (HTC4); d(AGGGTTAGGGTTAGGGTTAGGG) (HTG4); d(AACCCCTGCATTGAACTCCA) (RND); d(CTTTCTTCCCTTCCTTTC) (PYR).
The oligonucleotides used for electrophoretic experiments were radiolabeled with [γ-32P]ATP and T4 polynucleotide kinase according to standard procedures (27): generally 2-4 pmol of each DNA strand were labeled with 10 µCi of high activity ATP.
Aliquots of 0.1 pmol of each oligonucleotide were incubated in 10 µl volumes (10 mM Tris buffer, pH 8, 50 mM KCl, 0.1 mM EDTA) with 4 µg of nuclear proteins at 5 or 25 °C for the prescribed time, alone or in the presence of an excess of cold complementary strand or competitor DNA (and RNA). In the case of the cold complementary strand, the duplex was allowed to anneal for several hours at room temperature prior to its addition to the incubation mixture. In the case of competitors, the radiolabeled telomeric strand was added last in the incubation mixture, allowing time for possible interactions between competitor and protein extract to occur. After incubation, the samples were loaded onto a non-denaturing polyacrylamide gel [10% acrylamide (bisacrylamide 1:30); same buffer as for incubation] and run at 10 V/cm at about 15 °C (running time 75 min, corresponding to ~5 cm migration for the unbound oligonucleotide). Finally the gel was dried and autoradiographed.
Incubation mixtures, in parallel with those used in gel retardation assays, were prepared and irradiated with a UV lamp (50 W) for 60 min. The samples were then run on a standard SDS-PAGE gel together with molecular weight markers (HMW calibration kit, Pharmacia), which were subsequently stained with Coomassie blue to allow an estimate of the molecular weights of the cross-linked complexes. After the run, the gels were dried and autoradiographed.
In search of a protein within HeLa nuclear extract capable of specifically recognizing the single-stranded telomeric repeat, we have conducted a number of electrophoretic mobility shift experiments.
Figure 1A shows that, after incubation of labeled HTC4 with the protein extract, two well resolved retarded bands are observed (lane 4), whose mobilities are slightly higher and slightly lower, respectively, than that of the band observed in the case of HTG4 (lane 2). The HTG4 band very likely corresponds to the complex already observed by Ishikawa et al. (25). Incubation of labeled HTG4 with an excess of its complementary HTC4, which converts the radiolabeled probe into standard double-stranded DNA, before the addition of the protein extract substantially suppresses every retarded band: no double-stranded DNA binding protein is detected in this experiment (lane 3).
In a typical gel retardation assay about 0.1-0.2 pmol of labeled HTC4, incubated with 4 µg of the HeLa protein extract, were run in each lane. Since at least half of the radioactivity is present in the retarded bands, the amount of bound protein can be estimated to be of the same order of magnitude as that of the oligonucleotide. With the molecular weight estimates from the UV cross-linking experiments and assuming a 1:1 stoichiometry for the DNA-protein complex, this means a few ng of bound proteins out of 4 µg, or about 0.1% of the nuclear extract, which however does not contain histones. Less than half of this amount is attributable to the faster component, i.e. the one not displaced even by a 1000-fold excess of denatured E. coli DNA. If our nuclear extract contains some 10% of the total nuclear proteins, 0.05% of this amount corresponds roughly to a few femtograms of protein per cell, not much higher than the amount of telomeric DNA (28).
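The order-of-magnitude estimate above can be retraced with a minimal sketch. It assumes the lower bound of 0.1 pmol of probe per lane, a bound fraction of one half, the ~50 kDa estimate from the cross-linking experiments and a 1:1 stoichiometry; the function and variable names are illustrative only and are not part of the original work.

```python
def bound_protein_ng(probe_pmol, bound_fraction, protein_kda, proteins_per_probe=1):
    """Mass of protein bound to the probe, in nanograms.

    Units: pmol * kDa = 1e-12 mol * 1e3 g/mol = 1e-9 g, i.e. ng.
    """
    return probe_pmol * bound_fraction * proteins_per_probe * protein_kda


# Assumed inputs (taken from the figures quoted in the text):
probe_pmol = 0.1        # labeled HTC4 per lane, lower bound of 0.1-0.2 pmol
bound_fraction = 0.5    # "at least half" of the radioactivity is retarded
protein_kda = 50        # cross-linking estimate; the oligonucleotide mass is ignored
extract_ng = 4 * 1000   # 4 micrograms of nuclear extract per lane

mass_ng = bound_protein_ng(probe_pmol, bound_fraction, protein_kda)
fraction_of_extract = mass_ng / extract_ng

print(f"bound protein: {mass_ng:.1f} ng")
print(f"fraction of extract: {fraction_of_extract:.2%}")
```

With these inputs the bound protein comes to a few ng and somewhat under 0.1% of the extract; using 0.2 pmol of probe doubles both figures, which is consistent with the rounded values quoted in the text.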
In the UV cross-linking experiments of Figure 4, a much higher amount of radioactivity is found in the HTG4-protein complexes than in the HTC4-protein complexes. This observation could suggest that the faster component that binds HTC4 is at least one order of magnitude less abundant than those recognising HTG4 [i.e. the hnRNPs observed by Ishikawa et al. (25)], if the efficiency of UV cross-linking were the same for the two systems. However, this assumption cannot be taken for granted.
As to the specificity of the interaction, some semiquantitative considerations can be based on the E. coli DNA competition experiments. The vertebrate telomeric motif TTAGGG and its complement are not overrepresented in the E. coli genome: a brief search for the representative nonamers GGGTTAGGG and CCCTAACCC in the E. coli sequences in GenBank found these strings at a frequency very near to that statistically expected, i.e. <1 every 100 000 nt. If the hnRNP components binding to HTG4 owe this activity to the fact that its repeat conforms to the 3' splicing consensus, ...YAG/G..., the statistical expectancy of this consensus in E. coli DNA can be roughly estimated to be 2 in every 2 × 4 × 4 × 4 nt, i.e. 1 every 50-100 nt. This agrees with the observation that a 100-fold and a 1000-fold excess of denatured E. coli DNA remove approximately half and 90% of the protein-bound radioactivity, respectively (lanes 6 and 7 of Fig. 4). Almost the same argument holds for the slower component of the HTC4 binding
proteins: its limited specificity for HTC4 is likely fortuitous. In contrast, the faster component is not competed even by a 1000-fold excess of E. coli DNA, suggesting that its specificity is considerably higher than that afforded by a single full, and exact, telomeric repeat. Indeed the statistical expectancy for the hexameric CCCTAA sequence, or for each of its cyclic permutations, can be estimated to be 2 every 4^6 nt, or ~1 every 2000 nt: if one hexameric stretch were sufficient for full recognition by the protein, a 1000-fold excess of E. coli DNA would have produced a remarkable competing effect.
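The statistical expectancies used in this argument can be reproduced with a short sketch. It assumes equal, independent base frequencies (a simplification for real E. coli DNA) and counts both strands, as in the "2 every ..." figures; the function name and the reading of the "1000-fold excess" as a nucleotide (mass) excess over the 22 nt probe are assumptions made for illustration.

```python
# Number of bases matched by each symbol (Y = pyrimidine, C or T).
BASES_MATCHED = {"A": 1, "C": 1, "G": 1, "T": 1, "Y": 2}


def expected_spacing(pattern, strands=2):
    """Mean number of nucleotides per chance occurrence of `pattern`,
    assuming each base has probability 1/4 at every position."""
    probability = 1.0
    for symbol in pattern:
        probability *= BASES_MATCHED[symbol] / 4
    return 1 / (strands * probability)


nonamer_spacing = expected_spacing("GGGTTAGGG")  # 4**9 / 2
splice_spacing = expected_spacing("YAGG")        # (2 * 4 * 4 * 4) / 2
hexamer_spacing = expected_spacing("CCCTAA")     # 4**6 / 2

# Assumption: a 1000-fold excess means 1000 competitor nucleotides
# for each nucleotide of the 22 nt HTC4 probe.
competitor_nt_per_probe = 1000 * 22
hexamer_sites_per_probe = competitor_nt_per_probe / hexamer_spacing

print(nonamer_spacing, splice_spacing, hexamer_spacing)
print(round(hexamer_sites_per_probe, 1))
```

Under these assumptions a single CCCTAA hexamer would occur by chance roughly ten times in the competitor for every probe molecule, which is why the absence of competition argues for a recognition site longer than one repeat.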
It has been shown that (CCCTAA)4, in analogy to similar C-block repeats, is able to adopt an unusual secondary structure, the i-DNA (10,11), at pH values of ≤6. The specific complex of HTC4 with the nuclear protein described here forms at pH values of 7-8, at which HTC4 does not adopt the i-DNA structure by itself, apart from the very tiny amounts expected on the basis of the pH-dependent equilibrium between the structured and unstructured forms. It is very likely that this protein simply binds the unstructured strand and i-DNA is not involved. However, at the moment one cannot rule out the possibility that the protein binds the i-DNA structure, paying for the free energy cost of stabilising it at neutral or slightly basic pH through the interaction.
Clearly the mobilities of the shifted bands in the gel retardation experiments cannot provide meaningful information about the native forms of these DNA binding proteins, and the SDS-PAGE analyses of the cross-linked complexes provide only approximate information on the molecular weights at the subunit level. The fact that two bands are found in the mobility shift assays, with the faster being much more sequence-specific, and that the same pattern is observed in the UV cross-linking experiments, suggests that two different proteins capable of binding HTC4 exist in nondenaturing conditions. If the slower shifted band were due to the binding of two units of the same protein to one HTC4 molecule, E. coli DNA would not compete with it, just as it does not compete with the faster band.
In conclusion, it can be argued that, while the protein responsible for the slow retarded band exhibits a rather moderate and probably fortuitous specificity in its interaction with the CCCTAA motif, the much higher specificity of the faster component makes it a promising candidate to join the already identified telomere binding proteins. Work is in progress to isolate and characterise this protein.
This work has been supported by grants from the Italian MURST and from the Italian CNR (National Research Council).
Escherichia coli DNA sequence analysis has been performed thanks to Intelligenetics software and GenBank.
REFERENCES
