ABSTRACT
The nucleocapsid protein (NC) of HIV-1 is a small zinc finger protein that contributes to multiple steps of the
viral life cycle, including the proper encapsidation of HIV RNA. This is
accomplished through an interaction between NC and a region at the 5'-end of the RNA,
defined as the Psi element. However, the specificity of NC
for Psi or for RNA in general is not well understood. To study this problem, we
used SELEX to identify high affinity RNA ligands that bind to NC. A 'winner'
molecule (SelPsi), as well as a subregion of Psi RNA, was further
characterized to understand the interaction between NC and SelPsi and its
relationship to the interaction between NC and Psi. The comparison makes
predictions about the sequence and structure of a high affinity binding site
within the HIV-1 Psi element.
Nucleocapsid protein (NC) is involved in encapsidation and other steps in the
viral life cycle of HIV. It is a proteolytic fragment of the gag gene product,
contains two zinc finger motifs (Cys-X2-Cys-X4-His-X4-Cys) and is rich in
arginines and lysines. Proteins closely related to NC are found in all
retroviruses. The NC domain of Gag plays a role in both encapsidation and
dimerization of the viral RNA (1-4). In vitro, NC has been shown to participate
in both annealing of tRNALys to the primer binding site and synthesis of
pro-viral DNA (5-7). NC has also been shown to be a potent chaperone for general
nucleic acid folding and unfolding (8-10); this activity is almost certainly
important for the many roles NC plays in the HIV life cycle.
The encapsidation and dimerization signals are closely linked and are present at
the 5'-end of the HIV RNA (3). The RNA sequence important for encapsidation is
referred to as the Psi element. NC binding to Psi is dependent on a functional
N-terminal zinc finger element (11). As expected, mutations or deletions within
Psi or NC have a severe effect on the ability of the virus to encapsidate viral
RNA; viral particles are still formed but little or no viral RNA is present
(2,12). In addition, mutations within Psi have been shown to reduce the
efficiency of dimerization in vitro (3,13-15). The close proximity of the
encapsidation and dimerization signals suggests that the two processes are
linked, but this remains to be clarified.
In one model, the secondary structure of Psi contains four stem-loops (Fig. 2b;
15,16). In many different studies, all four stem-loops have been proposed to be
important either for encapsidation or more directly as NC binding sites
(2,12,15-17). The first stem-loop has also been shown to be a major determinant
for RNA dimerization and has been proposed to be involved in a 'kissing' loop
model (14). Stem-loops 3 and 4 are of particular interest, because this region
is downstream of the major 5'-splice site and would contribute to proper
selection of unspliced RNA rather than spliced RNA for packaging. NC coats the
viral RNA and ~2000 NC molecules are associated with the dimer packaging
substrate (18). The binding site for an individual NC molecule is 8 nt (19) and
the binding of NC is cooperative (9,20).
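The two figures quoted above (~2000 NC molecules per RNA dimer and an 8 nt site size) can be compared with a back-of-the-envelope estimate. The sketch below is illustrative only; the genome length of ~9200 nt is an assumed round figure that does not come from this paper, and the function name is hypothetical.

```python
# Rough occupancy estimate for NC on the packaged RNA dimer (illustrative only).
GENOME_LENGTH_NT = 9200  # ASSUMPTION: approximate HIV-1 genome length, not given in the text
RNA_COPIES = 2           # the packaging substrate is an RNA dimer
SITE_SIZE_NT = 8         # binding site size for one NC molecule (ref. 19)
NC_PER_DIMER = 2000      # approximate NC molecules per dimer (ref. 18)


def max_nonoverlapping_sites(length_nt, copies, site_nt):
    """Upper bound on bound NC molecules if 8 nt sites tile the RNA without overlap."""
    return (length_nt * copies) // site_nt


sites = max_nonoverlapping_sites(GENOME_LENGTH_NT, RNA_COPIES, SITE_SIZE_NT)
occupancy = NC_PER_DIMER / sites
print(f"{sites} possible sites; fraction occupied by {NC_PER_DIMER} NC: {occupancy:.2f}")
```

Under these assumptions most of the available sites would be occupied, which is in line with the statement that NC coats the viral RNA.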
How unspliced viral RNA is selected for packaging is not fully understood.
Although NC is a non-specific nucleic acid binding protein, several groups have shown that NC
or Gag specifically interacts with Psi RNA (11,15,17,20-22). However, this
specificity is modest, as the observed difference in affinity was only between
10- and 100-fold when antisense Psi RNA was used as the negative control
(11,15). The highest affinity binding has been localized to ~200 nt within Psi
(15), but further deletions decrease NC binding substantially, making it
difficult to define a smaller high affinity site. Three of the stem-loops (1, 3
and 4; see Fig. 2b) within the ~200 nt fragment were individually tested for
binding by Clever and Parslow; all three bind to NC with a 4-fold reduction in
affinity compared with the 200 nt intact Psi fragment (15). But this would
appear to pose a specificity problem, because many other RNAs bind to NC with
only a 2-fold further reduction in affinity compared with these three
stem-loops (15).
To explore NC specificity further, we performed a SELEX experiment (23,24) and
characterized the interaction of NC with the winner RNA, SelPsi. This was done
by structure probing, footprinting and binding studies of both SelPsi and a
mutant version of SelPsi. In addition, parallel binding and footprinting studies
were carried out on RNAs derived from the HIV-1 Psi element. The data point to a
high affinity binding site within the HIV-1 Psi region. This inference rests in
part on the resemblance of SelPsi (Fig. 2a) to the Psi stem-loop 3 region,
including the single-strand region 5' of the stem (Fig. 7).
The clone for NC was a kind gift from Zenta Tsuchihashi and Patrick Brown and
the protein was purified as reported (8). This clone comprises the first 71
amino acids of the gag gene product plus six histidines and an extra methionine
at the N-terminus for nickel column purification.
The protocol followed for the SELEX experiment is similar to that published
(23). A random pool of RNA was generated with the following sequence:
GGAGACAGUCCGAGC(N)40GGGUCAAUGCGUCAUAGGAUCCCGC. This was done by PCR using the
three oligonucleotides GCGGAATTCTAATACTACTCACTATAGGAGACAGTCCGAGCC (5'
oligonucleotide containing an EcoRI site and the T7 promoter),
GGAGACAGTCCGAGCC(N)40GGGTCAATGCGTCATA (template for PCR reaction containing 40
randomized nucleotides) and GCGGGATCCTATGACGCATTGACCC (3' oligonucleotide
containing a BamHI site). The template generated from the PCR reaction was used
for in vitro transcription as described (25). The RNA was purified on a 20%
polyacrylamide gel, ethanol precipitated, run through a BioRad P6 spin column,
heated at 95°C for 2 min and then put on ice (annealing reaction). In the first
round of in vitro selection, NC was at a final concentration of 50 nM and the
pool 0 RNA was at a final concentration of 500 nM. After every other round the
concentrations of both the RNA and NC were lowered until in the final round of
selection (ninth round) NC was at 1.25 nM and pool 8 RNA was at 100 nM. The
concentration of tRNA present in the first six rounds of selection was 400 nM;
for the last three rounds the tRNA concentration was raised to 600 nM. The
binding buffer consisted of 50 mM Tris, pH 7.5, 100 mM NaCl, 1 mM MgCl2, 30 µM
ZnCl2 and 10 mM DTT. Pool 0 RNA was pre-filtered through a 0.45 µm
nitrocellulose centrifugal filter unit (Schleicher & Schuell) to select against
RNAs that bound selectively to the filter. This pre-filtering was also done
every other round. The RNA and NC were incubated at room temperature for 20 min
in binding buffer. The filter unit was pre-wetted with H2O; the RNA-NC mixture
was then filtered, followed by a wash with 1 ml binding buffer. The bound RNA
was eluted from the filter using 7 M urea and heated for 3 min at 95°C. After
phenol extraction and ethanol precipitation, half of the RNA was used for
reverse transcription. Reverse transcription was done at 45°C for 30 min using
MMLV reverse transcriptase. After the ninth round of selection PCR products were
digested with BamHI and EcoRI and then inserted into pSP64 (Promega). Sequencing
was done using the fmol DNA sequencing system (Promega).
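The relationships between the 3' oligonucleotide, the PCR template and the transcribed pool can be checked mechanically. The following is a minimal sketch using only the sequences quoted above; the helper name and the set of checks are illustrative choices, not part of the published protocol, and only the 3' constant region and the restriction sites are examined.

```python
# Consistency checks on the oligonucleotides quoted in the SELEX protocol.
FIVE_PRIME_OLIGO = "GCGGAATTCTAATACTACTCACTATAGGAGACAGTCCGAGCC"
THREE_PRIME_OLIGO = "GCGGGATCCTATGACGCATTGACCC"
TEMPLATE_3P_CONSTANT = "GGGTCAATGCGTCATA"       # 3' constant part of the PCR template
RNA_3P_FLANK = "GGGUCAAUGCGUCAUAGGAUCCCGC"      # 3' flank of the transcribed pool

ECORI_SITE = "GAATTC"
BAMHI_SITE = "GGATCC"


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(dna))


rna_flank_as_dna = RNA_3P_FLANK.replace("U", "T")

checks = {
    "EcoRI site present in 5' oligo": ECORI_SITE in FIVE_PRIME_OLIGO,
    "BamHI site present in 3' oligo": BAMHI_SITE in THREE_PRIME_OLIGO,
    "3' oligo is the reverse complement of the RNA 3' flank":
        reverse_complement(THREE_PRIME_OLIGO) == rna_flank_as_dna,
    "template 3' constant region begins the RNA 3' flank":
        rna_flank_as_dna.startswith(TEMPLATE_3P_CONSTANT),
}

for description, passed in checks.items():
    print(f"{'ok  ' if passed else 'FAIL'} {description}")
```

The 3' oligonucleotide extends past the template's constant region, which is how the BamHI site ends up in the amplified product and in the 3' flank of the transcript.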
All RNAs except Psi 27 were made by in vitro transcription from PCR products
using [α-32P]UTP to internally label the RNAs. Psi 27 was made using Perseptive
Expedite RNA amidites on an Expedite 8909 Oligonucleotide Synthesizer and then
kinased with [γ-32P]ATP (26). RNAs were gel purified after labeling and treated
in a similar manner as the selection pools of RNA. The approximate concentration
of RNA in a filter binding experiment was 50 pM. NC was serially diluted in 1×
filter binding buffer; this was done for all dilutions of NC. RNAs were annealed
as in the selection experiment. Radioactive RNA, NC and tRNA (50 nM) were
incubated in 1× binding buffer for 30 min on ice and then filtered at room
temperature through a 0.45 µm nitrocellulose filter (Schleicher & Schuell).
Radioactivity was quantitated using a BioRad Molecular Imager. The retention of
free RNA (always <5%) was subtracted from all data points. Different RNA-NC
complexes were retained with different efficiencies on the nitrocellulose: for
the RNAs studied here the efficiency of retention ranged from 40 to 90%. This
has been observed with other RNA-protein complexes in filter binding studies (27
and references therein). The smaller structured RNAs had the lower efficiencies
of retention; for example SelPsi was retained at 40%. The fraction bound was
taken from the plateau of binding and the plateau was assigned as 1.0. Plots
were made with the program Microcal Origin (Microcal Software Inc.).
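The data treatment described above (background subtraction, then scaling to the plateau) can be sketched as follows. The text does not state which binding model was fitted, so the simple 1:1 isotherm and the grid-search fit below are assumptions made for illustration; all function names are hypothetical and the data are synthetic.

```python
import math


def fraction_bound(nc_nM, kd_nM):
    """ASSUMED 1:1 isotherm; reasonable when RNA (~50 pM) is far below Kd,
    so that free NC is approximately equal to total NC."""
    return nc_nM / (kd_nM + nc_nM)


def normalize(retained, background, plateau):
    """Subtract the retention of free RNA, then scale so the
    background-corrected plateau equals 1.0."""
    return [(r - background) / plateau for r in retained]


def fit_kd(nc_nM, bound, lo=0.01, hi=1e5, steps=2000):
    """Least-squares grid search over log-spaced candidate Kd values (nM)."""
    best_kd, best_sse = None, math.inf
    for i in range(steps + 1):
        kd = lo * (hi / lo) ** (i / steps)
        sse = sum((fraction_bound(c, kd) - b) ** 2 for c, b in zip(nc_nM, bound))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Synthetic example (not measured data): 3% background retention and a
# 40% retention efficiency, generated with a true Kd of 2.3 nM.
nc_series = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0, 100.0]
raw = [0.03 + 0.40 * fraction_bound(c, 2.3) for c in nc_series]
scaled = normalize(raw, background=0.03, plateau=0.40)
kd_estimate = fit_kd(nc_series, scaled)
print(f"estimated apparent Kd: {kd_estimate:.2f} nM")
```

A grid search is used here only to keep the sketch free of external dependencies; a nonlinear least-squares routine would serve the same purpose.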
SelPsi and Psi 76 were treated with calf intestine alkaline phosphatase for 20
min at 37°C (26), phenol extracted and ethanol precipitated with glycogen as
carrier. Both RNAs were kinased with [γ-32P]ATP and purified in the same manner
as the RNAs for the filter binding experiments. Before RNase digestion, RNAs
were incubated at 65°C in SSC buffer (17.5 mM sodium citrate, pH 7.0, 150 mM
sodium chloride) for 5 min then placed on ice for a few minutes; they were then
incubated at room temperature for 10 min in the presence of NC in 1× binding
buffer. RNase A (5 Prime → 3 Prime Inc.), RNase T1 (Ambion) or RNase V1
(Pharmacia) was added to the mix, which was then incubated at 37°C for 5 min.
RNase T1 is specific for single-stranded guanosines, RNase A for single-stranded
pyrimidines and RNase V1 for double-stranded regions. To stop the reaction an
equal volume of phenol was added. After precipitation with ethanol, RNAs were
resuspended in formamide dye and separated on 12.5% denaturing gels.
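The nuclease specificities listed above translate into a simple rule for which positions each enzyme is expected to cut in a given secondary structure. The sketch below applies that idealized rule to a toy hairpin; the sequence and dot-bracket structure are hypothetical and are not SelPsi or Psi 76, and real cleavage patterns also depend on accessibility and tertiary structure.

```python
def predicted_cleavage(seq, structure):
    """Map each enzyme to the 1-based positions it is expected to cut,
    given a dot-bracket structure ('(' or ')' = paired, '.' = unpaired)."""
    sites = {"T1": [], "A": [], "V1": []}
    for pos, (base, state) in enumerate(zip(seq, structure), start=1):
        if state in "()":
            sites["V1"].append(pos)      # V1: double-stranded regions
        elif base == "G":
            sites["T1"].append(pos)      # T1: single-stranded guanosines
        elif base in "CU":
            sites["A"].append(pos)       # A: single-stranded pyrimidines
    return sites


def protected_positions(cuts_without_nc, cuts_with_nc):
    """Positions cleaved in free RNA but not in the presence of protein."""
    return sorted(set(cuts_without_nc) - set(cuts_with_nc))


# Hypothetical 14 nt hairpin with a UUCG loop (illustration only).
TOY_SEQ = "GGCACUUCGGUGCC"
TOY_STRUCTURE = "(((((....)))))"

toy_sites = predicted_cleavage(TOY_SEQ, TOY_STRUCTURE)
for enzyme, positions in toy_sites.items():
    print(enzyme, positions)
```

Comparing lanes with and without NC then reduces to a set difference, as in `protected_positions`.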
A randomized region of 40 nt was used with the expectation that a region of this
size plus flanking sequences would be sufficient for selection of high affinity
RNA ligands for NC. The flanking sequences, used for reverse transcription and
PCR, were designed to avoid self-complementarity and to minimize primer dimer formation as well as
secondary structures that might bias the selection. The RNA was heated at 95°C for 2 min and then placed immediately on ice, to favor intramolecular
structures. In addition, relatively low concentrations of NC (50 to 1.25 nM) were used for the selection and the RNA was always in large
excess over NC. These steps were taken to favor the interaction of single NC
molecules with single RNA molecules.
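The design choices above can be put in numbers. The short calculation below uses only the first-round and final-round concentrations given in the methods (intermediate rounds are not specified) and the size of the randomized region; the variable names are illustrative.

```python
# Molar excess of RNA over NC in the first and last selection rounds.
ROUNDS = {
    1: {"NC_nM": 50.0, "RNA_nM": 500.0},
    9: {"NC_nM": 1.25, "RNA_nM": 100.0},
}


def rna_excess(conditions):
    """Return the RNA:NC molar ratio for one selection round."""
    return conditions["RNA_nM"] / conditions["NC_nM"]


# Number of distinct sequences possible in a 40 nt randomized region.
N_RANDOM = 40
sequence_space = 4 ** N_RANDOM

for round_number, conditions in ROUNDS.items():
    print(f"round {round_number}: RNA:NC = {rna_excess(conditions):g}:1")
print(f"possible 40-mers: {sequence_space:.2e}")
```

The RNA excess grows as the selection proceeds, which increases competition among RNAs for a limiting amount of NC. The sequence space of a 40 nt random region is far larger than any physical pool can sample, so a selection of this kind explores only a small subset of possible sequences.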
The results from the SELEX experiment indicate that NC favors stretches of
guanosines and uridines. After nine rounds of selection, the pool 8 RNA bound
to NC with ~50-fold higher affinity than pool 0 RNA (data not shown).
Twenty-three clones were sequenced from pool 8. Nine were duplicates, giving a
total of 14 different sequences. Almost all of these sequences contain long
stretches of guanosines and uridines (Fig. 1). Two clones, 1 and 9, contain a
conserved stretch of 14 nt that is guanosine and uridine rich. (In Figure 1 this
conserved sequence is in bold and underlined.)
Twelve of the 14 clones were tested individually for NC binding. Of these, 1 and 9 bound
to NC with an affinity that is significantly better than that of the rest of
the clones (data not shown). Both clones are predicted to fold into secondary
structures that contain the conserved 14 nt at the end of a long stem-loop.
Clone 9 was easily reduced to a smaller RNA, SelPsi (Fig. 2a), that contains
only 40 nt and retains high affinity NC binding. Reducing the size of the RNA
resulted in a few changes to the sequence compared with the initial clone; the
stem was shortened and changes were made in the stem to optimize both
transcription and stability of SelPsi. In addition, a single adenosine after the
conserved region was changed to uridine. SelPsi binds to NC with an apparent Kd
of 2.3 nM, which is 200-fold stronger than pool 0 RNA (Fig. 3).
The footprinting data explain certain aspects of the NC-SelPsi interaction, but they do not clarify why SelPsi binds to NC with
higher affinity compared with other RNAs (Table 1). The SelPsi-NC apparent Kd of
2.3 nM is 25-fold stronger than the affinity of NC for HIV Psi. Although NC
favors stem-loops (stem-loops 1, 3 and 4 from HIV Psi; 15), SelPsi binds with
much higher affinity (100-fold) than the individual stem-loops isolated from the
HIV-1 Psi element. A partial explanation might be the unusual structure of the
upper half. To test this possibility, we made a mutation that should both
stabilize and lengthen the short upper stem proposed to exist within the upper
half of SelPsi (Fig. 2a). This mutant RNA, SelPsi 46 (Fig. 2a), binds to NC with
a 400-fold reduced affinity compared with SelPsi (Fig. 3). In fact, SelPsi 46
binds to NC 2-fold more weakly than pool 0. There are a number of possibilities for how
this mutation reduces the binding affinity in such a dramatic fashion (see
Discussion), but the simple conclusion is that strong NC-SelPsi binding is a function of detailed features of SelPsi sequence and
structure.
To better understand how NC interacts with HIV-1 Psi, different RNAs derived from this region were also used for binding
and footprinting studies. The HIV-1 Psi region encompassing nt 632-837 (fragment B from Clever and Parslow) was used for the filter
binding experiment to obtain an apparent Kd of 58 nM, identical to the value
obtained by Clever and Parslow for this RNA (15). This fragment has been shown
to bind NC with similar affinity compared with any other larger region of HIV
RNA and it is therefore assumed to contain at least one high affinity NC binding
site (15). A smaller RNA more amenable to footprinting and mutational studies
was designed; Psi 76 (Fig. 2b) contains the region including stem-loops 2, 3 and
4 from the Psi region. We chose this region of Psi because two different studies
have shown it to be important for packaging (2,16). Psi 76 binds NC with an
apparent Kd of 138 nM, a slightly greater than 2-fold reduction in affinity
compared with the larger Psi region (Fig. 5). Psi 76 has five point mutations
compared with HIV-1 RNA (Fig. 2b). These point mutations allow better in vitro
transcription and also eliminate a possible dimerization signal in the
single-stranded region between stem-loops 2 and 3 (17).
Footprinting studies in the absence of NC are in good agreement with the
predicted secondary structure (15,16) and also agree with the structure probing
results of Clever and Parslow for this region of Psi (15). In the presence of
NC, all of the single-stranded regions are protected, whereas parts of the
double-stranded regions remain sensitive to RNase V1 (Fig. 6b). Interestingly,
only in the presence of NC does RNase V1 cleave in stem-loop 4. In the absence
of NC, the nucleotides proposed to base pair in stem-loop 4 are partially
cleaved by RNase T1, implying that stem-loop 4 is not fully stable (Fig. 6b);
this was also observed by Clever and Parslow (15). When NC binds to Psi 76, the
observed increase in stem-loop 4 cleavage by RNase V1 indicates that it is stabilized in the
presence of NC. In summary, it appears that several NC molecules interact with
Psi 76 but that the overall secondary structure is only minimally altered.
All three stem-loops within Psi 76 are purine rich and SelPsi contains two guanosines at
the end of its loop; we assume that these purines, in particular the
guanosines, are important for NC-Psi 76 binding. To test this hypothesis, we changed all three loops of
Psi 76 into stable UUCG tetraloops (29; Fig. 2c). This change results in a
nearly 3-fold decrease in binding compared with Psi 76 (Fig. 5). The structure
of this RNA remained consistent with that of Psi 76, as shown
by structure probing (data not shown). The decreased affinity confirms that the
sequence of the loop nucleotides is involved in the higher affinity interaction
between NC and Psi 76.
In an attempt to isolate a single higher affinity site from Psi, we synthesized
a 27 nt RNA containing the sequence of Psi stem-loop 3 (Fig. 2c; Psi 27). Psi 27
binds to NC with an apparent Kd of 1450 nM, the lowest affinity of any RNA in
this study. A likely reason is that Psi 27 has a strong stem and is also missing
the adjacent single-stranded regions present within the HIV Psi element. Indeed,
the construct originally used to observe higher affinity stem-loop 3 binding (Kd
200 nM) contained the single-strand regions on either side of stem-loop 3 (15).
Either these single-stranded regions are important for higher affinity binding
or the longer stem inhibits strong binding to Psi 27. Taken together, these
observations and others (15) suggest that NC not only recognizes the sequence
within loops but also utilizes the adjacent single-stranded regions to optimize
interactions with RNA.
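The fold differences quoted in the Results follow directly from the apparent Kd values, and can be restated as differences in binding free energy. The sketch below is a worked calculation under stated assumptions: the temperature is taken as 273.15 K because the binding reactions were incubated on ice, and the values are apparent Kd values from filter binding rather than rigorous equilibrium constants.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3   # gas constant in kcal/(mol K)
TEMP_K = 273.15               # ASSUMPTION: incubation on ice, taken as 0 degrees C

# Apparent Kd values (nM) reported in the text.
KD_NM = {
    "SelPsi": 2.3,
    "Psi 632-837": 58.0,
    "Psi 76": 138.0,
    "Psi 27": 1450.0,
}


def fold_weaker(kd_nM, reference_kd_nM):
    """How many fold weaker a ligand binds than the reference."""
    return kd_nM / reference_kd_nM


def ddg_kcal_per_mol(kd_nM, reference_kd_nM, temp_k=TEMP_K):
    """Free energy penalty relative to the reference: RT ln(Kd / Kd_ref)."""
    return R_KCAL_PER_MOL_K * temp_k * math.log(kd_nM / reference_kd_nM)


reference = KD_NM["SelPsi"]
for name, kd in KD_NM.items():
    print(f"{name:12s} Kd = {kd:7.1f} nM  "
          f"fold weaker than SelPsi = {fold_weaker(kd, reference):6.1f}  "
          f"ddG = {ddg_kcal_per_mol(kd, reference):4.2f} kcal/mol")
```

On this scale the 25-fold difference between SelPsi and the 632-837 Psi fragment corresponds to less than 2 kcal/mol, which illustrates how modest the energetic basis of the observed specificity is.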
This SELEX experiment is the first performed with NC and supplies important
information about the NC-RNA interaction. The selected high affinity RNA ligand also provides a
lead candidate for inhibitor studies against HIV replication, because NC binds
to SelPsi with 25-fold higher affinity than HIV Psi RNA (Table 1). High affinity
RNA ligands for the HIV proteins Rev and Tat have been previously shown to
inhibit HIV replication (30,31).
This and other work demonstrates that NC has a higher affinity for the HIV Psi
RNA than for other RNAs (2,12,15-17). Although the reason for this higher
affinity was unclear, the SELEX results suggest that NC favors stretches of
guanosines and uridines. In addition, SelPsi forms a secondary structure that
presents these favored nucleotides at the end of a long and stable stem-loop
structure (Fig. 2a). SelPsi is, however, more than a simple stem-loop; it
contains a stable lower stem, a bulge region, a short unstable
stem and a 3 nt loop. The results from the footprinting experiments indicate
that the stable lower stem is probably not in direct contact with NC. However,
it is still involved in high affinity binding, because its absence decreases
the binding affinity >100-fold (data not shown). The lower stem is likely involved in positioning or
holding the upper half of SelPsi in the correct structure.
The footprinting indicates that the upper half of SelPsi is only modestly
structured. Indeed, the addition of base pairs to its short unstable stem
(SelPsi 46) decreased NC binding by 400-fold. This result indicates that either its stability or length is critical for high
affinity binding. We favor the latter possibility, because changing two G:U
base pairs to G:C (U17 and U21) within SelPsi did not change the NC binding
affinity appreciably (data not shown). We suggest that an appropriate stem
length serves to orient the relative positions of the loop and bulge so that NC
can bridge the stem and interact optimally with both single-stranded elements (Fig. 7).
An issue that complicates interpretation of the NC-SelPsi interaction is that RNA dimerization appears to be linked to
encapsidation (1,3,32). It is therefore not certain whether NC recognizes an RNA monomer or dimer for
packaging. This issue was only indirectly addressed in these studies, as
neither SelPsi nor Psi 76 appears to dimerize, as assayed by native gel
electrophoresis (data not shown). Although NC may recognize the stem-loop 3 region or other high affinity sites in a dimer configuration, the
simplest interpretation of our results is that NC initially binds to an RNA
monomer and then may subsequently contribute to the dimerization process.
The upper half of SelPsi is very similar to the stem-loop 3 region within the
HIV Psi element (Fig. 7). Both SelPsi and this region have short stems bracketed
by two guanosines in the loop, as well as UYUUG (Y = pyrimidine) sequences on
the 5'-side of the stem (Fig. 7). Based on the SELEX results and on prior
encapsidation studies, we propose that this stem-loop 3 region is a high
affinity site for NC within the context of HIV Psi. The integrity of these three
stems is important for encapsidation efficiency (16) and the region encompassing
stem-loop 3 is able to work as a packaging signal in a heterologous RNA (33).
Taken together, the evidence indicates that the stem-loop 3 region is important
for correct packaging of HIV RNA and is also a high affinity NC binding site.
Other regions within Psi are also involved in packaging (16), suggesting that
there are additional high affinity binding sites.
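The UYUUG element mentioned above is easy to locate in any RNA sequence with a pattern search. The sketch below is illustrative only; the example sequence is hypothetical and is not taken from SelPsi or HIV-1 Psi, and the function name is an assumption.

```python
import re

# U, then a pyrimidine (C or U), then UUG. The lookahead lets overlapping
# occurrences be reported as well.
UYUUG_PATTERN = re.compile(r"(?=(U[CU]UUG))")


def find_uyuug(rna):
    """Return (1-based start position, matched sequence) for every UYUUG motif.
    DNA input is accepted and converted to RNA first."""
    rna = rna.upper().replace("T", "U")
    return [(match.start() + 1, match.group(1))
            for match in UYUUG_PATTERN.finditer(rna)]


# Hypothetical test sequence (illustration only).
TOY_RNA = "GGACUCUUGCAGGUUUUGAC"
print(find_uyuug(TOY_RNA))
```

The same search could be run against candidate regions to ask whether a UYUUG element lies on the 5'-side of a short stem, which is the arrangement proposed here for both SelPsi and stem-loop 3.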
Although the upper portion of SelPsi and the stem-loop 3 region are similar, there are also differences that may explain
the different binding affinities. The upper half of SelPsi is at the end of a
strong stem; in contrast, the stem-loop 3 region is surrounded by single-stranded RNA. The short stem of SelPsi contains a bulged uridine
and is rather unstable, whereas stem-loop 3 is canonical and stable. Also, the loops have some differences in
sequence and in the number of nucleotides: SelPsi is GGU and stem-loop 3 is GGAG. Despite these differences, the data suggest that it is
the presence of the lower stem in SelPsi that is predominantly responsible for
the 25-fold difference in binding (because deleting the lower stem of SelPsi
destroys the high affinity binding of SelPsi; data not shown).
Why isn't there a comparable high affinity structure within HIV Psi? A strong
and relatively long stem like that present in the lower half of SelPsi might
impede NC polymerization, because the lower stem might inhibit addition of
subsequent NC molecules after the initial high affinity interaction. In
contrast, the single-stranded RNA surrounding stem-loop 3 might be optimized for NC polymerization. The importance of
polymerization would explain: (i) the lower affinity of Psi for single molecule
binding; (ii) why other regions within Psi have been shown to be important for
encapsidation; (iii) the possible presence of additional high affinity binding
sites within Psi. The importance of polymerization also suggests an explanation
for the modest specificity of the NC-Psi interaction compared with other
well-studied protein-RNA interactions (see for example 27), namely high
specificity might be incompatible with polymerization. A contribution of
surrounding sequence and structure to polymerization, as well as only modest
affinity and specificity to an initial 'high affinity' site, has also been
proposed for Rev, another HIV RNA binding protein (34).
Can a factor of 10 in specificity account for selection of the correct RNA for
encapsidation or are other factors involved? One possibility is that Gag has
more specificity than NC, but this appears unlikely (15,21). Alternatively,
subcellular localization of NC and unspliced HIV RNA might enhance the in vivo
specificity for encapsidation. Finally, NC polymerization onto HIV RNA may
contribute in an as yet poorly understood way to encapsidation specificity.
This rather poor specificity of NC for Psi is undoubtedly relevant to its other
roles in the viral life cycle, in which it may function as an RNA chaperone. In
other words, high specificity may be incompatible with the catalysis of
multiple annealing and folding reactions between other RNA sequence elements.
Therefore, the multiple roles of NC may require a compromise between
specificity on the one hand and promiscuous binding reactions on the other
hand.
We thank our colleagues, especially M. Moore, A. Rein and H. Colot, for critical
reading of the manuscript. The work was supported in part by the National
Institutes of Health to M.R. (GM 23549).



REFERENCES


