Analysis of co-crystal structures to identify the stereochemical determinants of the
orientation of TBP on the TATA box
Masashi Suzuki, Mark D. Allen, Naoto Yagi1 and John T. Finch2,*
AIST-NIBHT Structural Biology Centre, Higashi 1-1, Tsukuba 305, Japan, 1Tohoku University, School of Medicine, Seiryo-machi, Sendai 980-77, Japan and 2MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK
Received February 27, 1996;
Revised and Accepted May 23, 1996
ABSTRACT
Possible stereochemical determinants of the orientation of TBP on the TATA box
are discussed using the crystal coordinates of TBP-TATA complexes, which have been determined by other groups. The C-terminal half of the TBP β-sheet interacts with the TATA site of the DNA, and the N-terminal half with the A-rich site, so that the two sites with distinct curvatures
produce a unique fit. Although chemical contacts take place between one side of the β-sheet and the DNA minor groove, the interaction seems to be facilitated
indirectly by the characteristics of the other side of the β-sheet and the DNA major groove. Thus, Ala71, Leu162 and Pro190
differentiate the curvature of the β-sheet in the N- and C-halves. The methyl positions in the DNA major groove modulate
the bendability of the two DNA sites by using differences in the rolling
capacity of TA and AT compared with PyT, and in the shifting capacity of AT
compared with TT. The deformations of the first steps (TA and PyT) in the two
sites are the largest and thus are important for the overall bending of the
DNA. The differences between the two DNA sites are greatest at the second steps
(AT and TT) and so these are important for determining the orientation of TBP.
INTRODUCTION
In most eukaryotic systems of protein transcription, the TATA box is positioned
upstream of the transcription initiation site (1-9). Half of its sequence, 5'-TATA-3', is well conserved (referred to as the TATA site in
this paper), while the other half is less so but generally has many adenine
bases (referred to as the A-rich site).
Upon initiating transcription, the TATA box-binding protein, TBP, binds to the DNA, contacting the TATA site with its C-terminal domain and the A-rich site with its N-terminal domain. This polarity determines the direction
of transcription through further interaction between TBP and RNA polymerase II.
It is possible that other proteins which interact with TBP help to
fix the orientation, but it is more likely that TBP is able to do
so by itself, since the orientation is the same in the two TBP-DNA complexes co-crystallised in the absence of such proteins [the two structures
are referred to as the Yale (10) and Rockefeller (11) structures].
It has been noticed that van der Waals contacts are important for TBP-TATA interaction (10,11), and it has been suggested that possible differences in the flexibility of the two
DNA sites might be important for fixing the orientation (10-12). However, no stereochemical characteristic of the DNA or TBP involved in such
a mechanism has been specified. On the contrary, the structures have provoked a
number of puzzling questions.
The N- and C-domains of TBP have the same composition of secondary structural
elements and very similar three dimensional structures, and thus it is not
immediately obvious why the two domains can choose different partner sites. In
fact, the chemical contacts between the TBP domains and the DNA sites are quite
similar in the two sets. In addition, the TATA and A-rich sites are moulded into similar structures, while the two sequences
are expected to behave in very different ways (12,14-16).
In this paper, to answer the above questions, we re-analyse the co-crystal structures using their atomic coordinates. The first step is
to examine whether the two halves of the complexes are indeed as similar as
they appear.
COMPARISON OF THE TWO HALF STRUCTURES
Each domain of TBP is composed of two α-helices and five β-strands. The ten β-strands of the two domains fold into a
single β-sheet, and eight of the ten β-strands fit closely around the minor groove of the
DNA (Fig. 1a).
SOME OTHER CHARACTERISTICS OF THE DNA
The T3A4 (Rockefeller and Yale) and T3A4 (Yale) steps show high positive sliding. A gap is found between two lines of
the hydrophobic residues (circled in Fig. 1c) which contact the C2'H atoms of A bases, i.e. Val(aa39, aa119), Val(aa80, aa171) and Leu(aa72, aa163). The
positive sliding of the dinucleotide steps seems to be used to cross over this gap.
In general, TA is the only step among the A/T-rich sequences which can slide to a large extent in the
positive direction (20; see also Fig. 5d, e and f to understand how positive sliding would make the methyl group clash
against the nearby sugar-phosphate backbone at TT and AT). This explains the high sliding found
for the TA steps. Smaller sliding of T3T4 is discussed later in this section.
Figure 5. Rolling of a step and propeller twisting of the base pair. (a, c and d) Positive rolling of a dinucleotide step (a) is followed by positive (c) or
negative (d) propeller twisting, depending on which DNA strand the T base is on,
so that the methyl group avoids approaching the neighbouring base pair. (b) Propeller twist angles of base pairs calculated for the Yale (Y) and
Rockefeller (R) DNA structures.
An A·T base pair has a tendency to show high propeller twisting (27). At the two highest positively rolled steps, T1A2 and (T/C)1T2, the A2·T7
base pair has high positive propeller twisting, and T7·A2 has negative propeller twisting (Fig. 5c). Thus, regardless of which strand the T base is on, an A·T base pair propeller twists so that the T base moves away from the
nearby base pair in the major groove (Fig. 5c and d). As a consequence, the partner A base comes closer to the nearby base
pair, but this is less problematic as the A base is slimmer.
The nucleotide sequences of the TATA box which are found in real eukaryotic
transcription systems (28) and those which are strongly bound by TBP in vitro (13) are similar as a whole, except for position 5, which is frequently occupied by
an A base in vivo but by a T base in vitro. Such a discrepancy might not be so surprising, since the in vitro
sequence is important only for binding but the in vivo
sequence is important also for fixing the orientation of TBP. Thus a symmetric
and flexible sequence, TATATATA, is a good in vitro binding site but is found less frequently in vivo.
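The symmetry argument can be illustrated with a short Python sketch, which is not part of the original analysis. It counts the positions at which a sequence differs from its own reverse complement; a count of zero means that both strands read identically 5' to 3', so the sequence by itself offers no polarity. The function names are hypothetical, and the A-rich comparison sequence TATAAAAG is chosen here purely for illustration.

```python
# Watson-Crick complements of the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def asymmetry(seq: str) -> int:
    """Count positions at which seq differs from its reverse complement."""
    return sum(a != b for a, b in zip(seq, reverse_complement(seq)))


if __name__ == "__main__":
    # TATATATA is the symmetric sequence named in the text;
    # TATAAAAG is an illustrative A-rich choice (an assumption).
    for seq in ("TATATATA", "TATAAAAG"):
        print(seq, reverse_complement(seq), asymmetry(seq))
```

A score of zero for TATATATA reflects only sequence symmetry; it says nothing about binding strength or about the deformability of individual steps, which are treated in the structural analysis.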
The DNA sequence in the Yale structure is of the in vitro type, TATATAA(G/A), while that in the Rockefeller structure is of the in vivo type, TATAAAA(G/A). As a whole, because of the more symmetric nature of the nucleotide
sequence, the Yale DNA structure is more symmetric than the Rockefeller DNA
structure (compare the roll angles and the slide distances at T5A6 and A5A6 in Fig. 3). Thus, for transcriptional regulation the less symmetric Rockefeller sequence
seems to be preferable.
The A4T5 and T5A6 steps in the Yale structure have conformations distinctly different from each
other: the former mainly rotates, while the latter slides (Fig. 3). However, the A4A5 and A5A6
steps in the Rockefeller structure behave in similar ways, probably up to the
limit of the freedom of movement of an AA step. The above arguments can explain
why the TATATAA(G/A) sequence is a better binding site but is less frequently
used in vivo.
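The step comparison above can be made explicit with a minimal Python sketch. It labels each dinucleotide step with 1-based positions in the style used in the text (for example T1A2) and lists the steps at which two sequences differ. The function names are hypothetical, and only the first seven positions are used because the eighth is given as (G/A).

```python
def steps(seq: str) -> list[str]:
    """Label each dinucleotide step with 1-based positions, e.g. 'T1A2'."""
    return [
        f"{seq[i]}{i + 1}{seq[i + 1]}{i + 2}"
        for i in range(len(seq) - 1)
    ]


def differing_steps(a: str, b: str) -> list[tuple[str, str]]:
    """Return pairs of step labels at which the two sequences differ."""
    return [(x, y) for x, y in zip(steps(a), steps(b)) if x != y]


# First seven positions of the two crystallised sequences, as quoted above.
YALE = "TATATAA"
ROCKEFELLER = "TATAAAA"

if __name__ == "__main__":
    print(steps(YALE))
    print(steps(ROCKEFELLER))
    print(differing_steps(YALE, ROCKEFELLER))
```

Under these assumptions the two sequences differ only at the steps spanning positions 4-5 and 5-6, which are the steps compared in the text.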
CONCLUSION
The minor groove side of an A·T base pair is smoother than that of G·C (note N2'H2 of G). Thus, for making van der Waals contacts with TBP, an A/T-rich sequence is appropriate. For fixing the right orientation of TBP, the
two halves of the TATA box are differentiated, one half into the flexible TATA
sequence and the other into a less flexible A-rich sequence, by arranging T bases differently. Positioning of the methyl
groups on the inner surface of the complex is correlated with the positioning
of large/small amino acid residues on the outer surface through the
bendability/curvature of the two molecules. Many characteristics of the TBP-binding sites can be explained consistently by focusing attention on
steric hindrance of the methyl groups of the T bases.
ACKNOWLEDGEMENTS
We thank Dr C. Chothia for stimulating discussion on the curvature of the β-sheet. We thank Ms M. Iimura for her help in preparing figures. The
coordinates of the Rockefeller structure were kindly provided by Prof. Burley,
and we also thank the Yale group, whose coordinates are deposited in the
Protein Data Bank (29), PDB code 1YTB.
REFERENCES
1 Van Dyke,M.W., Roeder,R.G. and Sawadogo,M. (1988) Science, 241, 1335-1338.