© 1996 Oxford University Press 766-775

Binding site analysis of c-Myb: screening of potential binding sites by using the mutation matrix derived from systematic binding affinity measurements

Qiao-Lin Deng, Shunsuke Ishii and Akinori Sarai*

Tsukuba Life Science Center, The Institute of Physical and Chemical Research (RIKEN), 3-1-1 Koyadai, Tsukuba, Ibaraki 305, Japan

Received August 31, 1995; Revised and Accepted December 20, 1995

ABSTRACT

The c-Myb oncoprotein is known to bind to multiple sites in the promoters of target genes. We have developed a protocol to screen for c-Myb binding sites by using systematic binding data derived from measurements of binding affinity for an oligonucleotide containing a known Myb-binding site and its complete set of single mutants. We first applied the method to predict the binding affinity for the known binding sites and compared the predictions with available experimental data. The predicted binding sites agree with many putative binding sites of known target promoters. However, there are some binding sites not predicted by the analysis. These sequences deviate from the consensus sequence derived from the binding analyses. In the light of the structure of the Myb-DNA complex, these results indicate that different DNA-binding modes may be used by c-Myb to recognize different classes of binding sites. We also screened the sequence database for potential Myb-binding sites, and found sequences of several promoters that have not been identified experimentally but could be targets for c-Myb.

INTRODUCTION

c-myb is a proto-oncogene encoding a nuclear protein (c-Myb) that binds to DNA (1,2). Its analog, v-myb, also encodes a nuclear protein, which is characterized by N- and C-terminal truncations compared with c-Myb. The c-Myb protein is thought to function as an activator as well as a repressor of transcription (reviewed in 3,4), regulating genes important for cell growth and development. Myb expression was initially thought to be largely restricted to the hematopoietic system, but it has recently been shown to be involved in regulating the proliferation and differentiation of various cell types other than hematopoietic cells (5,6).

The c-Myb protein comprises three domains responsible for DNA binding, transcriptional activation and negative regulation (7). The DNA-binding domain of Myb is located at the N-terminal side and consists of three homologous tandem repeats of 51 or 52 amino acids (designated R1, R2 and R3 from the N terminus). Each repeat has three conserved tryptophans spaced 18 or 19 residues apart. This repeat structure with conserved tryptophans seems to be a general motif used by many different transcription activators found in a wide spectrum of eukaryotes, including vertebrates, insects, yeast, cellular slime mold and higher plants (4,8,9). Among the three repeats of the DNA-binding domain, R1 can be deleted without significant loss of DNA-binding activity (7,10) and plays a minor role in sequence recognition. Repeats R2 and R3 are necessary and sufficient for the recognition of specific DNA sequences (2,7,10). Thus, the minimum DNA-binding domain is the R2R3 fragment.

NMR analysis of the Myb DNA-binding domain revealed that the three repeats in the DNA-binding domain have similar overall structures, each containing three helices (11-13). The second and third helices form a variant of the helix-turn-helix (HTH) motif, which contains a longer turn than the one in the prototypical HTH motif. NMR analysis of the Myb-DNA complex showed that R2 and R3 are closely packed in the major groove, with the third helix of each repeat acting as the recognition helix (12). In contrast, R1 was shown to have no specific interactions with DNA (12,14).

The c-Myb protein can activate or repress the transcription of several potential target genes. So far, the known target genes include the promyelocyte-specific mim-1 gene (15), the cdc2 gene (16), the gene encoding the PR264 splicing factor (17), the c-myc proto-oncogene (18,19) and c-myb itself (20), as well as the long terminal repeat (LTR) promoters of human immunodeficiency virus type 1 (HIV-1) (21) and human T-lymphotropic virus type I (HTLV-I) (22). Other genes trans-activated by c-Myb include CD4 (23), CD34 (24) and T cell receptor δ (25). In addition, c-Myb also binds to c-erbB-2 (26), which is the only gene known so far to be repressed by c-Myb.

The Myb-recognition element (MRE) was originally defined as YAACKG, where K stands for G or T, as derived from comparisons of isolated chicken DNA-binding sites for v-Myb (27). Subsequently, two extended consensus sequences, the 9-bp YGRCGTTR motif (28), where Y and R denote pyrimidines and purines, respectively, and the 8-bp YAACKGHH motif (29), where H denotes A, C or T (i.e. not G), were obtained from binding-site selection protocols.

In order to characterize quantitatively the sequence specificity of c-Myb binding, we have carried out an extensive binding analysis by using the Myb-binding site, MBS-I, in SV40 (30). By using a synthesized oligonucleotide containing this binding site, the binding affinities to the c-Myb DNA-binding domain were measured by filter-binding assay and the effects of systematic base-pair substitutions on binding affinity were examined (14). Such analyses have provided valuable information about the location and energetic contributions of specific interactions (31,32). The mutational analyses have shown that the specific interactions are not uniformly distributed in the TAACTGAC region of MBS-I; the 2nd A, the 4th C and the 6th G are involved in very specific interactions with Myb, whereas the interactions at the 3rd A and the 8th C are less specific (14). A set of binding affinity changes or binding free energy changes, ΔΔG, due to base substitutions defines a consensus recognition sequence in a quantitative sense.

Although c-Myb is thought to regulate the transcription of multiple genes, there is no efficient method to identify its target genes. In this paper, we apply the above binding analysis to approach this problem, by using the experimental ΔΔG data to predict putative binding sites. In order to test the predictive capacity of this method, we first examined the binding sites in the known target genes and compared these with available experimental data. We have also applied the protocol to seek potential Myb-binding sites in the sequence database, and found a number of binding sites in the promoters of genes that have not yet been verified as targets of Myb by experiments but may be potentially important for the function of c-Myb. Finally, we discuss the multiple-binding mechanism of Myb by comparing the present results with available structural and functional information about Myb.

METHOD OF CALCULATIONS

Previously we carried out an extensive binding analysis on c-Myb (14). We introduced complete single mutations at each position of the 22mer oligonucleotide containing the MBS-I site in SV40, and measured binding affinities by filter-binding assays. The binding affinity ratio, K(wild)/K(mutant), or the binding free energy change, ΔΔG = RT ln[K(wild)/K(mutant)], between the original MBS-I and its mutants can provide information about the location and magnitude of specific interactions within the binding region, as has been shown for Cro and λ repressor (31,32).

The ΔΔG values for each base position with respect to the three substituted bases define a matrix, as shown in Table 1, which can be used to calculate the binding strength of any sequence as long as the ΔΔG values are independent (see Discussion). The binding strength for a given segment of sequence can be calculated simply by summing the individual ΔΔG contributions at each base position (the total free energy change will be denoted as ΔΔG_tot). Note that positive values of ΔΔG mean that a mutation weakens the binding, and every 10-fold change in the binding affinity corresponds to a 1.3 kcal/mol change in ΔΔG. The ΔΔG_tot for each segment in a certain region of DNA can be calculated by moving the window, base by base, along the sequence. In the calculations, both directions were considered, in order to include both strands. In screening for potential Myb-binding sites, all the human genes in the sequence database (primate subsection) were searched for potential Myb target sites in promoters and enhancers.
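
The window-by-window summation described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the screening program used in this work: the matrix is copied from Table 1, while the function names, the 0-based window coordinates, the default threshold of 3 kcal/mol and the inclusive comparison against the threshold are assumptions made for the example.

```python
# ΔΔG matrix from Table 1 (kcal/mol); one row per position 1-11,
# columns in the order G, A, T, C as in the table.
DDG_ROWS = [
    (0.67, 0.46, 0.00, 0.10),
    (4.12, 0.00, 3.97, 3.88),
    (1.54, 0.00, 1.31, 1.02),
    (3.60, 3.51, 3.75, 0.00),
    (-0.04, 0.57, 0.00, 0.27),
    (0.00, 4.20, 4.32, 4.29),
    (0.09, 0.00, 0.05, -0.29),
    (4.20, 0.22, 0.54, 0.00),
    (0.25, 0.00, -0.01, 0.38),
    (0.22, -0.01, 0.12, 0.00),
    (-0.10, 0.00, 0.02, -0.02),
]
MATRIX = [dict(zip("GATC", row)) for row in DDG_ROWS]
WIDTH = len(MATRIX)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def ddg_tot(window):
    """Sum the per-position ΔΔG values for one 11-base window (additivity assumed)."""
    return sum(MATRIX[i][base] for i, base in enumerate(window))


def scan(seq, threshold=3.0):
    """Slide the window base by base and score both strands.

    Returns (start, strand, ΔΔG_tot) for every window whose ΔΔG_tot is at or
    below the threshold; start is the 0-based left end of the window on the
    input strand. Windows containing bases other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - WIDTH + 1):
        window = seq[start:start + WIDTH]
        if not set(window) <= set("ACGT"):
            continue
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            score = ddg_tot(site)
            if score <= threshold:
                hits.append((start, strand, round(score, 2)))
    return hits
```

For the wild-type 11mer implied by the zero entries of Table 1 (TAACTGACACA), `ddg_tot` returns 0 by construction, and the same site placed on the opposite strand is reported by `scan` with strand "-".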

Table 1. Relative binding free energy changes ΔΔG (kcal/mol) for the binding of Myb R1R2R3 to MBS-I with base substitutions (14)

Position      G       A       T       C      Preference
 1          0.67    0.46    0.00    0.10     y (T or C)
 2          4.12    0.00    3.97    3.88     A
 3          1.54    0.00    1.31    1.02     a
 4          3.60    3.51    3.75    0.00     C
 5         -0.04    0.57    0.00    0.27     k (G or T)
 6          0.00    4.20    4.32    4.29     G
 7          0.09    0.00    0.05   -0.29     N (any base)
 8          4.20    0.22    0.54    0.00     H (not G)
 9          0.25    0.00   -0.01    0.38
10          0.22   -0.01    0.12    0.00
11         -0.10    0.00    0.02   -0.02

The letters in the last column show the Myb binding-site motif derived from the ΔΔG values. Upper-case letters indicate higher specificity, while lower-case letters denote less specific preferences.

RESULTS

Screening of binding sites for known target genes

Myb protein has been shown to bind within the promoter regions of several potential target genes, including the promyelocyte-specific mim-1 gene (15), the cdc2 gene (16), the gene encoding the PR264 splicing factor (17), the c-myc proto-oncogene (18,19), c-myb itself (20), the LTR sequences of HIV-1 (21) and HTLV-I (22), c-erbB-2 (26), and in the enhancer domain of SV40 (30). These known target genes were screened by the present protocol to look for putative binding sites, and the results were then compared with experimental data. All sequence numbering follows the original GenBank entry, and the notation for the putative binding sites follows the corresponding references.

SV40. Nakagoshi et al. (30) found that binding of c-Myb to the simian virus 40 (SV40) enhancer stimulates transcription. There is a c-Myb-binding site, MBS-I, in the enhancer domain of SV40. We calculated ΔΔG_tot along the whole SV40 sequence. Plotted in Figure 1A is <ΔΔG_tot> - ΔΔG_tot, so that the larger its value, the stronger the affinity. The highest point is at position 256, corresponding to the binding site MBS-I in SV40 (30). In Figure 1B, we plot the possible binding sites whose ΔΔG_tot is lower than the threshold value, ΔΔG_threshold. Here, the binding sites on the opposite strand are also included. This figure shows the putative binding sites more clearly. Figure 1C shows the histogram of binding for the entire SV40 sequence, where the ΔΔG_tot for the MBS-I fragment (shown by an arrow) is located near the end of the tail of the distribution. This shows the high specificity of c-Myb binding for the MBS-I site.


Figure 1. (A) Binding free energy of c-Myb along the sequence of SV40. The ordinate axis is <ΔΔG_tot> - ΔΔG_tot, where the larger the value, the stronger the binding. The MBS-I site is marked. (B) Possible Myb-binding sites in the SV40 enhancer region predicted by the binding-site screening. Here, the ordinate axis is ΔΔG_threshold - ΔΔG_tot, where ΔΔG_threshold is 3 kcal/mol. The binding sites of both strands are shown. This plot shows the putative binding sites more clearly than (A). (C) Histogram of binding for the entire SV40 sequence (GenBank accession number J02400). In this histogram, there are four peaks, centered at energies of 6, 10, 14 and 18 kcal/mol and spaced 4 kcal/mol apart. This characteristic is due to the ΔΔG for each mutation (see Table 1). In the MBS-I sequence, positions 2, 4 and 6 are A, C and G, respectively. Any mutation at these positions causes ΔΔG to be as high as ~4 kcal/mol. Position 8 is C, and ΔΔG is 4.2 kcal/mol if it is mutated to G. All the other positions are much less sensitive to mutation, with quite small ΔΔG values. These insensitive positions produce a baseline with energy ranging from 1 to 2 kcal/mol. Random sequences can have one, two or three mutations among positions 2, 4 and 6, which produces peaks at energies around 6, 10 and 14 kcal/mol. If at the same time position 8 is G, these peaks are shifted to 10, 14 and 18 kcal/mol, respectively. Hence, there are four peaks in the histogram. The peaks at 10 and 14 kcal/mol are higher than the other two, because these peaks always exist regardless of the mutation at position 8.
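
The relative peak heights described in this caption can be reproduced with a simple counting argument. The sketch below assumes a random sequence with equal base frequencies (an idealization, not a measured property of SV40), a common baseline of 2 kcal/mol and a uniform cost of 4 kcal/mol for each major mismatch; the constants and the function name are illustrative.

```python
from fractions import Fraction
from itertools import product

BASELINE = 2  # kcal/mol, upper end of the 1-2 kcal/mol baseline (assumed)
STEP = 4      # kcal/mol, approximate cost of each major mismatch (assumed)


def peak_weights():
    """Return {approximate energy: fraction of random windows} for the toy model.

    A mismatch at position 2, 4 or 6 has probability 3/4 (three of four bases
    are non-consensus); G at position 8 has probability 1/4.
    """
    weights = {}
    for miss2, miss4, miss6, g_at_8 in product((False, True), repeat=4):
        prob = Fraction(1)
        for miss in (miss2, miss4, miss6):
            prob *= Fraction(3, 4) if miss else Fraction(1, 4)
        prob *= Fraction(1, 4) if g_at_8 else Fraction(3, 4)
        energy = BASELINE + STEP * (miss2 + miss4 + miss6 + g_at_8)
        weights[energy] = weights.get(energy, Fraction(0)) + prob
    return weights


for energy, weight in sorted(peak_weights().items()):
    print(energy, weight)
```

Under these assumptions the weights at 6, 10, 14 and 18 kcal/mol are 7/64, 45/128, 27/64 and 27/256, so the two middle peaks dominate, in line with the histogram. The small remaining weight (3/256) sits at the baseline and corresponds to windows that match the consensus at positions 2, 4 and 6 and do not carry G at position 8.
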

Mim-1. Ness et al. (15) found that the v-myb oncogene product binds to and activates the promyelocyte-specific mim-1 gene of chicken. In the myeloid lineage, expression of the c-myb and mim-1 genes is perfectly correlated. The promoter of the mim-1 gene, ranging from 444 to 686, contains a cluster of three closely spaced binding sites for v-Myb. A DNase I footprinting assay showed that bacterially expressed v-Myb binds to the three sites centered at positions A(541), B(502) and C(480) with different affinities. Sites A and B showed stronger Myb-binding activity than did site C.

Figure 2A shows the results from the binding-site search. The three sites A, B and C were found among the screened sites, agreeing with the DNase I footprinting (15). Interestingly, sites A and B are accompanied by minor sites at their 3' side. This is consistent with the higher binding affinities of these sites. The three binding sites are located within an 80-bp region, and separated by 40 and 20 bp, suggesting that they may face the same side of the double helix. Such features may be exploited to interact with other factors in a cooperative manner. Figure 2B shows the binding histogram for all the segments in the mim-1 gene. The three arrows indicate the locations of the three putative binding sites. All three sites are located within the lowest 1% of the ΔΔG_tot distribution.


Figure 2. (A) Possible Myb-binding sites in the mim-1 gene promoter region predicted by the binding-site screening. (B) Histogram of binding for the entire mim-1 gene (GenBank accession number M29448).

C-myc. c-myc is one of the genes known to be trans-activated by c-Myb (18,19). Transcription of c-myc starts from at least three initiation sites with distinct promoters P0, P1 and P2. Zobel et al. (18) analyzed the binding activities of the purified proteins with five DNA restriction fragments of the regulatory region by gel-shift assays. The five segments used were F1(2945-3128), F2(2037-2173), F3(1074-1269), F4(875-982) and F5(484-632). The results showed that the affinities for the five segments are different: binding to F3 was the strongest; binding affinities to F1, F4 and F5 were similar, and ~10-fold lower than to F3; and binding to F2 was the lowest. The DNase I footprinting assay identified eight binding sites, denoted as F1-H, F1-L; F3-H, F3-L; F4-H, F4-L and F5-H, F5-L, respectively.

From the binding-site screening of the c-myc gene, seven out of the eight binding sites were found (Fig. 3A). The only missing one is F4-L. In F2, three sites were found but with very low affinity, agreeing with the gel-shift result showing that F2 has very low binding affinity. F3 has five sites, among which the two strongest indeed correspond to F3-H and F3-L. F3-H has a 2-fold symmetric sequence with two AACs, and F3-L is accompanied by a minor site at the 5' side. Thus, these results agree nicely with the experimental observation that F3 is the strongest Myb-binding fragment (18). Zobel et al. (18) used F4 as a negative control, but found its binding activity to be similar to those of F1 and F5 by gel-shift assays. The present screening indeed detected F4-H as a binding site with intermediate binding strength.


Figure 3. (A) Possible Myb-binding sites in the c-myc promoter region predicted by the binding-site screening. (B) Histogram of binding for the whole c-myc gene (GenBank accession number X00364).

Zobel et al. (18) did not examine c-Myb binding sites in the regions (1269-2037) and (2173-2945). On the other hand, Nakagoshi et al. (19) demonstrated by deletion analysis that a region (1988-2840) of the c-myc promoter (see dotted line in Fig. 3A) is sufficient for the Myb-induced trans-activation. They found 10 binding sites (designated as 1-10), which were almost completely protected by 100 ng of c-Myb in the DNase I footprinting analysis. Five out of the 10 sites were found to be strong to medium binding sites by the present binding-site screening (Fig. 3A).

HTLV LTR. Myb activates HTLV-I LTR-mediated transcription (22). Dasgupta et al. (22) generated four overlapping fragments of the HTLV-I LTR, denoted as fragment 1 (1-213), fragment 2 (205-447), fragment 3 (426-647) and fragment 4 (545-755). Gel-shift assays showed that fragments 2 and 4 had strong Myb-binding activity, while fragment 3 showed weak binding and fragment 1 exhibited very little binding activity. DNase I footprinting analysis of fragments 2, 3 and 4 indicated four binding sites located at positions 329-349 (H1), 400-420 (L1), 583-601 (L2) and 683-702 (H2), where H and L denote high and low affinity, respectively. Both fragments 2 and 4 contain two binding sites, and fragment 3 contains only one binding site.

From the present binding-site screening, three out of the four binding sites were found (Fig. 4A). The three strongest sites fall within fragment 2, and the two strongest binding sites correspond to H1 and L1 mentioned above, consistent with the observed strong binding activity of this fragment. Both fragments 3 and 4 contain several medium-affinity sites, among which the strongest one corresponds to L2 and belongs to both fragments. However, the present analysis failed to pick up the high-affinity binding site H2 in fragment 4. This is because the eighth base in the CAACGGAG sequence of the H2 site is G, which is unfavorable according to the binding analysis (14; see also Table 1).
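
As a rough check on this explanation, the contributions of the eight listed bases of the H2 site can be summed directly from Table 1. Bases 9-11 of the site are not given here, so they are left out of the sum; their entries in Table 1 add up to between -0.12 and +0.62 kcal/mol, depending on the bases present.

```latex
\begin{aligned}
\Delta\Delta G_{1\text{--}8}(\mathrm{CAACGGAG})
  &= 0.10 + 0.00 + 0.00 + 0.00 + (-0.04) + 0.00 + 0.00 + 4.20 \\
  &= 4.26\ \mathrm{kcal/mol}
\end{aligned}
```

Almost all of the penalty comes from G at position 8 (4.20 kcal/mol), and the partial sum already exceeds the 3 kcal/mol threshold quoted for Figure 1B, assuming the same threshold was applied in this screen.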


Figure 4. (A) Possible Myb-binding sites in the HTLV LTR promoter region predicted by the binding-site screening. (B) Histogram of binding for the whole HTLV LTR sequence (GenBank accession number M81248).

Dasgupta et al. (21) also showed that there is one Myb-binding motif YAACKG in the HIV LTR sequence. They carried out a similar analysis by generating overlapping DNA fragments, in which several putative Myb-binding sites were identified. The present binding-site screening qualitatively agrees with these experimental results (not shown).

PR264. Human PR264 promoter sequences contain several MREs (17). DNase I footprinting analysis revealed 11 Myb-binding sites (denoted A-K). Figure 5A shows the result from the binding-site screening. Out of the 11 binding sites, nine were found by the screening. Table 2 compares the calculated results with each experiment (17). The calculated results fit well with those from gel-shift and footprint experiments. Site A was indicated by the experiments to be the weakest binding site among all the sites (17). In the screening, ΔΔG_tot for this site exceeds 3 kcal/mol (Fig. 5B), and thus the site was not selected by the prediction. Sites B, C and E are the three strongest sites in Figure 5A and form the leftmost cluster in the binding histogram (Fig. 5B), in agreement with the observed strong activity (17), whereas sites D, F, J and K are weaker than those three sites and form the second cluster in the histogram. This agrees with the observed weaker binding affinity (17). On the other hand, sites G, H and I, which were indicated as strong binding sites by DNase I footprinting assays, were predicted to be weak binding sites. In the case of site H, the fourth position of AAAGCGTT is G instead of the consensus C.


Figure 5. (A) Possible Myb-binding sites in the PR264 promoter region predicted by the binding-site screening. (B) Histogram of binding for the PR264 sequence (GenBank accession number L03693).

Cdc2. Ku et al. (16) showed that there are two MREs in a 465 bp 5'-flanking sequence of the cdc2 gene, which exhibited promoter activity as revealed by transient CAT expression analysis. The two putative MREs are located very near position 660. However, these binding sites were not detected by the present binding-site search. The sequences of the two binding sites are TAACTATA and TAACCCTA, where the sixth positions are not G, deviating from the general motif yAaCkG. Thus, ΔΔG_tot for these two sites is ~5 kcal/mol. These two sites are spaced very closely, with only 14 bp between the centers of the binding sites. They are also located on the same strand. This may indicate that when Myb binds to adjacent binding sites cooperatively, the recognition at the 6th position may become redundant (see Discussion). There are some binding sites with medium strength in other regions of the cdc2 promoter (such as at positions 803 and 833), which may also contribute to trans-activation (see the next section).
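
The value of ~5 kcal/mol can be checked against Table 1 by summing the entries for the eight listed bases of each site. Only the non-zero terms are written out; bases 9-11 are not given here and would shift each sum by between -0.12 and +0.62 kcal/mol.

```latex
\begin{aligned}
\Delta\Delta G_{1\text{--}8}(\mathrm{TAACTATA})
  &= \underbrace{4.20}_{\text{A at 6}} + \underbrace{0.05}_{\text{T at 7}} + \underbrace{0.22}_{\text{A at 8}}
   = 4.47\ \mathrm{kcal/mol} \\
\Delta\Delta G_{1\text{--}8}(\mathrm{TAACCCTA})
  &= \underbrace{0.27}_{\text{C at 5}} + \underbrace{4.29}_{\text{C at 6}} + \underbrace{0.05}_{\text{T at 7}} + \underbrace{0.22}_{\text{A at 8}}
   = 4.83\ \mathrm{kcal/mol}
\end{aligned}
```

In both cases the substitution at position 6 accounts for more than 4 kcal/mol of the total, which is why these sites fall outside a 3 kcal/mol threshold.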

Table 2. Comparison of results from screening with experiments for the PR264 gene (17)

Binding site    ΔΔG_tot    Gel shift    Footprint
A               4.90       +            +
B               0.40       ++           +++
C               0.19       +++          +++
D               0.98       nt           ++
E               0.02       nt           +++
F               0.76       nt           ++
G               1.67       nt           +++
H               4.82       nt           +++
I               2.17       nt           +++
J               1.22       nt           ++
K               1.22       nt           ++

ΔΔG_tot values are expressed in kcal/mol. The larger the value, the weaker the binding. The experimental relative affinity for each MRE is indicated by +/++/+++ (nt, not tested) (17).

C-myb. The c-myb gene itself is autoregulated by Myb protein (20). Nicolaides et al. (20) found multiple MREs in the region from 71 to 112, and site I (centered at 110) and site II (centered at 96) are likely to be functionally important. However, neither site I nor site II was found in the present search. As in the cdc2 gene, the two sites contain the sequence TAACNT, which deviates from the consensus at the sixth G position, and have similar ΔΔG_tot values of ~5 kcal/mol. Also, these two sites are located very close to each other, with 14 bp between the centers of the binding sites. Thus, the same reasoning as above may explain the discrepancy. Although these two sites were not found in the screening, other sites such as the one centered at 149 were found, which may contribute to c-myb autoregulation.

C-erbB-2. While c-Myb activates the genes mentioned so far, it can also repress a promoter containing MREs. c-erbB-2 is a natural target gene and the only one known so far to be repressed by c-Myb. Mizuguchi et al. (26) showed that the DNA-binding domain of c-Myb was required and sufficient for trans-repression of the c-erbB-2 promoter activity. DNase I footprinting analyses revealed that six sites (I-VI) were almost completely protected. Among them, sites I and III are suggested to be responsible for c-Myb-induced trans-repression. From the present binding-site search, only sites II and IV were found. The other four sites could not be detected because of their deviation from the consensus sequence at the sixth G position; these binding sites have A, T or C at this position.

Screening of potential Myb target genes from the sequence database

c-Myb is thought to be involved in regulating the transcription of multiple genes. So far, only a limited number of target genes have been demonstrated (15-26). Identification of the target genes is important for understanding the role of transcription factors in many biological events such as development. However, we have had no efficient method to identify them. Usually, differential screening using cells that express or lack the specific transcription factor is used to identify target genes. However, finding target genes is difficult, as expression can be rapidly induced or modulated only at a low level. Furthermore, this method is fairly laborious. A rapid screening procedure that identifies candidate target genes would therefore be very useful. Thus, we have developed and applied the present protocol to screen the sequence database for unknown but potentially important Myb-binding sites.

We have searched the database (primate subsection) for human genes with putative Myb-binding sites in regulatory regions that have not yet been identified experimentally. We first examined different criteria for the screening, as shown in Table 3. Because Myb-binding sites usually occur in multiples, we also incorporated such a condition into the screening. The multi-site condition greatly reduces the number of hits, as shown in Table 3. We started with the large pool of binding sites listed in Table 3, and narrowed it down to only those binding sites in regulatory regions. Then, we further selected only those binding sites that have not been verified by experimentation but may be interesting for the functional aspects of c-Myb, as shown in Table 4 (the screening program and a library of potential Myb-binding sites listing all entries and binding sites in Table 3 are available upon request).
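
The multi-site condition can be illustrated with a short Python sketch. How the region size L was measured in the actual screening program is not specified here, so the sketch assumes that a cluster counts when the first and last of the required sites lie no more than L bp apart; the function name and parameters are hypothetical.

```python
def has_cluster(positions, min_sites=2, span=50):
    """Return True if at least `min_sites` positions fall within `span` bp.

    `positions` are the coordinates of predicted binding sites in one
    sequence entry. Assumption: the distance is measured between the first
    and last site of the group and compared inclusively with `span`.
    """
    pos = sorted(positions)
    for i in range(len(pos) - min_sites + 1):
        # pos[i] .. pos[i + min_sites - 1] is the tightest group of
        # min_sites sites that starts at pos[i].
        if pos[i + min_sites - 1] - pos[i] <= span:
            return True
    return False


# Site coordinates listed for cyclin B1 in Table 4.
cyclin_b1 = [49, 816, 823, 829, 832]
print(has_cluster(cyclin_b1, min_sites=4, span=50))  # four sites span 16 bp
print(has_cluster(cyclin_b1, min_sites=5, span=50))  # site 49 is too far away
```

With the cyclin B1 coordinates, the four sites between 816 and 832 satisfy a four-site condition with L = 50, whereas a five-site condition fails because the site at 49 lies far outside the window. Raising the required number of sites or shrinking L makes the filter stricter, which matches the trend in Table 3.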

Table 3. Screening of the binding sites by c-Myb with various screening conditions

Threshold ΔΔG_threshold (kcal/mol)    No. of binding sites    L (a)    No. of hits (b)
0.0                                   1                       -        2079
1.0                                   1                       -        20 224
2.0                                   1                       -        30 145
3.0                                   1                       -        32 177
4.0                                   1                       -        32 719
0.0                                   2                       25       8
0.0                                   2                       50       11
0.0                                   2                       100      22
0.0                                   2                       500      76
1.0                                   2                       50       5265
1.0                                   3                       50       446
1.0                                   4                       50       24
1.0                                   5                       50       1
2.0                                   2                       50       22 521
3.0                                   2                       50       28 690

(a) L represents the size of the region within which the specified number of binding sites are found.
(b) The search was carried out for all sequences in the primate section of GenBank (Rel. 91). Thus, the numbers shown here represent all hits, including the sites within coding regions.

These candidate genes can be classified into seven groups based on the function of their gene products, as shown in Table 4. Mammalian c-Myb is required for the G1/S transition in the cell cycle, and Drosophila Myb has recently been speculated to be necessary for the G2/M transition (33). In this sense, the four candidates encoding cyclins and cdc25 phosphatase, together with cdc2 mentioned before, are interesting. Recently, a specific region containing two elements called CDE (cell cycle-dependent element) and CHR (cell cycle homology region) in the cyclin A, cdc25C and cdc2 genes was demonstrated to be responsible for the cell cycle-dependent transcription of these genes (34). Interestingly, our search identified multiple Myb-binding sites adjacent to this region in these genes. Therefore, it is important to examine whether Myb is involved in the cell cycle-dependent transcriptional regulation of these genes through binding to them.

The level of c-Myb expression is high in immature hematopoietic cells and is downregulated during differentiation, indicating that c-Myb is important for maintaining the proliferative state of hematopoietic progenitor cells (35). The genes classified as proto-oncogenes and their related genes are important for supporting cellular proliferation. In particular, the c-kit gene is a good candidate for a Myb target gene, because it encodes the receptor for stem cell factor, which is necessary for the growth of immature hematopoietic cells. Among the genes encoding transcription factors, the ets-1, c-jun and NF-κB genes encode transcription factors that positively regulate cellular proliferation, implying that their expression could be activated by c-Myb. In contrast, IκB-α and Id2A are inhibitors of NF-κB and the helix-loop-helix-type factor, respectively. In addition, CP2 is an activator of the α-globin gene, a typical terminal differentiation marker. Therefore, these genes could be repressed by c-Myb.

The genes encoding negative regulators of cell growth, such as tumor suppressor genes, are also candidates for c-Myb targets whose expression is repressed by c-Myb. In addition to its role in maintaining the proliferative state of immature hematopoietic cells, c-Myb is also important for growth control of differentiated cells. For example, the level of c-myb mRNA is transiently increased after stimulation of resting mature T cells by IL-2 (36). A similar increase in the c-myb mRNA level is also found in serum-stimulated smooth muscle cells (5). In this respect, the genes encoding cytokines, their receptors, receptors involved in the immune response, and regulators of signal transduction are all candidates for Myb target genes.

DISCUSSION

We have examined the binding sites of c-Myb by using the binding free energy data derived from extensive binding analyses. The ΔΔG values used here define a consensus sequence recognized by c-Myb in a quantitative manner, so that they can be used to predict other potential binding sites for c-Myb. However, such a prediction is valid only if the ΔΔG values are additive upon base substitution. Although this condition holds in most cases for Cro and λ repressor (31,32), it has not been proven for c-Myb. The non-uniform ΔΔG values over the binding region usually indicate the presence of local contacts between amino acids and base pairs with different interaction strengths. A recent paper (37) compared such apparent strengths across several different molecular systems and experimental approaches, including sequence variability, and found strong correlations for the strongly interacting base-amino acid pairs. Thus, the individual base-amino acid strengths might be quite general across different molecular systems.

In order to test the predictive capacity of the method in the case of c-Myb, we first examined the binding sites for the known target genes and compared them with the available experimental data. The results showed that our protocol can identify many putative binding sites and potential target genes. On the other hand, there are several cases where binding sites were not predicted by the analysis. Interestingly, a close inspection of these binding sites shows that the second A and fourth C in the consensus sequence, yAaCkGNH, derived from the systematic binding analyses (14) are almost perfectly conserved, except for F4-L of c-myc, where the fourth C is instead a T. On the other hand, the sixth G is not conserved in several sites of c-myc, c-myb, PR264, cdc2 and c-erbB-2; there seems to be no specific base preference at this position for these sites. Also, the eighth position is occupied by G in a few cases.

Table 4. Selected list of potential Myb target genes and selected putative binding sites

Potential target genes | Accession no. | Putative binding sites

Cell cycle regulators
cyclin A | X68303 | 905*/914/928,5017*
cyclin B1 | U22364 | 49*,816/823/829/832
cyclin D1 | Z29078 | 640*/646*
cdc25 | Z29077 | 593*,860/898/917*
p21 (WAF1) | U24170 | 1658*

Proto-oncogenes and related genes
c-kit | S67773 | 139*,314/324*/347*/354*
sis | Y00326 | 4366*/4386*
c-K-ras | X07918 | 44/65/91
raf-1 | M38134 | 201*,377*/407
c-yes | S59913 | 158*
c-src | Z18365 | 29/62
lck | M36824 | 201/220/252
ret | D00617 | 36/73
bcr | X52828 | 127*,454/467
pim-1 | M34228 | 574/589/598*
Insulin-like growth factor-I | S85346 | 512*,789/792
Insulin-like growth factor I receptor | M69229 | 1197/1225*

Transcription factors
ets-1 | X63279 | 1284*/1290*
c-jun | X59744 | 53*/59,1181*
fra-1 | S66884 | 192/220/249
NF-[kappa]B1 | S57113 | 203*,693/715
NF-[kappa]B2 | X83768 | 39*,117/128
I[kappa]B-[alpha] | U08468 | 866*/876/879*
Id2A | L31815 | 317*/342/355/396
CP2 ([alpha]-globin transcription factor) | U01965 | 330*,420/479
CREB | S53722 | 1014*/1030
HoxB6 | U19111 | 237/250,1647*
Hox4B | X67079 | 259*,525/557*/583
GATA-1 | X59708 | 145*/163*
hGATA-3 | X73519 | 78*
Interferon regulatory factor 1 | L05078 | 520*,1183*
Interferon regulatory factor 2 | L24442 | 983*/1004*

Negative regulators of cell growth
P53 | J04238 | 60*/79
Wilms tumor (WT1) | X74840 | 411*,1126/1131*
Adenomatous polyposis coli (APC) | U02509 | 202/209*,594*
Fas | X82279 | 117*/166/185,785*
2-5A synthetase | X07179 | 805*/858*
gas-1 | Z22667 | 1415*,1473*/1477

Cytokines and their receptors
GM-CSF | L07488 | 598*/613
IL-2 | X67285 | 6632/6637*/6672*
IL-10 | Z30175 | 95*,963/968
RANTES pro-inflammatory cytokine | S64885 | 647*,842*/854*/882*
IL-1[beta] | U26540 | 703/751*,4038*
IL-1 receptor I | L09701 | 720/744*
ST2 (IL-1 receptor-like) | S74267 | 311/356
IL-2 receptor [beta] | X53093 | 706/717
IL-2 receptor [gamma] | D16358 | 209/259
IL-5 receptor [alpha] | U18373 | 316*/323/365/386
IL-8 receptor type A | U11870 | 207/278*
IL-8 receptor type B | U11866 | 332/353*/366
[beta]-interferon | M11286 | 25*

Receptors involved in immune response
T-cell receptor [alpha] | M30267 | 177*/196*
T-cell receptor [beta] | X59486 | 99/106/136
T-cell receptor [gamma] | S71037 | 164/174,480*
T-cell receptor [zeta] subunit | U14115 | 225/273,311*
LAG-3 (related to CD4) | X51984 | 353/376*,680*
Monocyte LPS receptor CD14 | U00699 | 418/445*

Regulators of signal transduction
Protein kinase C [gamma] | X62533 | 307/337
Cytosolic phospholipase A2 (cPLA2) | U08374 | 169*,1368/1374
Thromboxane A2 receptor | S66904 | 1043/1094
Inducible nitric oxide synthase | D29675 | 796/821/841*
Endothelin-A receptor | D11144 | 541/565*

The accession numbers and the numbering of the binding sites follow the GenBank entries. The number given for each binding site indicates the center of that site (the fifth position in Table 1). *Binding sites where the [Delta][Delta] G tot value is <1.0 kcal/mol (thus, stronger binding expected). For multiple binding sites, in which two or more sites lie within a 50-bp region, the numbers are separated by '/'; otherwise they are separated by a comma. Where more than one such site exists, we listed only the one single binding site and the one multiple binding site with the lowest [Delta][Delta] G tot.
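The notation used in Table 4 can be reproduced with a short helper. This is a minimal sketch under stated assumptions: the function name `format_sites` and its parameters are hypothetical, and neighboring sites are chained into one multiple-binding site whenever consecutive centers are no more than `window` bp apart, which is one possible reading of "within a 50-bp region".

```python
def format_sites(centers, strong=(), window=50):
    """Render site centers in the Table 4 notation.

    centers: site center positions (GenBank numbering).
    strong:  centers whose total free energy change is <1.0 kcal/mol ('*').
    window:  assumed maximum gap (bp) between consecutive centers in a cluster.
    """
    strong = set(strong)
    clusters = []
    for pos in sorted(centers):
        # Extend the current cluster if this site is close to the previous one.
        if clusters and pos - clusters[-1][-1] <= window:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])

    def label(pos):
        return f"{pos}*" if pos in strong else str(pos)

    # '/' joins sites inside a cluster; ',' separates clusters.
    return ",".join("/".join(label(p) for p in c) for c in clusters)
```

Applied to the cyclin A centers listed in Table 4, `format_sites([905, 914, 928, 5017], strong=[905, 5017])` returns the string `905*/914/928,5017*`.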

The structure of the Myb-DNA complex solved by NMR analysis (12) shows that Asn-183 and Lys-182 from R3 interact with the second A and fourth C, respectively, whereas the sixth G interacts with Lys-128 from R2. Thus, the interaction of R3 appears to be conserved throughout these binding sites. On the other hand, the interaction of R2 with these binding sites appears to differ from that with the other sites; R2 seems not to specify the base at the sixth position, so that this interaction may be absent in binding to these sites.

We cannot preclude the possibility that some of the above binding sites are not real targets for c-Myb, but the significant deviation from the consensus sequence for a number of sites indicates that the [Delta][Delta] G values may not be strictly additive. The breakdown of additivity may be attributed mainly to (i) cooperative interactions within a single binding site, or (ii) cooperative interactions among different binding sites. As an example of the former case, even a small conformational change in the DNA caused by one base substitution could alter the effect of substitutions at other positions in the same site. As for the latter, multiple Myb-binding sites are often present in promoters, and Myb molecules may interact with themselves or with other molecules in the transcriptional machinery. Therefore, we can naturally expect a certain level of cooperativity in the binding of c-Myb to DNA. In fact, it has been reported that an oligonucleotide sequence containing a duplicated Myb-binding motif showed a higher affinity than a sequence with only a single Myb-binding motif (10).

The base positions identified by the binding analysis as making specific interactions correlate well with the interactions observed in the Myb-DNA complex structure. We therefore do not anticipate a major breakdown in the additivity of the individual [Delta][Delta] G values arising from cooperativity within a single binding site. Rather, we suggest that c-Myb may use different modes of binding in recognizing the multiple binding sites of the target promoters. As shown earlier, there are multiple potential binding sites, some of which are closely located or spaced with a period of 10 bp. The proximity and/or phasing of binding sites may cause cooperative interactions of multiple Myb molecules via direct contact, DNA bending, or through other transcription factors. When c-Myb binds to multiple binding sites cooperatively, the binding mode of R2 could be different from that in the case of single-site binding.

The present protocol was also applied to screen for potential c-Myb-binding sites that have not been verified experimentally, and the screening identified some additional interesting sites that could be targets of c-Myb. It should be emphasized that certain classes of sequences mentioned above escape the present screening. Thus, the screened results represent a subset of the potential Myb-binding sites and are by no means a complete library. The screening protocol is straightforward given the [Delta][Delta] G values, and parameters such as the [Delta][Delta] G threshold and the number of binding sites within a specified range can be varied to control the screening. The binding-site library will of course grow with the rapidly increasing sequence database. The present method may provide useful information on the function of c-Myb as a transcriptional regulator for multiple genes.
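The core of such a screen can be sketched as a sliding-window sum over a matrix of single-substitution free energy changes. The code below is an illustration only, not the authors' implementation: the names `ddg_total` and `screen` are hypothetical, only one strand is scanned, hits are reported by their 1-based start position rather than by site center, and the three-position matrix `TOY_MATRIX` contains made-up placeholder numbers, not measured values.

```python
BASES = "ACGT"

# Placeholder values in kcal/mol, invented for illustration only.
# Each dict gives the assumed free energy change for each base at one position;
# the reference base at each position scores 0.0.
TOY_MATRIX = [
    {"A": 0.0, "C": 1.0, "G": 1.0, "T": 1.0},
    {"A": 0.0, "C": 2.0, "G": 2.0, "T": 2.0},
    {"A": 1.5, "C": 0.0, "G": 1.5, "T": 1.5},
]


def ddg_total(site, ddg_matrix):
    """Sum per-position values, assuming strict additivity."""
    return sum(ddg_matrix[i][base] for i, base in enumerate(site))


def screen(sequence, ddg_matrix, threshold):
    """Return (start, site, total) for every window scoring at or below threshold.

    `start` is the 1-based position of the first base of the window.
    """
    width = len(ddg_matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width].upper()
        if any(base not in BASES for base in site):
            continue  # skip windows containing ambiguous bases
        total = ddg_total(site, ddg_matrix)
        if total <= threshold:
            hits.append((start + 1, site, total))
    return hits
```

With the toy matrix, `screen("GGAACTT", TOY_MATRIX, 1.0)` keeps only the window `AAC` at position 3. Raising the threshold admits weaker candidates, which is how the threshold parameter mentioned above controls the size of the screened library.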

ACKNOWLEDGEMENTS

We thank Drs Kazuhiro Ogata and Robert L. Jernigan for helpful comments on the manuscript. This work was partly supported by a Grant-in-Aid for Scientific Research on Priority Areas from the Ministry of Education, Science and Culture of Japan.

REFERENCES

1 Moelling,K., Pfaff,E., Beug,H., Beimling,P., Bunte,T., Schaller,H.E. and Graf,T. (1985) Cell, 40, 983-990.

2 Klempnauer,K.-H. and Sippel,A.E. (1987) EMBO J., 6, 2719-2725.

3 Graf,T. (1992) Curr. Opin. Genet. Dev., 2, 249-255.

4 Lüscher,B. and Eisenman,R.N. (1990) Genes Dev., 4, 2235-2241.

5 Brown,K.E., Kindy,M.S. and Sonenshein,G.E. (1992) J. Biol. Chem., 267, 4625-4630.

6 Simons,M., Edelman,E.R., Dekeyyser,J.-L., Langer,R. and Rosenberg,R.D. (1992) Nature, 359, 67-70.

7 Sakura,H., Kanei-Ishii,C., Nagase,T., Nakagoshi,H., Gonda,T.J. and Ishii,S. (1989) Proc. Natl Acad. Sci. USA, 86, 5758-5762.

8 Frampton,J., Gibson,T.J., Ness,S.A., Doderlein,G. and Graf,T. (1991) Protein Engng, 4, 891-901.

9 Stober-Grasser,U., Brydolf,B., Biu,X., Grasser,F., Firtel,R.A. and Lipsick,J.S. (1992) Oncogene, 7, 589-596.

10 Howe,K.M., Reaks,C.F.L. and Watson,R.J. (1990) EMBO J., 9, 161-169.

11 Ogata,K., Hojo,H., Aimoto,S., Naki,T., Nakamura,H., Sarai,A., Ishii,S. and Nishimura,Y. (1992) Proc. Natl Acad. Sci. USA, 89, 6428-6432.

12 Ogata,K., Morikawa,S., Nakamura,H., Sekikawa,A., Inoue,T., Kanai,H., Sarai,A., Ishii,S. and Nishimura,Y. (1994) Cell, 79, 639-648.

13 Ogata,K., Morikawa,S., Nakamura,H., Hojo,H., Yoshimura,S., Zhang,R., Aimoto,S., Ametani,Y., Hirata,Z., Sarai,A., Ishii,S. and Nishimura,Y. (1995) Nature: Struct. Biol., 2, 309-320.

14 Tanikawa,J., Yasukawa,T., Enari,M., Ogata,K., Nishimura,Y., Ishii,S. and Sarai,A. (1993) Proc. Natl Acad. Sci. USA, 90, 9320-9324.

15 Ness,S.A., Marknell,A. and Graf,T. (1989) Cell, 59, 1115-1125.

16 Ku,D.-H., Wen,S.-C., Engelhard,A., Nicolaides,N.C., Lipson,K.E., Marino,T.A. and Calabretta,B. (1993) J. Biol. Chem., 268, 2255-2259.

17 Sureau,A., Soret,J., Vellard,M., Crochet,J. and Perbal,B. (1992) Proc. Natl Acad. Sci. USA, 89, 11683-11687.

18 Zobel,A., Kalkbrenner,F., Guehmann,S., Nawrath,M., Vorbruneggen,G. and Moelling,K. (1991) Oncogene, 6, 1397-1407.

19 Nakagoshi,H., Kanei-Ishii,C., Sawazaki,T., Mizuguchi,G. and Ishii,S. (1992) Oncogene, 7, 1233-1240.

20 Nicolaides,N.C., Gualdi,R., Casadevall,C., Manzella,L. and Calabretta,B. (1991) Mol. Cell. Biol., 11, 6166-6176.

21 Dasgupta,P., Saikumar,P., Reddy,C.D. and Reddy,E.P. (1990) Proc. Natl Acad. Sci. USA, 87, 8090-8094.

22 Dasgupta,P., Reddy,C.D., Saikumar,P. and Reddy,E.P. (1992) J. Virol., 66, 270-276.

23 Siu,G., Wurster,A.L., Lipsick,J.S. and Hedrick,S.M. (1992) Mol. Cell. Biol., 12, 1592-1604.

24 Melotti,P., Ku,D.-H. and Calabretta,B. (1994) J. Exp. Med., 179, 1023-1028.

25 Hernandez-Munain,C. and Krangel,M.S. (1994) Mol. Cell. Biol., 14, 473-483.

26 Mizuguchi,G., Kanei-Ishii,C., Takahashi,T., Yasukawa,T., Nagase,T., Horikoshi,M., Yamamoto,T. and Ishii,S. (1995) J. Biol. Chem., 270, 9384-9389.

27 Biedenkapp,H., Borgmeyer,U., Sippel,A.E. and Klempnauer,K.-H. (1988) Nature, 335, 835-837.

28 Howe,K.M. and Watson,R.J. (1991) Nucleic Acids Res., 19, 3913-3919.

29 Weston,K. (1992) Nucleic Acids Res., 20, 3043-3049.

30 Nakagoshi,H.N., Nagase,T., Kanei-Ishii,C., Ueno,Y. and Ishii,S. (1990) J. Biol. Chem., 265, 3479-3483.

31 Takeda,Y., Sarai,A. and Rivera,V.M. (1989) Proc. Natl Acad. Sci. USA, 86, 439-443.

32 Sarai,A. and Takeda,Y. (1989) Proc. Natl Acad. Sci. USA, 86, 6513-6517.

33 Gewirtz,A.M., Anfossi,G., Venturelli,D., Valpreda,S., Simn,R. and Calabretta,B. (1989) Science, 245, 180-183.

34 Zwicker,J., Lucibello,F., Wolfraim,L.A., Gross,C., Truss,M., Engeland,K. and Müller,R. (1995) EMBO J., 14, 4514-4522.

35 Gonda,T.J. and Metcalf,D. (1984) Nature, 310, 249-251.

36 Stern,J.B. and Smith,K.A. (1986) Science, 233, 203-206.

37 Lustig,B. and Jernigan,R.L. (1995) Nucleic Acids Res., 23, 4707-4711.



* To whom correspondence should be addressed