ABSTRACT
The human [zeta]-globin promoter contains a strong positive regulatory element in the 5' flanking region, designated the [zeta]-globin upstream regulatory element (URE). In this study, we define the minimal sequences required for URE function and characterize the associated protein-DNA interactions. Deletion experiments show that the URE spans a 60 bp region located between 220 and 279 bp 5' to the transcription start site. Further subdivision of this region shows that multiple cis-acting sequences are present. Electrophoretic mobility shift assays demonstrate that the erythroid transcription factor GATA-1 binds a site at -230, and that Sp1 and an unidentified factor bind a CCACC site at -240. The unidentified CCACC factor is distinct from two other CCACC factors, EKLF and BKLF/TEF-2. A third complex contains a novel DNA-binding activity, designated URE binding factor (URE-BF), that interacts with a site in the -269 to -255 region. This factor is present in K562 cells, which express [zeta]-globin, but is absent in OCIM1 cells, a human erythroid cell line that does not express [zeta]-globin. URE-BF appears to interact with a GATA factor, since formation of the URE-BF complex can be prevented by the presence of unlabeled oligonucleotides containing GATA sites. Finally, increasing the distance from the -230 GATA site to the two upstream sites causes a progressive decrease in [zeta]-globin promoter activity. There is no indication of a requirement for GATA-1 to be on the same side of the DNA helix as the other upstream factors. These results show that [zeta]-globin promoter function is highly dependent on a 60 bp region to which at least three different factors bind. Two of these factors may represent DNA-binding proteins not previously identified as important for regulation of globin gene expression. It is likely that these factors interact physically to create a functional regulatory unit.
INTRODUCTION
During human embryonic development, hematopoiesis takes place in two distinct compartments. During the first 6 weeks of gestation, hematopoiesis occurs in the yolk sac blood islands. After 6 weeks of development, the main site of hematopoiesis is the liver (1 ). These two stages of hematopoiesis are referred to as primitive and definitive, respectively. Primitive and definitive erythrocytes differ both morphologically and in the sets of globin genes expressed. Primitive cells are nucleated and contain hemoglobin composed of [zeta] and [epsilon] globin, the embryonic forms of [alpha] and [beta] globin, respectively. In contrast, definitive cells are enucleated and contain fetal hemoglobin, which is composed of [alpha] and [gamma] globin (2 ). Recent evidence suggests that separate sets of progenitor cells give rise to primitive and definitive hematopoietic cells (3 ).
To gain a better understanding of the mechanisms regulating gene expression during primitive hematopoiesis, we have been studying transcriptional regulation of the human [zeta]-globin gene. Two important regulatory regions have been identified in the 5' flanking region of the [zeta]-globin gene. First, experiments in transgenic mice have shown that as little as 85 bp of the proximal [zeta]-globin promoter are sufficient for embryonic-specific expression (4 ,5 ), although elements within the gene and in the 3' flanking region have also been shown to affect developmental specificity (6 ). Second, a positive regulatory element located between 207 and 417 bp 5' to the transcription start site is required for high-level [zeta]-globin promoter activity in transiently transfected K562 cells (7 ) as well as in stably transfected cells (unpublished results). This upstream regulatory element (URE) does not appear to regulate [zeta]-globin expression during development, since it can be deleted without disrupting embryonic-specific expression in transgenic mice (4 ,5 ).
In a previous study, we showed that a GATA site at -230 was only partially responsible for URE activity, since deletion of the URE decreased promoter activity by 90%, while mutation of the -230 GATA site decreased promoter activity by at most 50% (7 ). In addition to the GATA site, a CCACC site was identified at -240, but mutation of the CCACC site resulted in no decrease in promoter activity. Finally, a promoter with both the CCACC and GATA mutations had no less activity than a promoter with the GATA mutation alone. We thus concluded that some other element in addition to the GATA site was necessary for full URE activity.
In this paper, we localize the [zeta]-globin URE to the region between -279 and -220. This region contains the GATA and CCACC sites as well as a third cis element that is localized to the -279 to -240 region. Using electrophoretic mobility shift assays and antibodies against specific transcription factors, we show that the only proteins present in erythroid cell line nuclear extracts that bind this region are GATA-1, a CCACC-binding activity other than Sp1, a small amount of Sp1, and an activity that binds between -269 and -255 (designated URE-BF). URE-BF is present in K562 cells, which express [zeta]-globin, but is absent in OCIM1 cells, which do not. We also show that there is a decrease in [zeta]-globin promoter activity as the distance between the two upstream binding sites and the GATA element is increased. These results indicate that activity of the URE is dependent on a combination of transcription factors interacting with appropriately spaced cis sequences. In addition, since URE-BF is present in cells that express [zeta]-globin but is absent in cells that do not, URE-BF may be partially responsible for developmental specificity of [zeta]-globin expression.
MATERIALS AND METHODS
K562 cells and OCIM1 cells were maintained as described previously (7 ).
To facilitate the construction of [zeta]-globin promoter mutants, a modified [zeta]-luciferase construct designated p[zeta]*Luc was made that had unique restriction sites placed at convenient intervals throughout the 5' flanking region: a HindIII site at -557, a BssHII site at -420, a BamHI site at -240, a PstI site at -85 and a BglII site at +38. Deletion constructs were made by substituting various segments of the [zeta]-globin 5' flanking region into this plasmid. The HS-40 enhancer was blunt-end ligated into the unique HindIII site. For the URE insertion experiment, a version of pHS-40[zeta]*Luc, designated pHS-40[zeta]**Luc, was made that had a unique XhoI site at -260 rather than the unique BamHI site at -240. Five and 10 bp insertions that introduced a new unique NcoI site were inserted between bases -235 and -236. The NcoI site was used to insert single or multiple 10 bp oligonucleotides to create the desired insertions. The DNA sequences of the inserted segments are shown in Figure 1 A. Details of plasmid construction can be found at http://www.labmed.washington.edu/Faculty/plasmids.html
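As an illustrative aside, the spacing of the unique sites in p[zeta]*Luc can be tabulated with a short Python sketch. The site names and coordinates are taken from the paragraph above; the helper names (`SITES`, `distance_bp`, `cassette_sizes`) are hypothetical. The sketch assumes that each coordinate marks a single reference base of the site, which the text does not specify, and that numbering follows the usual convention in which there is no position 0 between -1 and +1.

```python
# Unique restriction sites in the modified construct, as listed in the text.
# Coordinates are relative to the transcription start site (+1).
SITES = {
    "HindIII": -557,
    "BssHII": -420,
    "BamHI": -240,
    "PstI": -85,
    "BglII": 38,
}


def distance_bp(upstream: int, downstream: int) -> int:
    """Distance in bp between two coordinates, skipping the nonexistent
    position 0 when the interval crosses the transcription start site."""
    if upstream == 0 or downstream == 0:
        raise ValueError("position 0 does not exist in this numbering scheme")
    span = downstream - upstream
    if upstream < 0 < downstream:
        span -= 1  # -1 is immediately followed by +1
    return span


def cassette_sizes(sites: dict) -> list:
    """Return (5' site, 3' site, distance) for each pair of adjacent sites."""
    ordered = sorted(sites.items(), key=lambda item: item[1])
    return [
        (name_a, name_b, distance_bp(pos_a, pos_b))
        for (name_a, pos_a), (name_b, pos_b) in zip(ordered, ordered[1:])
    ]


if __name__ == "__main__":
    for five_prime, three_prime, size in cassette_sizes(SITES):
        print(f"{five_prime} to {three_prime}: {size} bp")
```

Under these assumptions the exchangeable segments are 137, 180, 155 and 122 bp long, from the HindIII site to the BglII site.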
Transient transfection of K562 cells was performed as described previously (7 ), using an ECM 600 electroporator (BTX, San Diego, CA) with settings of 300 V, 1150 [mu]F, and 720 [Omega]. Protein, [beta]-galactosidase, and luciferase assays were performed as described previously (7 ), except that cells were lysed in Reporter Lysis Buffer (Promega, Madison, WI) according to the manufacturer's instructions. For the luciferase assay, 20 [mu]l of cell extract was placed in a cuvette, a luminometer (Analytical Luminescence Laboratory, Ann Arbor, MI) injected 100 [mu]l of luciferase assay buffer (20 mM tricine, 1.07 mM [MgCO3]4·Mg[OH]2·5H2O, 2.67 mM MgSO4, 0.1 mM EDTA, 33.3 mM DTT, 270 [mu]M coenzyme A [Sigma Chemical Co., St. Louis, MO], 470 [mu]M luciferin [Analytical Luminescence], 530 [mu]M ATP [Sigma], pH 7.8), and light output was measured for 30 s.
Table 1
Nuclear extracts were prepared and electrophoretic mobility shift assays (EMSA) were performed as described previously (7 ). Synthetic oligonucleotides were purchased from Universal DNA (Tigard, OR) or Life Technologies (Gaithersburg, MD). Double-stranded oligonucleotides were generated by annealing complementary single-stranded oligonucleotides or by annealing short primers to longer oligonucleotides and synthesizing the second strand with Klenow DNA polymerase as described (7 ). Radiolabeled probes were generated by internal labeling with Klenow DNA polymerase or by end-labeling double-stranded oligonucleotides with T4 polynucleotide kinase as described (7 ). Oligonucleotide sequences and the methods used to generate double-stranded probes and competitors are listed in Table 1 . For supershift experiments, nuclear extracts and probes were incubated for 30 min at room temperature. Antibodies were then added and the binding reactions were carried out overnight at 4°C. Antibodies against GATA-1, GATA-2, and Sp1 were obtained from Santa Cruz Biotechnology (Santa Cruz, CA). Recombinant GST-EKLF (9 ) was a gift of Dr James Bieker (Mt. Sinai School of Medicine, New York, NY). Anti-BKLF antiserum, anti-EKLF antiserum and preimmune sera (10 ) were gifts of Dr Merlin Crossley (University of Sydney, Sydney, NSW, Australia).
RESULTS
In a previous study (7 ), we showed that deletion of the upstream regulatory element (URE), located -417 to -207 relative to the [zeta]-globin transcription start site, resulted in >90% loss in promoter activity. That study also showed that, in addition to the GATA site at -230, other sequences were responsible for activity of the URE. Therefore, we made [zeta]-globin promoter/luciferase constructs in which various segments of the -417 to -207 region were deleted to determine where other positive cis-acting elements might be located. The various deletion constructs are illustrated in Figure 1 . A 1.4 kb DNA segment containing the HS-40 enhancer (11 ) was included in all constructs since we showed previously that the presence of this enhancer increases the signal-to-noise ratio of the experiments without altering the relative activity of the various linked promoters (7 ). K562 cells were cotransfected with the various luciferase constructs along with an SV40-[beta]-galactosidase construct; luciferase activity was normalized to [beta]-galactosidase activity to correct for transfection efficiency. Results are given as a percentage of wild-type [zeta]-globin promoter activity.
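The normalization described above can be written out as a minimal Python sketch. The function names and the numbers in the usage example are hypothetical and are not measurements from this study; the sketch only assumes that each construct's luciferase signal is divided by the [beta]-galactosidase signal from the same transfection and then expressed relative to the wild-type promoter.

```python
def normalized_activity(luciferase: float, beta_gal: float) -> float:
    """Luciferase signal corrected for transfection efficiency."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return luciferase / beta_gal


def percent_of_wild_type(construct: float, wild_type: float) -> float:
    """Normalized activity of a construct as a percentage of wild type."""
    if wild_type <= 0:
        raise ValueError("wild-type activity must be positive")
    return 100.0 * construct / wild_type


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    wild_type = normalized_activity(luciferase=500.0, beta_gal=2.0)
    mutant = normalized_activity(luciferase=50.0, beta_gal=2.0)
    print(percent_of_wild_type(mutant, wild_type))
```

Dividing by the cotransfected reporter before comparing constructs means that a difference in transfection efficiency between samples is not mistaken for a difference in promoter strength.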
As shown in Figure 1 A, deletion of the entire URE (-417 to -207) resulted in a 93% decrease in [zeta]-globin promoter activity compared with the wild-type promoter. A deletion from -420 to -240, which eliminated the -240 CCACC site but not the -230 GATA site, caused a 69% decrease in promoter activity. A smaller deletion spanning -420 to -279 had no effect on promoter activity, while deletion from -279 to -240 caused a 30% decrease in promoter activity. No loss in promoter activity was seen with deletions from -420 to -336, or from -336 to -279 (not shown). Finally, a deletion from -240 to -207, which eliminated both the CCACC and GATA sites, caused an 82% decrease in promoter activity. These initial experiments indicate that the URE has functional elements located between -279 and -240 and between -240 and -207.
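The reasoning behind this localization can be sketched in Python. The deletion endpoints and percentage decreases are those reported above; the function names are hypothetical, and treating each deletion as a half-open interval (so that adjacent deletions sharing an endpoint do not overlap) is an assumption made only for this illustration.

```python
# Percent decrease in promoter activity for each deletion, as reported in the text.
DELETIONS = {
    (-417, -207): 93,
    (-420, -240): 69,
    (-420, -279): 0,
    (-420, -336): 0,
    (-336, -279): 0,
    (-279, -240): 30,
    (-240, -207): 82,
}


def positions(interval: tuple) -> set:
    """Positions removed by a deletion, treated as half-open [start, end)."""
    start, end = interval
    return set(range(start, end))


def candidate_region(deletions: dict) -> tuple:
    """Outer bounds of the positions that are removed by at least one
    deletion that lowers activity and by no deletion that is neutral."""
    affected, neutral = set(), set()
    for interval, decrease in deletions.items():
        target = affected if decrease > 0 else neutral
        target.update(positions(interval))
    remaining = affected - neutral
    return min(remaining), max(remaining) + 1


if __name__ == "__main__":
    print(candidate_region(DELETIONS))
```

With the values above, the sketch returns the bounds -279 and -207, matching the interval named in the text. It does not weigh the size of each decrease, so it indicates only where functional elements may lie, not how much each contributes.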
The most likely mechanism for URE activity is the binding of specific transcription factors to sites within the -279 to -240 and -240 to -220 regions. To test this hypothesis, electrophoretic mobility shift assays were performed using probes spanning -279 to -220. We first examined the -255 to -220 segment using probes containing wild-type or mutant GATA and CCACC sites. Four DNA complexes, designated A-D, were formed using the wild-type -255 to -220 probe and K562 nuclear extracts (Fig. 2 A and B). The specificity of these complexes was confirmed by the ability of a nonradioactive wild-type oligonucleotide to compete for binding (Fig. 2 A, lanes 3 and 4). A nonspecific binding activity comigrating with complex B was seen as well (asterisks). When a competitor oligonucleotide was used that contained a mutation in the -230 GATA site, complexes A and C, but not B and D, were competed off (Fig. 2 A, lanes 5 and 6). Conversely, when a competitor was used that contained a mutation in the -240 CCACC site, complexes B and D, but not complexes A and C, were competed off (Fig. 2 B, lanes 3 and 4). Complex D appeared to be competed off more effectively than complex B. A competitor with mutations in both the -240 CCACC and -230 GATA sites did not compete for binding to any of the complexes (Fig. 2 B, lanes 5 and 6). When the oligonucleotide containing both CCACC and GATA mutations was used as a probe, none of the complexes A-D was formed (not shown). When a competitor was used that contained the -105 GATA site from the proximal [zeta]-globin promoter, complexes B and D, but not A and C were competed off (Fig. 2 A, lanes 7 and 8). We have previously shown that the -105 GATA site has strong GATA-binding activity (7 ). When an oligonucleotide containing wild-type sequences from -241 to -220, including the GATA but not CCACC site, was used as a competitor, complexes B and D but not A and C were again competed off (Fig. 2 A, lanes 9 and 10). 
No competition was observed when the -241 to -220 oligonucleotide containing a GATA site mutation was used as a competitor (Fig. 2 A, lanes 11 and 12). In a parallel experiment, a competitor oligonucleotide containing wild-type sequences from -255 to -234, including the CCACC but not the GATA site, prevented formation of complexes A and C, but not B and D (Fig. 2 B, lanes 9 and 10). Again, a -255 to -234 oligonucleotide with a CCACC mutation did not prevent formation of any of the complexes (Fig. 2 B, lanes 11 and 12). A competitor oligonucleotide containing a consensus Sp1 binding site was able to partially compete for formation of complex C; it was better able to compete for formation of complex A (Fig. 2 B, lanes 7 and 8).
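The logic of these competition experiments can be summarized in a short Python sketch. Each competitor is represented by the set of intact sites it carries, paired with the complexes it competed off as reported above; the Sp1 consensus competitor is omitted because its competition was only partial. The names `COMPETITIONS` and `assign_sites` are hypothetical, and the sketch assumes that a complex depends on a site if it is competed exactly when the competitor carries that site intact.

```python
# (intact sites in the competitor, complexes competed off), from the text.
COMPETITIONS = [
    ({"CCACC", "GATA"}, {"A", "B", "C", "D"}),  # wild-type -255 to -220
    ({"CCACC"}, {"A", "C"}),                    # -230 GATA site mutated
    ({"GATA"}, {"B", "D"}),                     # -240 CCACC site mutated
    (set(), set()),                             # both sites mutated
    ({"GATA"}, {"B", "D"}),                     # -105 GATA site
    ({"GATA"}, {"B", "D"}),                     # wild-type -241 to -220
    (set(), set()),                             # -241 to -220, GATA mutated
    ({"CCACC"}, {"A", "C"}),                    # wild-type -255 to -234
    (set(), set()),                             # -255 to -234, CCACC mutated
]


def assign_sites(competitions, complexes=("A", "B", "C", "D"),
                 sites=("CCACC", "GATA")) -> dict:
    """Assign each complex to the site whose presence in the competitor
    matches the competition pattern in every experiment."""
    assignment = {}
    for complex_name in complexes:
        for site in sites:
            consistent = all(
                (site in intact) == (complex_name in competed)
                for intact, competed in competitions
            )
            if consistent:
                assignment[complex_name] = site
    return assignment


if __name__ == "__main__":
    print(assign_sites(COMPETITIONS))
```

Under this assumption, complexes A and C are assigned to the CCACC site and complexes B and D to the GATA site.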
Since three distinct protein binding sites exist within the 60 bp span of the URE, we considered the possibility that spacing of protein binding sites within the URE was important for protein-protein interactions and URE activity. Furthermore, the [zeta]-globin promoter contains two GATA sites, one at -230 and a second at -105, both of which are separated from a neighboring CCACC site by 10 bp (one turn of the DNA double helix). This suggests that GATA and CCACC factors may need to bind the same side of the DNA helix. A similar juxtaposition of GATA and CCACC sites exists in the porphobilinogen deaminase promoter (18 ), although no other globin genes have regulatory elements with GATA and CCACC sites separated by 10 bp.
To test the hypothesis that the spacing of protein binding sites in the URE is important for URE activity, a series of constructs was made in which the GATA site was separated from the CCACC and URE-BF sites by insertions that increased in 5 bp increments, up to 50 bp. Each increment represents one-half turn of the DNA helix. The sequences of the inserted segments are shown in Figure 5 A. These constructs were transiently transfected into K562 cells, and the results are shown in Figure 5 B.
Insertion of 5 and 10 bp caused modest decreases in [zeta]-globin promoter activity (28 and 18%, respectively), although the differences in activity from the wild-type promoter were probably not significant. However, with insertion of 15 bp, promoter activity was decreased by 70%. No further decrease in promoter activity was obtained with insertion of up to 35 bp. However, insertion of 40 or 50 bp caused a 90% reduction in promoter activity, comparable with that seen with deletion of the URE. There was no greater effect on [zeta]-globin promoter activity when the GATA site was displaced from the upstream sites by insertions causing half-turns of the DNA helix (10n + 5 bp, with n = 0, 1, 2, 3) than by insertions causing full turns (10n bp).
DISCUSSION
The ancestral globin genes of the [alpha]- and [beta]-globin gene clusters diverged 450 million years ago (19 ), before evolution of the globin switching phenomenon. Since this divergence, both gene clusters have acquired genes that are expressed specifically during primitive erythropoiesis: [zeta] in the [alpha]-globin cluster and [epsilon] in the [beta]-globin cluster. While the two human embryonic globin genes are expressed at the same time during development, they have evolved distinct mechanisms for this pattern of expression. The [epsilon]-globin gene has been shown to have a developmental silencer (20 ,21 ); no such negative regulatory element has been demonstrated for the [zeta]-globin gene. The [zeta]-globin gene has an upstream element, the URE, that is necessary for high-level promoter activity (7 ); the [epsilon]-globin gene has a number of positive regulatory elements in its 5' flanking region, but none shares sequence similarity with the [zeta]-globin URE (22 ). In this paper, we have extended our previous study by defining in detail the location and protein-DNA interactions of the [zeta]-globin URE.
Using a series of constructs containing progressively smaller deletions, we were able to localize the URE between -279 and -220 relative to the [zeta]-globin transcription start site. Some additional loss in [zeta]-globin promoter activity was observed when the deletion was extended by 13 bp to -207. However, deletion of these 13 bp by themselves had no effect on promoter activity, and no DNA-protein interactions were detected in this region (data not shown). An explanation for the lower activity of the -279 to -207 deletion compared with the -279 to -220 deletion is not apparent at this time.
The URE sequence contains two sites that have been shown to be important for globin gene expression: a GATA site at -230 and a CCACC site at -240, both on the antisense strand. In this paper, we have demonstrated specific protein-DNA interactions occurring at both sites. In our previous study we demonstrated a GATA-binding activity at -230 and only a very weak CCACC-binding activity at -240 using 20 bp oligonucleotide probes (7 ). When a 35 bp oligonucleotide probe was used that contained both the CCACC and GATA sites, the CCACC-binding activity became more apparent. This result might suggest that the CCACC factor requires physical interaction with the GATA factor for binding. However, mutation of the GATA site does not reduce binding of the CCACC factor to the 35 bp probe. The major GATA-binding activity is GATA-1, since antibodies against GATA-1 inhibit formation of the GATA complex, while antibodies against GATA-2 have no effect. Other GATA factors have not been shown to be important in erythroid cells (23 ). Sp1 is a minor component of the CCACC-binding activity; an additional CCACC-binding protein makes up the major part of the CCACC-binding activity. Antibodies to EKLF and BKLF/TEF-2 had no effect on formation of the CCACC complex, and recombinant EKLF was unable to bind the URE CCACC site. Thus, the [zeta]-globin URE CCACC activity may represent an as yet undescribed CCACC-binding factor.
While a specific CCACC-binding factor is present in K562 nuclear extracts, the physiological significance of this interaction is not clear. We showed previously that mutation of the GATA site decreased URE activity by 50%, while mutation of the CCACC site had no negative effect on URE activity (7 ). In the current study, we have shown that these mutations abolish in vitro binding of the respective factors to their cognate sites. Thus, loss of CCACC factor binding does not measurably reduce URE activity in transient transfection assays. One might speculate that the CCACC factor serves to stabilize the interaction of GATA-1 and URE-BF with the URE, and that this stabilization is not critical for URE activity in a transient transfection system. We are currently investigating the importance of the CCACC site in a stable transfection system. Interestingly, whereas the GATA site is conserved between human (24 ), mouse (25 ), goat (26 ), horse (27 ), and rabbit (28 ) [zeta]-globin 5' flanking regions, the CCACC site is unique to the human [zeta]-globin gene. This suggests it may have evolved recently and may be dispensable for [zeta]-globin gene function.
We have observed that the two GATA sites in the human [zeta]-globin promoter (at -230 and -105) are each separated from CCACC sites by exactly 10 bp, representing one turn of the DNA helix. Separation of the -230 GATA site from the -240 CCACC site results in decreased promoter activity. However, it is not clear whether the decrease in promoter activity is due to separation of the GATA site from the CCACC site or separation from other sites 5' to the CCACC site, such as the URE-BF site. A similar result was obtained when the GATA site in the porphobilinogen deaminase promoter was separated from its neighboring CCACC site (18 ). In neither case is there evidence that the GATA and CCACC factors need to be on the same side of the DNA helix. 
A requirement for the GATA and CCACC factors to be `in phase' would be inferred if insertions that caused half-turn rotations of the helix (10n + 5 bp, where n = 0, 1, 2, etc.) caused marked decreases in promoter activity, but insertions that caused full-turn rotations of the helix (10n) restored promoter activity. In contrast, such `phasing' is required in a regulatory element upstream of the rat tryptophan oxidase gene, between a glucocorticoid receptor binding site and a CCACC site (29 ).
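The phasing criterion stated above can be made explicit with a short Python sketch. The percentage decreases are the values reported in Results for the 5, 10, 15, 40 and 50 bp insertions; intermediate insertions are omitted because exact values are not given in the text. The function names, the use of 10 bp per helical turn (the approximation used in this paper), and the 50% cutoff for a "marked" decrease are assumptions made only for illustration.

```python
BP_PER_TURN = 10  # approximation used in the text

# Percent decrease in promoter activity by insertion length (bp), from Results.
DECREASE = {5: 28, 10: 18, 15: 70, 40: 90, 50: 90}


def rotation_degrees(insert_bp: int, bp_per_turn: int = BP_PER_TURN) -> float:
    """Net rotation between flanking sites introduced by an insertion."""
    return (insert_bp % bp_per_turn) * 360 / bp_per_turn


def phasing_supported(decrease_by_insert: dict, marked: float = 50) -> bool:
    """True only if every half-turn insertion causes a marked decrease
    and every full-turn insertion does not."""
    half = [d for bp, d in decrease_by_insert.items()
            if rotation_degrees(bp) == 180]
    full = [d for bp, d in decrease_by_insert.items()
            if rotation_degrees(bp) == 0]
    if not half or not full:
        raise ValueError("need both half-turn and full-turn insertions")
    return all(d >= marked for d in half) and all(d < marked for d in full)


if __name__ == "__main__":
    for bp, decrease in sorted(DECREASE.items()):
        print(bp, rotation_degrees(bp), decrease)
    print(phasing_supported(DECREASE))
```

With these values the criterion is not met: the 5 bp half-turn insertion causes only a modest decrease, and the 40 and 50 bp full-turn insertions do not restore activity. The loss of activity therefore tracks distance rather than helical phase.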
Because we were unable to account for full URE activity based on only the CCACC and GATA sites, we searched the URE region for other possible cis-acting sequences. The deletion experiments showed that sequences between -279 and -240 contribute to URE function. Consequently, we were able to demonstrate a factor present in K562 nuclear extracts that specifically interacts with this region, most likely between -269 and -255. We have designated this activity URE-BF. The finding that oligonucleotides containing GATA sites, but not oligonucleotides with mutant GATA sites, interfere with the formation of the URE-BF complex might suggest that URE-BF is simply GATA-1. However, several lines of evidence argue against this conclusion. First, the -269 to -255 region contains no GATA site, so it is unlikely that a GATA factor is interacting directly with this site. Second, we have observed that URE-BF is present in K562 cells, but absent in OCIM1 cells; GATA-1 is present in both cell lines. Finally, antibodies against GATA-1 and GATA-2 have no effect on URE-BF binding. We considered the possibility that URE-BF is an altered form of one of the GATA factors that lacks the epitope(s) recognized by the antibodies. However, this is unlikely because the factor is absent in OCIM1 cells.
We favor the alternative hypothesis that URE-BF interacts directly with GATA-1 or another GATA factor. In addition to the oligonucleotide competition data, the finding that changing the spacing between the GATA and URE-BF sites decreases [zeta]-globin promoter activity suggests a possible physical interaction. GATA-1 has been shown to physically interact with a number of zinc finger-containing transcription factors including itself (30 ), the estrogen receptor (31 ), Sp1 and EKLF (32 ). Thus it is likely that GATA-1 interacts with other as yet uncharacterized transcription factors. One model for a putative GATA-1-URE-BF interaction is that GATA-1 is required for URE-BF to form a complex with its binding site. When an oligonucleotide containing a GATA site is present, GATA-1 would have higher affinity for its DNA binding site than for URE-BF, GATA-1 and URE-BF would dissociate, and URE-BF would no longer bind its site. An alternative model would be one in which GATA-1 in a complex with DNA has high affinity for URE-BF, and that a GATA-1-DNA-URE-BF complex is formed preferentially over a URE-BF-DNA complex. These models will be easier to test when additional biochemical characterization of URE-BF is undertaken.
One final point to consider regarding URE-BF concerns its presence in K562 cells and absence in OCIM1 cells. Since the former cells express [zeta]-globin, and the latter cells do not (7 ), URE-BF could be the factor that is responsible for the differential expression of [zeta]-globin in these two cell lines. It is tempting to speculate that this factor could be partly responsible for embryonic-specific expression of [zeta]-globin. URE-BF cannot be fully responsible for developmental-specific expression, since promoters with 128 or 85 bp of proximal promoter are sufficient for embryonic-specific activity of the [zeta]-globin promoter in transgenic mice (4 ,5 ), and since other developmental regulatory elements are present within the gene and in its 3' flanking region (6 ). However, URE-BF could synergize with some of these elements to contribute to developmental specificity. Again, additional characterization of URE-BF will be necessary to test this hypothesis.
ACKNOWLEDGEMENTS
We thank Dr James Bieker for generously providing recombinant EKLF and Dr Merlin Crossley for providing antisera. Thanks to Drs Mark Groudine and Ken Peterson for their careful reading of the manuscript. Thanks also to Drs John Tait and Brad Cookson for helpful suggestions. This work was supported by NIH grant DK 48800 to DES.
*To whom correspondence should be addressed: Tel: +1 206 548 6833; Fax: +1 206 548 6189; Email: dsabath@u.washington.edu
REFERENCES