DDBJ/EMBL/GenBank accession nos U50908-U50912
ABSTRACT
The XIST gene in both humans and mice is expressed exclusively from the inactive X chromosome and is required for X chromosome inactivation to occur early in development. In order to understand transcriptional regulation of the XIST gene, we have identified and characterized the human XIST promoter and two repeated DNA elements that modulate promoter activity. As determined by reporter gene constructs, the XIST minimal promoter is constitutively active at high levels in human male and female cell lines and in transgenic mice. We demonstrate that this promoter activity is dependent in vitro upon binding of the common transcription factors SP1, YY1 and TBP. We further identify two cis-acting repeated DNA sequences that influence reporter gene activity. First, DNA fragments containing a set of highly conserved repeats located within the 5'-end of XIST stimulate reporter activity 3-fold in transiently transfected cell lines. Second, a 450 bp alternating purine-pyrimidine repeat located 25 kb upstream of the XIST promoter partially suppresses promoter activity by ~70% in transient transfection assays. These results indicate that the XIST promoter is constitutively active and that critical steps in the X inactivation process must involve silencing of XIST on the active X chromosome by factors that interact with and/or recognize sequences located outside the minimal promoter.
X chromosome inactivation results in random transcriptional inactivation of one of the two X chromosomes present in normal, female mammalian cells. This process allows mammals to achieve dosage equivalence of most X-linked genes between females, who normally have two X chromosomes, and males, who normally have one X chromosome. The major genetic locus proposed to control the X chromosome inactivation process is the X inactivation center (XIC). XIC is defined as a region of the X chromosome from which a currently ill-defined inactivation signal exerts its effect in cis along the chromosome; derivative X chromosomes lacking this XIC are unable to become inactivated (1 -5 ). Human XIC has been localized to a <1 Mb region within band Xq13.2, while murine Xic maps to the homologous location on the murine X chromosome (6 -8 ). A second genetic locus known to influence the X inactivation process in mice is the X chromosome controlling element (Xce; 9 ). Different alleles at the Xce locus influence the degree of randomness of the X inactivation process (10 ,11 ). Localization of the Xce locus to within the Xic region in mice has led to the speculation that Xce and Xic are synonymous loci (12 ,13 ).
The XIST gene, whose product is a non-coding nuclear RNA, has been implicated strongly in the process of X chromosome inactivation due to its map location within XIC and its unique inactive X-specific transcription pattern (14 ). Expression of the XIST gene is tightly correlated with the presence of an inactive X chromosome and XIST transcripts are found closely associated with the inactive X chromosome in interphase nuclei (15 ,16 ). Transcripts from the Xist gene in mouse are found at high levels a full day before X inactivation is believed to occur in murine development, an observation that is consistent with XIST/Xist having an initiating role in the X inactivation process (17 ). Recently, a targeted deletion of the Xist gene was created in murine embryonic stem (ES) cells. The X chromosome carrying the mutant Xist allele was unable to be inactivated, providing direct evidence that expression of the Xist gene is necessary for X inactivation to occur in ES cells as well as in chimeric mouse embryos (18 ). Notwithstanding their apparent inability to inactivate the X chromosome carrying the mutation, cells carrying the targeted Xist allele appeared to carry out early steps in the inactivation process normally, i.e. both recognition of the number of X chromosomes present and random choice of which X was to become inactive were unaffected by the Xist mutation. Thus the deletion, which comprised part of the Xist promoter and part of the first exon, does not affect these steps. These results imply a spatial separation between sequences responsible for different steps in the initiation of X inactivation (19 ).
Transgenic mice have been created in several laboratories in which yeast artificial chromosomes (YACs) containing portions of the Xic region were integrated into ectopic sites in the murine genome (20 -23 ). While it was possible to achieve expression of the Xist gene in some instances, this did not always result in detectable transcriptional silencing of the host chromosome. In fact, evidence for Xist expression and inactivation of the host autosome has been presented for only a single multicopy transgene (22 ,23 ). In contrast, transgenic cell lines containing two to eight copies of a Xist cosmid integrated into autosomes were able to both express Xist from the transgene and repress transcription of a reporter gene in cis (24 ). Similarly, both XIST/Xist expression and spread of inactivation are readily observed when intact XIC/Xic is involved in a translocation with autosomal material (25 ). One possible interpretation of these seemingly contradictory results is that the Xist gene is subject to complex regulatory mechanisms requiring sequences that either are not present or are not maintained in a proper context in the transgenes in some studies.
Combined, the available data indicate that XIST/Xist expression and accumulation of XIST/Xist RNA are involved in the initiation of X chromosome inactivation. Further, different levels of steady-state Xist RNA have been reported in mice and in differentiated ES cells carrying different alleles at the Xce locus (26 ,27 ), suggesting a possible link between the Xce locus and the Xist transcriptional regulatory elements. Thus, characterization of the XIST promoter should provide insights into the nature of XIST transcriptional regulation, initiation of X inactivation, the nature of any interaction between the promoter and other sequences within Xce and/or XIC and the identities of other factors involved in the X inactivation process.
In order to understand the transcriptional regulation of XIST, it is necessary to first determine what factors are required for its transcription and then to identify other elements that influence the ability of the transcriptional machinery to identify the promoter and initiate transcription. Towards this end, in this paper we describe identification and characterization of the XIST minimal promoter. We have identified binding sites for common transcription factors within the minimal promoter sequence and, in addition, describe two cis-acting sequences that modulate minimal promoter activity.
Screening of human λ phage libraries, PCR, cloning and isolation of primate DNAs were carried out as described (15 ,28 ). A murine genomic λ phage clone was obtained from a genomic library constructed from a YAC containing the murine Xist locus (YAC 4B-2, a gift of Dr Phil Avner). DNA was prepared from a yeast culture containing the Xist YAC and partially digested with MboI to provide DNA in the range 10-20 kb. DNA was ligated into predigested, phosphatased λ DASH II vector arms (no. 246211; Stratagene) and packaged using Gigapack II Gold packaging extracts (no. 247612; Stratagene) according to the manufacturer's instructions. Approximately 10^6 phage were screened with the murine cDNA probe (28 ) at a final wash stringency of 0.1% SDS, 0.1× SSC at 65°C. Hybridizations were carried out at 65°C in a hybridization solution of 10% dextran sulfate, 1 M NaCl, 1% SDS.
The lepine cDNA clone was isolated from a female rabbit liver (no. TL 1006a; Clontech) cDNA library generated by oligo(dT) priming. One lepine cDNA clone was obtained by screening ~10^7 primary plaques with the 5'-most human XIST cDNA probe (Hbc1a) at a final wash stringency of 0.5% SDS, 50 mM Tris-HCl, pH 8.6, and 0.5 M NaCl at 65°C. An overlapping lepine λ genomic clone was obtained by screening a rabbit genomic library (no. TL1008j; Clontech) at a final wash stringency of 0.1% SDS and 0.1× SSC at 65°C with the lepine cDNA clone. An equine genomic λ clone was isolated from a male horse library (no. 946701; Stratagene) with the Hbc1a probe as described above. Sequences homologous to the human XIST 5'-region were subcloned and sequenced.
Nucleotide sequence of cDNAs was determined on double-stranded templates using vector- and gene-specific primers as described (15 ,28 ), in most instances using an Applied Biosystems fluorescence sequencer (ABI model 373A or 377, with V1.1.1 or V1.2 sequence analysis software). Contig assembly and sequence analysis were performed using either the GeneWorks (Intelligenetics) or LaserGene (DNAStar) DNA analysis software. Sequence comparisons were performed using the GeneWorks DNA alignment program. Database searches were performed using the BLAST network service at the National Center for Biotechnology Information (NCBI). The GRAIL 2 and XGRAIL v1.2 programs were used to evaluate protein coding potential (29 ).
The female embryonic kidney cell line (293) and male fibrosarcoma cell line (HT 1080) were purchased from ATCC (CRL-1573 and CRL-7951 respectively). Promoter elements were cloned into GeneLight vectors (Promega). Approximately 10^5 cells were transiently transfected using 1 µg plasmid and 1.5 µg lipofectin in a total volume of 250 µl serum-free medium overnight. Medium was changed after ~16 h and cells were harvested after 48 h using Cell Lysis Reagent (Promega) according to the manufacturer's instructions. Aliquots of 20 µl cell lysate were used to measure luciferase activity by addition of 100 µl Luciferase Assay Reagent (Promega), followed by luminescence quantitation in a TD-20e luminometer (Turner Designs).
All transfections were carried out in duplicate. Each luciferase reading was normalized to the average minimal promoter activity for each experiment. Two-tailed t-tests and other statistical computations were done using software supplied with Microsoft Excel v. 5.0.
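The normalization and comparison described above can be sketched as follows. This is a minimal illustration, not the authors' Excel worksheet: the function names and the example readings are hypothetical, and the pooled-variance (equal-variance) form of the two-sample t statistic is an assumption, since the text does not say which t-test variant was used. The sketch stops at the t statistic and degrees of freedom; converting these to a two-tailed P value requires the t distribution, which is not in the Python standard library.

```python
from math import sqrt
from statistics import mean, variance


def normalize_to_minimal_promoter(readings, minimal_promoter_readings):
    """Express each luciferase reading relative to the mean reading of the
    minimal promoter construct measured in the same experiment."""
    baseline = mean(minimal_promoter_readings)
    return [reading / baseline for reading in readings]


def pooled_t_statistic(group_a, group_b):
    """Two-sample t statistic with pooled variance (assumed test variant).

    Returns (t, degrees_of_freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical raw luminometer readings from one experiment (duplicates).
minimal = [90.0, 110.0]      # minimal promoter construct
test_construct = [300.0, 100.0]
print(normalize_to_minimal_promoter(test_construct, minimal))
```

Normalizing within each experiment before pooling keeps day-to-day differences in transfection efficiency and luminometer response from dominating the comparison between constructs.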
PCR conditions and selection of PCR primers were as described (28 ). Each mutagenesis construct was created by making two primers facing in opposite directions whose 3'-ends lie six bases apart. An EcoRI linker was added to the 3'-ends of each primer. PCR was carried out with the mutagenesis primer in conjunction with G7R (GAAGTTGTGACTCCTGGTCT) for the 5'-facing primers or G10R (GAGAGATCTTCAGTCAGGAAG) for the 3'-facing primers. G7R contains an XbaI site, while G10R contains a BglII site. The two PCR products were co-precipitated and then resuspended in Universal restriction enzyme buffer (Stratagene). The reactions were then digested with EcoRI, XbaI and BglII simultaneously. Digestion products were then co-precipitated with plasmid pGLB which had been digested with NheI and BglII. Digestion products were then ligated together at 16°C overnight. The ligation reaction was transformed into DH5α (BRL) or One-Shot competent cells (Invitrogen) in the presence of ampicillin. Ampicillin-resistant colonies were picked into 150 µl TB/Amp in a 96-well plate and grown overnight. Samples of 2 µl of culture were PCR amplified with vector primers GLP1 (TGTATCTTATGGTACTGTAACTG) and GLP2 (CTTTATGTTTTTGGCGTCTTCCA), digested with EcoRI and electrophoresed on 2% agarose gels. Positive colonies were further analyzed by sequencing.
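The cloning scheme above depends on restriction sites carried by the primers and the vector. As a hedged illustration, the short sketch below scans a primer for the recognition sequences of the enzymes named in this section. The helper name `find_sites` and the lookup table are hypothetical conveniences; the recognition sequences themselves are the standard ones (EcoRI GAATTC, XbaI TCTAGA, BglII AGATCT, NheI GCTAGC). Because all four sites are palindromic, scanning one strand is sufficient. Note also that XbaI and NheI leave compatible CTAG overhangs, which is why an XbaI-cut insert can be ligated into NheI-cut pGLB.

```python
# Standard recognition sequences for the enzymes used in this section.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "BglII": "AGATCT",
    "NheI": "GCTAGC",
}


def find_sites(seq, sites=RECOGNITION_SITES):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)  # allow overlapping matches
        if positions:
            hits[enzyme] = positions
    return hits


# Primer G10R as given in the text.
G10R = "GAGAGATCTTCAGTCAGGAAG"
print(find_sites(G10R))
```

The same scan can be applied to a candidate mutagenesis primer to confirm that the added EcoRI linker is the only EcoRI site before ordering the oligonucleotide.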
The pGLXB construct was subjected to Bal31/exonuclease III directional deletions according to the manufacturer's instructions (Stratagene) to generate the -129 and -72 constructs.
PCR products or double-stranded oligonucleotides (SP1, TFIID, YY1 and SP1 mutant oligonucleotides; Santa Cruz Biotechnology) were end-labeled with [γ-32P]dATP using polynucleotide kinase (New England Biolabs) and purified using Nuc-Trap columns (Stratagene) according to the manufacturer's instructions. Approximately 10^4 c.p.m. (100 pg) labeled oligonucleotide were incubated with 5 µg HeLa nuclear extract (Promega) or recombinant SP1 protein (Promega) in a final concentration of 10 mM Tris, pH 8, 5 mM MgCl2, 1 mM CaCl2, 2 mM DTT, 50 µg/ml BSA, 2 µg/ml sonicated herring sperm DNA, 100 mM KCl, 10% glycerol and 0.3 µg/ml poly(dI·dC) at room temperature for 30-60 min. Binding reactions were loaded onto 6% non-denaturing polyacrylamide gels electrophoresed in 0.5× TBE at room temperature. Gels were transferred to Whatman paper, dried and exposed to X-ray film. Bandshifts with probe 2 (L/S12 region) were carried out as described above in a binding buffer consisting of 20 mM Tris, pH 8, 2 mM DTT, 80 mM KCl, 10 mM MgCl2, 10% glycerol. Non-denaturing acrylamide gels and electrophoresis buffer contained 0.02% NP-40 and 4 mM MgCl2. These gels were electrophoresed at room temperature or 4°C. Antibodies (polyclonal anti-Sp1, no. sc-59; polyclonal anti-TBP, no. sc-204; monoclonal anti-TBP, no. sc-421; polyclonal anti-YY1, no. sc-281; polyclonal anti-USF, no. sc-229; Santa Cruz Biotechnology) were added to the binding reactions in supershift experiments and incubated at room temperature for at least 30 min before electrophoresis.
Transgenic mice were created at the Transgenic Mouse Facility in the Department of Genetics at Case Western Reserve University by pronuclear injection. Transgenic embryos were harvested at E9.5-E13.5 in ice-cold PBS and a portion of each embryo was then removed for genotype analysis by PCR (30 ) using transgene-specific primers [transgenes XH and HH, primers G10 (CTTCCTGACTGAAGATCTCTC) and GLP2 (see above); transgene G6H6, primers G6 (TACTCTTCCACTCACTTTTC) and H6 (AGAGAGTGCAACAACCCACA)] and primers for the Sry gene to determine the gender of each embryo (Sry1, GATCAGCAAGCAGCTGGGAT; Sry2, TTTGGGTATTTCTCTCTGTG). The remainder was homogenized in Cell Lysis Reagent (Promega). The homogenate was centrifuged for 2 min at maximum speed in a microcentrifuge and 20 or 60 µl supernatant were assayed for luciferase activity as described above. Total protein concentration of supernatant was determined using a Bradford assay according to the manufacturer's instructions (BioRad).
Sequences described here have been deposited into the GenBank sequence repository. The accession numbers for the human, mouse, rabbit and horse XIST/Xist sequences are U50908-U50911 respectively and that for the sequence containing the PuPy repeat is U50912.
Within the segments that showed clear evidence of homology, comparison of the immediate upstream portions of the four sequences is shown in Figure 1 . A region of elevated conservation among the four sequences is present within ~100 bp of the transcription initiation site. The region of elevated conservation in this upstream sequence (-101 to -1) is 74, 78 and 81% identical between human and mouse, human and rabbit and human and horse respectively and is comparable with levels of identity observed in the 5'-end of the RNA itself (bases +1 to +308) for the same three comparisons (Fig. 1 ; 32 ; B.D.Hendrich, PhD thesis, Stanford University). This high degree of conservation in the immediate upstream region suggests that this sequence is important in XIST function and suggests this region as a candidate for XIST promoter sequences. The sequence alignment reveals no conserved TATA or CCAAT sequences. The human sequence does, however, contain a consensus SP1 binding sequence (33 ) located from position -49 to -54, which is completely conserved among ape and Old World monkey XIST genes (data not shown) and shows partial conservation in the murine and lepine sequences (Fig. 1 ). In addition, the conserved sequences around the transcription start site resemble the consensus binding site for the initiator protein YY1 (34 -36 ).
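Percent identity figures such as those quoted above are computed from pairwise alignments. The sketch below shows one common way to do this, assuming the two sequences are already aligned to equal length and that columns containing a gap in either sequence are excluded from the count. That gap convention is an assumption, since the text does not state how the GeneWorks program treats gaps, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns with a gap in either sequence are skipped (assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == gap or base_b == gap:
            continue
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, invented for illustration only.
print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

Counting gap columns as mismatches instead would lower the reported values, so the convention matters when comparing identity figures across studies.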
As determined above, expression from the XIST minimal promoter is driven by binding of common transcription factors in several different cell lines. Sequences outside this minimal promoter, therefore, may give the XIST gene its unique transcription pattern. In an attempt to identify such sequences, we cloned a series of restriction fragments from a λ phage contig including the human XIST gene and extending >50 kb upstream into a reporter plasmid in which the luciferase gene is driven by the XIST promoter (construct pGLXH, Fig. 2 ). Numerous fragments tested failed to produce a significant effect on minimal promoter activity when compared with the pGLXH construct, including one containing the first intron. However, two sequences were found which did alter promoter activity significantly (Fig. 5 A).
The 5'-end of the transcribed portion of XIST contains a series of nine tandem copies of a repeated motif that comprises a promising candidate region for a functional domain within the XIST transcript because of their high degree of conservation among eutherians (15 ,27 ). The repeats, designated XCR 1-9 (for
Because the XCR sequences are present within the 5'-end of the XIST transcript, we wanted to determine whether the repeats would have any effect on transcript levels when present in the 5'-end of a reporter transcript. When tested in the 5'-untranslated portion of the luciferase reporter gene, the XCR sequences resulted in a >3-fold increase in luciferase activity (P < 0.004; Fig. 5 B). This effect was seen in the female embryonic kidney cell line (293) as well as in mouse fibroblasts (data not shown). A similar but somewhat smaller effect was observed when a larger fragment containing transcribed sequences from the 5'-end of XIST between the minimal promoter and the repeats was included in the reporter construct (P < 0.006; Fig. 5 B). As only a modest effect was observed when the repeats were cloned into the distant BamHI site of the pGLXH construct, the data indicate that the effect of the XCR sequences is strongest when part of the transcript itself, consistent with a possible post-transcriptional effect. The possibility that the repeats contain promoter activity of their own was investigated by cloning the repeats alone, in either orientation, in front of the promoterless luciferase gene; however, no luciferase activity was observed in cells transfected with these constructs (Fig. 5 B). The XCR sequences thus provide a moderate but significant enhancing activity, the effect of which is clearly strongest when present in the 5'-untranslated region.
A second region which influences minimal promoter activity in vitro was found within a 5.6 kb BamHI fragment located ~25 kb upstream of the XIST gene. This sequence caused a significant reduction in luciferase activity in transient transfections of both male and female cell lines when present on the pGLXH plasmid, while other upstream genomic fragments had no such effect (Fig. 5 C). Upon determination of the sequence of this BamHI fragment, it was found that the fragment contains a stretch of alternating purine-pyrimidine (PuPy) repeats. This sequence extends for ~450 bp with only 14 single base interruptions in the strict purine-pyrimidine repeat structure. The repeat region was PCR amplified and cloned into the BamHI site of the pGLXH promoter construct. A similar level of reduction in promoter activity was found with just the PuPy repeat as was found with the entire 5.6 kb BamHI fragment, indicating that the repeats are responsible for the observed transcriptional repression. Two SnaBI restriction sites lie within the repeat region and were used to generate constructs containing different portions of the repeat array directly upstream of the minimal promoter. A construct with 299 bp of repeat greatly inhibited luciferase activity, whereas constructs with 166 or 25 bp of PuPy repeat showed normal luciferase activity (Fig. 5 C), illustrating that >166 bp of the XIST PuPy repeat is required to promote silencing when located directly next to a promoter in this in vitro assay.
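The notion of a strict alternating purine-pyrimidine repeat with a small number of single base interruptions can be made concrete with a short sketch. The approach below is an assumption chosen for illustration, not the method used in the paper: it compares the sequence with both possible alternation phases (purine first or pyrimidine first) and reports the smaller number of positions that break the pattern. It handles base substitutions only, since an insertion or deletion would shift the phase downstream. The function name and test sequences are hypothetical.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def pupy_interruptions(seq):
    """Minimum number of positions that deviate from a strict alternating
    purine-pyrimidine pattern, taken over both possible phases."""
    seq = seq.upper()
    deviations_purine_first = 0
    deviations_pyrimidine_first = 0
    for index, base in enumerate(seq):
        if base not in PURINES and base not in PYRIMIDINES:
            raise ValueError(f"unexpected base {base!r} at position {index}")
        is_purine = base in PURINES
        expects_purine = index % 2 == 0  # purine-first phase
        if is_purine == expects_purine:
            deviations_pyrimidine_first += 1
        else:
            deviations_purine_first += 1
    return min(deviations_purine_first, deviations_pyrimidine_first)


# Toy sequences: a perfect (GT)n tract and one with a single substitution.
print(pupy_interruptions("GTGTGTGT"))
print(pupy_interruptions("GTGTCTGT"))
```

Applied to the ~450 bp repeat described above, a count of this kind would be expected to be small relative to the tract length, in line with the 14 interruptions reported.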
Expression of the XIST gene and accumulation of XIST RNA only occurs on the inactive X chromosome or on chromosomes that are programmed to undergo X chromosome inactivation (14 ,15 ,17 ,18 ,40 -42 ). Thus factors controlling XIST expression are directly involved in, or are influenced by, the X inactivation process. By understanding XIST transcriptional regulation, we hope to gain insights into the nature of the initiation of X inactivation. Here we report the identification and characterization of the human XIST minimal promoter and the identification of two repeat elements that modulate minimal promoter activity in an in vitro assay.
XIST RNA is detected in significant amounts only from inactive X chromosomes and not from active X chromosomes. A priori, such a pattern of differential gene expression could reflect one of two alternative possibilities (Fig. 7 ). First, the XIST promoter could be a conditional one and require transcriptional activation by factors specific to X chromosomes chosen to become an inactive X. Alternatively, XIST could be constitutively active on all X chromosomes prior to inactivation (as suggested by the studies of Panning and Jaenisch; 43 ) and require transcriptional repression on the single X chromosome in males and the active X in females. Under such a model, the high levels of XIST RNA found associated specifically with inactive X chromosomes (15 ,16 ) may reflect up-regulation of the XIST promoter from a basal, constitutive level (the `pre-inactivation state'; 44 ) and/or stabilization of XIST transcripts by factors involved in assembly of the inactive X Barr body complex. The fact that we readily detect promoter activity in a number of different cell lines as well as in transgenic mouse lines indicates that the XIST minimal promoter is constitutively capable of supporting transcription and thus must be silenced on the active X chromosome. This observation is consistent with models in which a developmental factor(s) acts to `mark' or `block' XIC on a single active X, regardless of the total number of X chromosomes present (19 ,44 ). Indeed, studies of XIST/Xist methylation in the region of the minimal promoter (28 ,45 -47 ) have implicated DNA methylation in the silencing of XIST on the active X chromosome, perhaps as part of the hypothesized `blocking' step.
Transcription factors responsible for XIST expression were identified through the use of saturation site-directed mutagenesis and gel mobility shift assays. An SP1 binding sequence centered around position -72 and the potential SP1 binding sequences at positions -50 to -43 (Fig. 1 ) show positioning expected for upstream control elements and were implicated by the mutagenesis studies (Fig. 3 ). The second identified sequence is very similar to the consensus binding sequence of the initiator protein YY1 and is located at the expected position for YY1 to bind as an activator of transcription (Fig. 1 ; 37 ); this sequence was indeed found to bind the YY1 protein in vitro (Fig. 4 D). The third sequence is located from position -26 to -31 with respect to the transcription start site, the location expected for a TATA box. Indeed, the sequence which overlaps this region, TTAAAG, is very similar to the TBP consensus binding sequence and was found to bind either TBP or some TBP-like protein in electrophoretic mobility shift assays (Fig. 4 C). A recent characterization of the murine Xist minimal promoter identified the TTAAAG sequence as also being important for promoter activity (48 ).
The transient transfection data and the results obtained in the analysis of protein interactions with the minimal promoter support the constitutively active model for the XIST promoter shown in Figure 7 . The minimal promoter directs high levels of transcription in all cell types tested using common transcription factors. Similarly, luciferase activity is readily detectable in transgenic mouse lines in which the human XIST promoter is driving luciferase expression. Thus, both the in vitro and in vivo data provide no evidence for a DNA binding factor responsible for the inactive X-specific expression characteristic of the XIST gene, as required by the conditional XIST promoter model.
These results are interesting in the light of three instances where Xist-containing sequences have been used to create transgenic mice (20 -24 ). Despite the fact that the XIST/Xist minimal promoters are constitutively active (Fig. 1 ), transgenic lines were obtained in two of these reports in which transgenic Xist expression was not detected, though the input YACs were apparently intact (20 ,21 ). The variable data on such transgenes may implicate cis-acting chromatin elements (perhaps as part of a XIC/Xic recognition element) that are not always maintained in a proper context in the transgenic lines. Both Jaenisch and colleagues (22 ,23 ) and Ashworth and colleagues (24 ) have detected Xist expression and accompanying spread of Xist RNA in transgenic ES lines. However, the multicopy nature of those transgenes complicates conclusions regarding the possible existence or location of such cis-acting elements within XIC/Xic.
We have identified two sequences capable of influencing the activity of the XIST minimal promoter in reporter gene constructs, both of which consist of repeated DNA (Fig. 5 ). The XCR sequences present ~300 bp downstream from the transcription start site act to stimulate minimal promoter activity, perhaps by a post-transcriptional mechanism. While we have been unable thus far to demonstrate the existence of protein binding to the XCR using HeLa cell nuclear extracts (unpublished data), it is conceivable that a developmentally regulated factor, present, for example, at the time of initiation of X inactivation, may bind the XCR sequences and thereby enhance XIST expression and/or stabilize XIST RNA in vivo.
The upstream PuPy repeat sequences are also implicated in control of XIST expression and cause a 70-80% reduction of promoter activity in transient transfection assays. Similar PuPy sequences at other loci have been shown to form a left-handed Z-DNA structure under physiological conditions in vitro (49 -52 ). The mode of action of the PuPy sequence in inhibition of XIST promoter activity is unclear. While the sequence of the repeat is not unique, the XIST PuPy repeat is nearly twice the size of other known PuPy repeats (52 ). Should this sequence form a Z-DNA structure, it would represent a large segment of non-B-DNA and could conceivably direct formation of a repressive chromatin structure in the vicinity of the XIST gene and XIC. Further work will be necessary to determine whether either of these repeat sequences influences XIST expression levels in vivo and is therefore a true cis-acting element involved in the X inactivation process.
All X chromosomes in excess of one present in a diploid cell are inactivated. This suggests that a cell is able to recognize all XIC loci (i.e. X chromosomes) within that cell, `choose' one XIC and `block' the chosen XIC to keep that chromosome active and ensure that all remaining XIC loci inactivate the chromosomes on which they lie.
Penny et al. (18 ) have recently described experiments in which the proximal minimal promoter and entire first exon of one Xist allele were deleted in female ES cells. When cells containing this deleted Xist gene were induced to differentiate, the chromosome harboring the deleted allele could be recognized and `chosen', but could not become inactive. This demonstrates that the Xist gene is absolutely required for X inactivation to occur and implies that sequences involved in Xic recognition and choice were not affected by the deletion (19 ). Panning and Jaenisch (43 ) have observed low levels of biallelic Xist expression in undifferentiated ES cells which, upon differentiation, switched to monoallelic expression. This is consistent with the initiation step of X inactivation consisting of silencing one Xist allele and up-regulating the other and supports the constitutive promoter model (Fig. 7 ). Together with the data reported here, these two reports support the idea that there may be at least three critical and possibly distinct types of X-linked sequences involved in X chromosome inactivation; those involved in recognition and choice of one X chromosome, those involved in blocking XIST/Xist transcription from that allele on the X chosen to be the active X and those responsible for the control and/or stabilization of XIST/Xist expression on the other X (or Xs) that is/are subsequently inactivated. While the experiments reported here and elsewhere (18 ) clarify those sequences responsible for XIST/Xist transcription, they suggest that the putative XIC recognition elements lie, at least in part, outside the examined regions.
While the nature of the `blocking' step is not known, a likely candidate mechanism is differential DNA methylation. The importance of DNA methylation in silencing the Xist gene is demonstrated by the fact that mice lacking maintenance DNA methyltransferase activity show inappropriate Xist expression in somatic cells (43 ,53 ). Further, the fact that the minimal promoter and XCR sequences are differentially methylated on the active and inactive X chromosomes (28 ,45 -47 ) indicates that although this region is not important in choosing, it may be the region on which the `block' (in the form of differential DNA methylation) is imposed, thus preventing XIST/Xist expression and allowing the chromosome to remain active. The identification of a naturally occurring point mutation in the XIST promoter that segregates with non-random X chromosome inactivation in a human family (54 ) further supports the possibility that the minimal promoter region is important in the very early stages of X chromosome inactivation and may be involved in choosing between the two X chromosomes in female cells and/or establishing the blocking signal.
How might DNA methylation, the minimal promoter and cis-acting elements interact to achieve the unique expression pattern observed for XIST? SP1 has been shown to bind to its consensus site and to promote transcription irrespective of the methylation status of the promoter sequence (55 -57 ). Thus, silencing of XIST on the active X chromosome cannot be achieved through exclusion of SP1 from promoter sequences by methylation alone. Further, DNA gel mobility assays using methylated and unmethylated minimal promoter probes showed no differences in mobility shift patterns (data not shown), providing further evidence that DNA methylation itself does not affect binding of proteins to the XIST minimal promoter and, therefore, is not likely to be directly responsible for silencing of XIST on the active X chromosome in somatic cells. The XCR region is known to be differentially methylated in somatic cells (28 ; Fig. 7 ) and has a number of conserved CpG dinucleotides within the motif II and III consensus sequences (Fig. 6 ). Whether such methylated sequences might inhibit XIST transcription is unclear at present.
That expression of the XIST/Xist genes should be under the control of at least one distant, cis-acting DNA element is expected from studies of the Xce locus in mice (7 ). Alleles at Xic-linked Xce influence the randomness of X chromosome inactivation in mice as well as influencing the steady-state levels of Xist RNA in somatic cells. While Xce has been found to be genetically distinct from the Xist gene in one mouse strain (58 ), it remains possible that an interaction between the Xist promoter and cis-acting element(s) acts to achieve the Xce effects. Indeed, we have previously identified strain-specific sequence variations in the murine XCR region (28 ) which could contribute to this effect. Recent work has also identified Xce allele-specific methylation differences in a region lying distal to the murine Xist gene, providing a candidate region for the Xce locus (59 ). It will be interesting to learn whether this region contains cis-acting elements that, like the PuPy repeats examined here, influence Xist minimal promoter activity. Further work on the interaction between the minimal promoter and cis-acting elements will be necessary to determine what role these sequences play in the processes of XIST transcription, Xce effects and, subsequently, X chromosome inactivation.
We thank T.Mourton for technical advice and for creating the transgenic mice at the Case Western Reserve University Transgenic Mouse Facility. We also thank Drs T.Magnuson, R.Myers and D.Setzer for useful discussions, Mary Schuler for technical advice and Dr A.Bird for useful comments on the manuscript. This work was supported in part by a research grant from the National Institutes of Health to H.F.W. (GM45441). R.M.P. was supported by a research fellowship from the Howard Hughes Medical Institute.
*To whom correspondence should be addressed. Tel: +1 216 368 1617; Fax: +1 216 368 3030; Email: hfw@po.cwru.edu
+Present address: Institute of Cell and Molecular Biology, The University of Edinburgh, Darwin Building, King's Buildings, Edinburgh EH9 3JR, UK
Conservation of promoter sequences. To identify potentially important sequences within the XIST 5'-region, sequences upstream of the murine, lepine and equine Xist genes were cloned, sequenced and compared with the corresponding human sequence. In total, 850 bp of lepine sequence, 2010 bp from the murine locus and 3838 bp of equine upstream sequences were compared with 6475 bp of human sequence 5' of the XIST gene itself. No significant sequence homology was detectable more than ~100 bp upstream of the transcription start site (data not shown). No other genes or pseudogenes were identified in any of the upstream sequences using standard database searching and gene finding programs (29 ,31 ) or, in the case of the human sequence, using RT-PCR expression analysis (data not shown).
REFERENCES
