ABSTRACT
Utrophin is a ubiquitously expressed cytoskeletal protein which is an important
structural component of the mammalian neuromuscular junction. It shows
extensive sequence similarity to dystrophin leading to postulation that
utrophin may be able to compensate for the absence of dystrophin in Duchenne
muscular dystrophy (DMD) patients. In order to study the transcriptional
control of utrophin expression including its regulation at the neuromuscular
junction, and as a first step in the development of a potential DMD therapy, we
have cloned the utrophin promoter region from human and mouse. The utrophin
promoter is associated with a CpG island at the 5'-end of the gene, and sequence
analysis of the promoter region reveals several putative Sp1 binding sites and the absence of TATA or CAAT
motifs. Transcription is initiated at one major and three minor sites. Using
deletion constructs, we have defined an active promoter region of 155 bp. The
first exon and 900 bp upstream display limited sequence conservation between
human and mouse. The core sequence TTCCGG of the N box which regulates synaptic
expression of other genes is also present and may be involved in regulating the
specific expression of utrophin at the postsynaptic membrane. This study
provides the basis for the understanding of the regulatory mechanism that
controls utrophin expression and provides the data needed to develop methods
for the upregulation of utrophin in DMD patients.
Duchenne muscular dystrophy (DMD) is an X-linked recessive disorder caused by mutations which result in the absence
or expression of mutant forms of dystrophin. Dystrophin is a 427 kDa
cytoskeletal protein expressed predominately in skeletal, cardiac and smooth
muscle, with lower levels in the brain. In normal adult skeletal muscle,
dystrophin binds to the sarcolemma by interaction with the dystrophin protein
complex (DPC) (1-4). The DPC forms an essential link between the internal cytoskeleton of the
muscle cell and the extracellular matrix. In DMD, the absence of dystrophin
results in the loss of integrity of the DPC which eventually leads to muscle
degeneration. Thus any effective therapeutic strategy for DMD must involve the
reconstitution of this protein complex. Many strategies for DMD therapy have
involved the introduction of dystrophin minigenes into muscle using viral
vectors or direct injection (5). However, gene delivery using these methods is relatively
inefficient. An alternative approach is to search for related genes which might
compensate for the loss of dystrophin.
Utrophin is an autosomally-encoded protein displaying a high degree of sequence similarity to
dystrophin (6). The differences in the function of utrophin and dystrophin may lie within
their regulatory sequences rather than the primary coding sequence. Although
utrophin is expressed in muscle, its overall expression pattern differs from
dystrophin. Dystrophin expression is restricted to adult muscle and brain
whereas utrophin is widely expressed. In normal adult skeletal muscle, utrophin
is localised to the neuromuscular junction (NMJ) and myotendinous junction;
however, in dystrophin-deficient muscle, utrophin is also localised at the sarcolemma and co-purifies with components of the DPC (7). In some inflammatory myopathies, both dystrophin and utrophin can be found
localised to the sarcolemma of mature fibres (8, 9). In normal foetal muscle, utrophin localises at the sarcolemma and utrophin
levels decrease as dystrophin levels increase during development, suggesting
that the two proteins are co-ordinately regulated during muscle development (10, 11). The fact that utrophin can in certain circumstances be localised at the sarcolemma suggests that there may be conditions under which utrophin could be relocated
to the muscle membrane in DMD patients.
In normal adult muscle, utrophin co-localises with agrin-induced acetylcholine receptor (AChR) clusters at the NMJ (12). It is widely assumed that utrophin links the extracellular agrin-bound DPC to the submembranous actin cytoskeleton (2). The role of utrophin at the NMJ may be to stabilise the mature AChR clusters
at the postsynaptic membrane, rather than to act in the initial stages of AChR cluster
formation (13). Although utrophin may be required for the normal development and functioning
of the postsynaptic membrane, nothing is known about the mechanism that
controls the specific localisation of utrophin at the NMJ or the regulation of
utrophin expression.
As yet, no promoter has been characterised for the utrophin transcript. If
upregulation of utrophin to replace dystrophin does become a feasible strategy
for DMD therapy, then a full characterisation of the utrophin promoter is
essential. Here we describe the isolation and characterisation of a utrophin
promoter associated with the CpG island at the 5'-end of the gene which drives the expression of the full-length transcript and contains a sequence motif which may
drive expression specifically at the NMJ.
YAC 4X23E3 was partially digested with the restriction endonuclease MboI, ligated to λGEM-11 XhoI arms and packaged using the Packagene system (Promega). The mouse genomic
phage library screened to isolate the mouse utrophin promoter was kindly
supplied by Dr D. Picketts, Institute of Molecular Medicine, Oxford. Positive
hybridising phage were further subcloned into pGEM-7Zf(+) plasmid (Promega). Sequencing of double stranded plasmid DNA was
performed using Sequenase v2.0 (USB).
Extension assays were performed using the AMV Reverse Transcriptase Primer
Extension System (Promega). Primer U25 (5'-AGG CAC CAA CTT TGC CAA ACGC, 1067-1088) was end-labelled by kination in the presence of [γ-³²P]ATP, 3000 Ci/mmol (ICN). Primer (100 fmol) was annealed to total RNA (30-60 μg) at 58°C for 20 min then extended at 42°C for 30 min. The products were separated on a 6%
polyacrylamide gel under denaturing conditions. A sequencing ladder was run
simultaneously for sizing extension products.
5' RACE was carried out using the 5' AmpliFINDER RACE kit (Clontech). IN157 poly(A)+ RNA (2 μg), prepared using the Dynabeads mRNA purification kit (Dynal), was reverse
transcribed using primer U25. The anchor-ligated cDNA was then PCR amplified using the anchor primer and an
internal primer, U24 (5'-AAT CGG CTT CTG GAG CCA GAG, 982-1002). PCR amplified products were cloned into pGEM-T vector (Promega) and sequenced using Sequenase v2.0
(USB).
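The coordinates quoted for primers U25 and U24 can be cross-checked against the primer sequences. The following is a minimal sketch, assuming that the quoted ranges are inclusive nucleotide positions in the numbered sequence; the helper names are hypothetical and are not part of the original methods.

```python
# Primer sequences and coordinates are copied from the text above.
# Assumption: each coordinate range is inclusive (start and end both counted).
PRIMERS = {
    "U25": ("AGG CAC CAA CTT TGC CAA ACGC", 1067, 1088),
    "U24": ("AAT CGG CTT CTG GAG CCA GAG", 982, 1002),
}


def span_length(start, end):
    """Number of nucleotides in an inclusive coordinate range."""
    return end - start + 1


def check_primers(primers):
    """Return {name: (sequence_length, coordinate_span)} for each primer."""
    report = {}
    for name, (seq, start, end) in primers.items():
        bases = seq.replace(" ", "").upper()
        if set(bases) - set("ACGT"):
            raise ValueError(f"{name} contains a non-ACGT character")
        report[name] = (len(bases), span_length(start, end))
    return report


if __name__ == "__main__":
    for name, (n_bases, n_span) in check_primers(PRIMERS).items():
        status = "consistent" if n_bases == n_span else "MISMATCH"
        print(f"{name}: {n_bases} nt, coordinates span {n_span} nt ({status})")
```

Under this assumption both primers agree with their quoted positions (22 nt for U25, 21 nt for U24). The same inclusive arithmetic gives 155 bp for a region spanning nucleotides 743-897.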
Genomic fragments used as RNase protection probes were cloned into pGEM-7Zf(+) (Promega). The C-terminal control probe for mouse utrophin and the phosphoglycerate
kinase human probe were kindly provided by Dr D. Blake, Genetics Laboratory,
University of Oxford, and Dr J. Firth, Institute of Molecular Medicine, Oxford,
respectively. [α-³²P]CTP labelled antisense RNA was generated according to the conditions described
for Riboprobe in vitro Transcription Systems (Promega). Antisense RNA (5 × 10⁶ c.p.m.) was co-precipitated with total RNA (10 μg human cell line RNA and 20 μg mouse lung RNA), then hybridised and digested using the RPAII
Ribonuclease Protection kit (Ambion). The protected products were separated on
a 6% polyacrylamide gel under denaturing conditions. ³²P-kinased, HaeIII digested ΦX174 DNA markers were run simultaneously to estimate the size of the
protected fragments.
IN157, HeLa, COS-7 and C2 cells were cultured in Dulbecco's modified Eagle's medium (DMEM)
supplemented with 10% foetal calf serum (FCS). Cells were grown to 80-90% confluency, trypsinised, washed once with phosphate buffered saline
(PBS) and resuspended (3.75 × 10⁷ cells/ml) in ice cold PBS. Transient transfections were carried out using 30 μg luciferase reporter construct (promoter fragment cloned into pGL2-basic; Promega) and 10 μg pSV-β-galactosidase plasmid (Promega). Cells (3 × 10⁶) were electroporated at 250 V, 960 μF using a BioRad Gene Pulser. Following electroporation, the samples were
placed on ice for 1 min and aliquoted into 60 mm diameter Petri dishes
containing 5 ml of DMEM-10% FCS. Cells were allowed to recover at 37°C for 12 h (HeLa), 24 h (IN157 and COS-7) or 48 h (C2 myoblasts). Cells were washed twice with PBS and
harvested into 400 μl Reporter Lysis Buffer (Promega). Cell extract (20 μl) was mixed with 100 μl Luciferase assay reagent (Promega) and light production quantified
using a Turner Designs Model 20 luminometer. β-Galactosidase activity was measured using an enzyme assay system
(Promega) then analysed using a microplate reader (BioRad) at 420 nm.
Yeast artificial chromosome (YAC) clone 4X23E3 encompasses a CpG-island located at the 5'-end of the human utrophin gene (14). The YAC was subcloned into phage and the resultant library screened with a
0.6 kb EcoRI cDNA fragment containing the first exon of utrophin. Positive hybridising
phage were further subcloned resulting in a plasmid clone (pPU1) containing a
hybridising 9.4 kb BamHI genomic fragment. Subsequent hybridisation of restriction digests of pPU1
using the 0.6 kb probe identified a 1.25 kb HindIII fragment which was subcloned and sequenced (pHH). To ensure the integrity
of the cloned region surrounding exon 1, the 9.4 kb genomic clone and normal
human genomic DNA were analysed by restriction digests and hybridisation. The
sizes of hybridising fragments within the 9.4 kb BamHI clone were found to be identical to corresponding genomic fragments (data not
shown). The sequence of the 1.25 kb HindIII clone (pHH), encompassing the first exon and 0.9 kb of upstream sequence,
in addition to ~400 bp of downstream 3' sequence, is shown in Figure 1.
Utrophin expression was examined in human cell lines by
RNase protection using a probe derived from the C-terminus. Utrophin was expressed at differing levels in the
cervical epithelial HeLa cell line, adult kidney CL11T47 cell line, adult
myoblast primary culture (M429) and the rhabdomyosarcoma IN157 cell line, from
which utrophin was originally cloned. The highest levels were detected in IN157
and CL11T47 cells (Fig. 2).
To test whether a promoter is located in the 1.25 kb HindIII fragment, which contains 900 bp of 5' flanking sequences and the first 350 bp of exon 1, the fragment was
cloned in both orientations into a promoterless luciferase expression vector.
These two constructs, pHH·F and pHH·R (forward and reverse orientation respectively), were
transiently transfected into human HeLa cells, human rhabdomyosarcoma IN157
cells, mouse C2C12 myoblasts and monkey kidney COS-7 cells. An SV40-driven β-galactosidase reporter plasmid was cotransfected with
the luciferase test constructs, and the values obtained were normalised using
the β-galactosidase activity to correct for transfection efficiency.
Luciferase activity was detected only in extracts from cells transfected with
the HindIII fragment in the forward orientation, construct pHH·F (Fig. 5). The absence of luciferase expression in extracts of cells transfected with the
construct in the opposite orientation (pHH·R) suggests orientation-dependent regulation of luciferase expression, which is indicative
of a promoter element. Similar results were observed in all of the above cell
lines, which suggests that the putative human utrophin promoter element is
active in different species and that the DNA elements that regulate utrophin
expression may be conserved. The high level of luciferase expression detected
using COS-7 cells is most likely accounted for by the endogenously expressed large T
antigen resulting in rapid replication of the luciferase reporter constructs
which contain the SV40 origin of replication (20).
To delineate the promoter activity of the HindIII fragment further, several deletions were made using the indicated
restriction enzyme sites (Fig. 6A). IN157 cells were transiently transfected with the panel of constructs and
cell extracts were assayed for luciferase activity. The data shown in Figure 6B are averages of three separate transfections for each construct and the
activity of each construct is represented relative to the activity of the 1.25 kb HindIII construct, pHH·F.
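The normalisation and averaging described above can be sketched as follows. This is an illustrative calculation only: the readings are invented placeholder numbers, the construct name "deletion_A" is hypothetical, and the text does not state whether ratios were averaged per transfection (as assumed here) or computed from pooled means.

```python
from statistics import mean


def normalised_activity(luciferase, beta_gal):
    """Luciferase signal divided by the co-transfected beta-galactosidase signal."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase reading must be positive")
    return luciferase / beta_gal


def relative_activities(readings, reference):
    """Mean normalised activity per construct, as a percentage of the reference.

    readings maps a construct name to a list of (luciferase, beta_gal) pairs,
    one pair per independent transfection.
    """
    means = {
        name: mean(normalised_activity(luc, gal) for luc, gal in replicates)
        for name, replicates in readings.items()
    }
    reference_mean = means[reference]
    return {name: 100.0 * value / reference_mean for name, value in means.items()}


if __name__ == "__main__":
    # Placeholder values for three transfections per construct (not real data).
    example = {
        "pHH.F": [(200.0, 2.0), (150.0, 1.5), (100.0, 1.0)],
        "deletion_A": [(50.0, 2.0), (25.0, 1.0), (75.0, 3.0)],
    }
    for name, percent in relative_activities(example, "pHH.F").items():
        print(f"{name}: {percent:.1f}% of reference")
```

Dividing each luciferase reading by its own β-galactosidase reading before averaging means that a poorly transfected dish does not pull down the mean for its construct.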
The removal of the region between nucleotides 743 and 948 results in a loss of
transcriptional activity. This observation confirms the presence of a promoter
element within this region, and, taken with the findings from the primer
extension, RNase protection and RACE analysis, delineates the promoter element
within a 155 bp region between nucleotides 743-897.
Deletions of other regions within the 5' flanking sequence, including the conserved E-box (nucleotides 497-503) and TTCCGG motif (nucleotides 591-596), do not appreciably affect the activity of the
utrophin promoter in IN157 cells. However, this could be explained by the
absence from IN157 cells of appropriate transcription factors
capable of interacting with these DNA motifs.
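Locating short motifs such as the E-box and the TTCCGG core of the N box is a simple pattern search on both strands. The sketch below assumes the generic E-box consensus CANNTG; the test sequence is a made-up string, not the utrophin sequence, and the function names are hypothetical.

```python
import re

# Assumed consensus patterns; N stands for any base.
MOTIFS = {
    "E-box": "CANNTG",
    "N-box core": "TTCCGG",
}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA string made of A, C, G, T or N."""
    return seq.translate(COMPLEMENT)[::-1]


def to_regex(motif):
    """Compile a motif into a regex; the lookahead allows overlapping hits."""
    return re.compile("(?=(" + motif.replace("N", "[ACGT]") + "))")


def find_motif(seq, motif):
    """Return sorted (start, strand) hits, with 1-based starts on the given strand.

    A '-' hit means the motif reads 5' to 3' on the opposite strand; its start
    is still reported as the leftmost position on the input strand.
    """
    seq = seq.upper()
    hits = {(m.start() + 1, "+") for m in to_regex(motif).finditer(seq)}
    rc_motif = reverse_complement(motif)
    if rc_motif != motif:  # palindromic patterns need only one scan
        hits |= {(m.start() + 1, "-") for m in to_regex(rc_motif).finditer(seq)}
    return sorted(hits)


if __name__ == "__main__":
    toy_sequence = "AACAGCTGTTTTCCGGTTACCGGAATT"  # invented example sequence
    for name, pattern in MOTIFS.items():
        print(name, find_motif(toy_sequence, pattern))
```

Because TTCCGG is not its own reverse complement, scanning only one strand could miss an occurrence, whereas the CANNTG pattern is palindromic and a single scan suffices.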
We have isolated and characterised a genomic fragment which contains the first
exon and 5' flanking sequence of human and mouse utrophin. The human fragment is
active in initiating transcription of a reporter gene in various cell lines,
indicating that the utrophin promoter element may be conserved. A series of 5'-deleted fragments of the human utrophin upstream flanking region
were generated to determine the minimal promoter element. Provided the 155 bp
region containing the promoter element was intact, deletions in the 5'-end did not significantly alter the transcriptional activity.
Hence, the CpG-rich 155 bp region characterised here functions as a basal promoter
element, driving utrophin transcription in many cell types.
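How CpG-rich a stretch of sequence is can be quantified by its G+C fraction and its observed/expected CpG ratio. The sketch below is illustrative: the thresholds (at least 200 bp, G+C above 50%, observed/expected CpG above 0.6) are the commonly used Gardiner-Garden and Frommer criteria, which are an assumption here and are not stated in this study, and the function names are hypothetical.

```python
def cpg_metrics(seq):
    """Return length, G+C fraction and observed/expected CpG ratio for a sequence."""
    seq = seq.upper()
    length = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # CpG dinucleotides cannot overlap each other
    gc_fraction = (c_count + g_count) / length if length else 0.0
    # Expected CpG count under independence is (C count * G count) / length.
    if c_count and g_count:
        obs_exp = (cpg_count * length) / (c_count * g_count)
    else:
        obs_exp = 0.0
    return {"length": length, "gc_fraction": gc_fraction, "obs_exp": obs_exp}


def is_cpg_island(seq, min_length=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed thresholds to a whole sequence (no sliding window)."""
    metrics = cpg_metrics(seq)
    return (
        metrics["length"] >= min_length
        and metrics["gc_fraction"] > min_gc
        and metrics["obs_exp"] > min_obs_exp
    )


if __name__ == "__main__":
    toy_sequence = "CGCGCGATAT"  # invented example, far too short to be an island
    print(cpg_metrics(toy_sequence))
    print(is_cpg_island(toy_sequence))
```

A 155 bp region is shorter than the assumed 200 bp minimum, so for a fragment of that size the metrics are better reported directly than passed through the island test.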
The human and mouse utrophin promoter regions have several putative Sp1 binding
sites and are devoid of TATA or CAAT motifs. By primer extension, 5' RACE and RNase protection analysis, we have located several putative
start sites for the full-length utrophin transcript. Although we cannot rule out the possibility
that these additional products arose due to premature termination by reverse
transcriptase during primer extension and RACE analysis, these results are
consistent with the observation that CpG-rich TATA-less promoters of widely expressed genes usually contain several
transcription initiation sites spread over a fairly large region, rather than
at a single base position (21). Multiple clustered start sites have also been described for the
acetylcholinesterase gene (22), the human N-CAM gene (23), the AChR α-subunit gene (24) and for the dystrophin brain-specific full-length transcript (25).
Genes with CpG-rich promoters were initially considered to express proteins with a
'housekeeping' function in the cell. However, several genes devoid of TATA or
CAAT motifs in their promoter regions have been shown to encode proteins that
are highly regulated (22, 26). Expression of the acetylcholinesterase gene, which also has a CpG-rich promoter, is regulated during muscle cell differentiation and is
localised specifically at the NMJ (22). The typical 'housekeeping' promoter of the Dp71 dystrophin isoform drives
expression in specific cell types (18). Utrophin, although expressed in all tissues, also appears to be regulated in
different cell types. For example, there are relatively abundant levels in
adult lung and higher levels in foetal muscle compared to adult skeletal muscle (27, 28). Utrophin transcripts are also specifically localised during development, with
initial accumulation in the neural tube and later becoming abundant at a
variety of other sites such as the tendon primordia in the digits, the
pituitary, thyroid and adrenal glands, cardiac muscle, and the kidney and lung (29). Taken together, these observations suggest that, although utrophin is
expressed widely, there is also developmental and tissue-specific regulation of expression in certain tissues. We have shown here
that the utrophin transcript is detected at different levels in several human
cell lines and thus specific transcription factors may regulate the different
levels of utrophin expression in the various cell types.
Several putative DNA motifs identified in the 5' flanking sequence may be involved in the control of utrophin muscle
expression. We identified a conserved E-box, which is a binding site for helix-loop-helix proteins of the MyoD1 family, including MyoD1 (30), myogenin (31, 32), myf5 (33) and MRF4 (34). E-box motifs are found in the promoters of many muscle-specific genes, and enhance the in vitro
transcriptional activity of the α, β and γ AChR subunit genes (24, 35-37). Given the co-localisation of utrophin with AChRs at the NMJ, it would be of interest to
determine whether myogenic factors regulate the expression of utrophin by
interaction with this conserved E-box motif. The human and mouse utrophin 5' flanking regions contain the core sequence of the N box, an element
shown to direct synapse-specific expression of the mouse acetylcholine receptor δ-subunit gene (17). This TTCCGG motif restricts the expression of the δ-subunit gene to the NMJ by enhancing expression at the endplate and
acting as a silencer in extrajunctional areas. Sequences identical to this core
sequence of the N box are present in other AChR subunit genes and it is likely that
this element regulates the synaptic expression of at least some of these genes (17). The position of the N-box core sequence was not necessarily conserved between the same AChR
subunit gene in different species nor between different subunit genes of the
same species (17), and thus the presence of the N-box motif at different sites in the human and mouse utrophin sequences may
be consistent with this observation. The mRNA levels of N-CAM, 43K-rapsyn and s-laminin (38) were shown by in situ
hybridisation to be concentrated at the synaptic sites. By database searching,
we have determined that the core sequence of the N box is present in the 5' flanking sequence of β2-syntrophin, which is also localised specifically to the NMJ,
whereas it is absent from the upstream sequence of α1-syntrophin, which is localised at the sarcolemma (12). We also determined that the element is absent from the sequences upstream of
the brain and muscle-specific dystrophin promoters. This suggests that there may be a general
mechanism for selective transcription by synaptic nuclei and this may involve
the interaction of a transcription factor(s) capable of recognising the N box
sequence.
We have used PCR amplification of single-stranded cDNA synthesised from a range of mouse tissues (brain, liver,
lung, heart, skeletal muscle, kidney, small intestine, spleen and eye) to
demonstrate that the untranslated first exon of the utrophin gene is utilised
in all mouse tissues analysed (data not shown), giving preliminary evidence
that the promoter element is active in these tissues. This does not exclude the
possibility that there are other full-length isoforms of utrophin derived from alternate promoters, with
distinct, and possibly overlapping, patterns of expression. The dystrophin gene
has at least three promoters at the 5'-end of the gene which direct the tissue-specific expression of the full-length isoforms (39-42). However, in contrast to dystrophin, the utrophin gene has a CpG island. The
differences in genomic organisation at the 5'-end of these genes may reflect a different mechanism for regulating
expression of the full-length transcripts. Thus there may only be one promoter at the 5'-end of the utrophin gene, as characterised here, and
expression of the widely expressed full-length transcript may be regulated via tissue-specific transcription
factors binding to as yet uncharacterised elements.
In summary, the CpG-rich utrophin promoter contains several Sp1 binding sites and has motifs
that may direct muscle and synapse-specific expression. The promoter directs transcription from multiple
start sites. No differing first exon has as yet been identified, suggesting
that the utrophin promoter may be a single basal element that is active in all
tissues with specialised expression controlled by cell-specific transcription factors. The identification of the elements that
regulate the restricted expression of utrophin, by enhancing expression at the
NMJ and silencing in extrajunctional areas, may provide insight into the
potential for up-regulating utrophin as a therapeutic strategy in DMD.
This work is supported by research grants from the Medical Research Council, the
Muscular Dystrophy Group of Great Britain and Northern Ireland and the Muscular
Dystrophy Association USA. C.D. gratefully acknowledges the support of the Sir
Robert Menzies Memorial Trust and the Overseas Research Students Awards Scheme.
We thank our colleagues at Cold Spring Harbor Laboratories and Oncogene Science
Inc. for helpful discussions and also Derek Blake for critical reading of the
manuscript.


REFERENCES

