ABSTRACT
We generated transgenic mice containing a chimeric construct consisting of the α-cardiac myosin heavy chain (αcMHC) promoter and the human renin (hRen) gene in order to target hRen synthesis specifically to the heart. The construct consisted of three segments: (i) an αcMHC DNA segment including 4.5 kb of 5' flanking DNA and an additional 1.1 kb of genomic DNA encompassing exons I-III (non-coding) and the first two introns; (ii) a partial hRen cDNA consisting of exons I-VI; and (iii) a hRen genomic segment containing exons VII through IX, their intervening introns, and 400 bp of 3' flanking DNA. This results in the formation of a 909 bp internal fusion exon consisting of αcMHC, polylinker, and hRen sequences. Despite the presence of splice acceptor and donor sites bracketing this exon, transcription of this transgene resulted in a major alternatively spliced mRNA lacking the exon and therefore a majority of the hRen coding sequence. Cloning and sequencing of RT-PCR products from several heart samples from two independent transgenic lines confirmed accurate and faithful splicing of αcMHC exon II to hRen exon VII, thus bypassing the internal fusion exon. All other exons (αcMHC exons I and II and hRen exons VII, VIII and IX) were appropriately spliced. These results are consistent with the exon definition hypothesis, which states that internal exons have a size limitation. Moreover, the results demonstrate that transgenes present in the genome at independent insertion sites and in either a single copy or multiple copies can be subject to exon skipping. The implications for transgene design will be discussed.
Transgenic mice have been and continue to be important tools for the analysis of gene regulation and developmental expression, and as models for studying human disease. Although the most essential step in the generation of a transgenic model is the design of the transgene construct, by and large there are no concrete rules governing this process (reviewed in 1). The minimal requirements for all transgene constructs include: (i) a transcriptional initiation sequence composed of a promoter and associated regulatory sequences; (ii) the protein coding region of the gene of interest; and (iii) a polyadenylation sequence for appropriate 3' end processing. In addition, studies have demonstrated that constructs which contain introns are expressed more reproducibly and at a higher level than constructs lacking introns (2-4). This reflects either a requirement for splicing for the normal maturation, transport and stability of an mRNA or the presence of important transcriptional regulatory sequences within introns. These hypotheses are supported by data demonstrating that (i) a heterologous generic intron can increase the transcriptional efficiency of a human histone H4 promoter-chloramphenicol acetyltransferase fusion transgene (2); (ii) expression of the rat growth hormone gene under the control of the mouse metallothionein promoter requires endogenous growth hormone introns in their normal position or a heterologous intron between the promoter and growth hormone gene (3); (iii) expression of the p53 gene under the control of the SV40 promoter/enhancer requires endogenous p53 introns (5); and (iv) enhancers or silencers within introns of numerous genes are necessary for appropriate transcriptional regulation of many promoters in transgenic mice (4,6,7).
Transgene constructs derived from genomic clones that include 5' and 3' flanking DNA and all associated exons and introns of a gene are often reliably expressed in multiple independent transgenic lines if sufficient 5' flanking DNA is employed in the construct. Nevertheless, transgene expression among lines, which differ in integration site and copy number, can often be widely variable and disproportional to transgene copy number. Exceptions to this have been noted in transgenic mice containing large genomic constructs derived from cosmids, P1 clones, or yeast artificial chromosomes (8-10). However, in contrast to genomic transgenes, the level and reproducibility of expression among lines becomes less predictable when fusion transgenes consisting of the promoter from one gene and coding region of a heterologous gene are used. These types of transgenes are often generated when one wishes to confer a highly restricted tissue-specific, cell-specific and/or developmental-stage specific expression profile on a protein of interest. Often the heterologous gene is encoded in the form of a cDNA lacking all intron sequences. As such, the simple fusion of the promoter and cDNA does not always reproducibly result in appropriate tissue-specific or high level expression in transgenic mice even if the promoter was previously shown to direct appropriate expression of a genomic construct in mice (4,11). This may be due to important transcriptional or other maturation functions provided by intron sequences. The addition of a generic cassette containing the donor and acceptor sites of a heterologous intron followed by a polyadenylation signal has been used effectively to enhance the expression of some transgenes (2) but not others (3).
Taken together, these observations suggest that the accurate targeting of a reporter gene or protein to a specific cell or tissue type may require the construction of a complex chimeric transgene consisting of (i) a promoter, including 5' flanking DNA, the transcriptional initiation site, and a portion of the exon/intron region downstream of the promoter which contains regulatory elements necessary for promoter function, (ii) one or more exons (and introns) of the heterologous gene encoding the desired protein, and (iii) a polyadenylation signal either provided by the 3' flanking DNA of the heterologous gene or as an added cassette. Unfortunately, the increased complexity of such chimeric constructs may make the expression of transgenes of varying design less predictable.
We generated such a chimeric construct containing the α-cardiac myosin heavy chain (α-cMHC) promoter and the human renin (hRen) gene coding sequence in order to specifically target the expression of hRen to cardiac myocytes. In designing this construct we generated a large artificial internal exon within the construct which fuses α-cMHC sequences to hRen coding sequences. This exon is surrounded by an upstream intron from the α-cMHC gene and a downstream intron from the hRen gene. Herein, we demonstrate high level tissue-specific expression but inappropriate splicing resulting in exon skipping of the chimeric exon. Because of its size (909 bp) this chimeric exon potentially violates the 'exon definition' hypothesis which states that exon sequences may be the recognition unit of splice site selection and that the presence of internal exons >300-400 nucleotides (nt) in length could lead to either exon skipping during splicing or the recognition and use of cryptic splice sites (reviewed in 12). Our results suggest that the size of internal exons should be considered as an additional factor when designing chimeric constructs for use in transgenic animals.
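As an illustration only, the size criterion described above can be written as a simple check. The following is a minimal sketch, not part of the original study: the names `InternalExon` and `size_risk` are hypothetical, and the two cut-offs simply restate the ">300-400 nt" range quoted in the text rather than any precisely defined biological limit.

```python
from dataclasses import dataclass

# Assumption: the two bounds restate the ">300-400 nt" range quoted in the text.
LOWER_LIMIT_NT = 300
UPPER_LIMIT_NT = 400


@dataclass(frozen=True)
class InternalExon:
    """Hypothetical record for an internal exon of a planned construct."""
    name: str
    length_nt: int


def size_risk(exon: InternalExon) -> str:
    """Classify an internal exon against the quoted 300-400 nt range."""
    if exon.length_nt > UPPER_LIMIT_NT:
        return "exceeds range"
    if exon.length_nt > LOWER_LIMIT_NT:
        return "borderline"
    return "within range"


# The 909 bp chimeric exon described in the text.
fusion_exon = InternalExon("chimeric fusion exon", 909)
print(fusion_exon.name, "->", size_risk(fusion_exon))
```

Such a check could be run over every internal exon of a candidate design before cloning; it says nothing about splice site strength or other sequence features that also influence exon recognition.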
The αcMHC promoter segment (13) was cloned (generous gift of Jeffrey Robbins, University of Cincinnati) as an XhoI to HindIII fragment into pBluescript II SK- (Stratagene, La Jolla, CA) to form the plasmid pMHC-1. This construct contains 4.5 kb of αcMHC 5' flanking DNA and a 1071 bp segment containing exons I, II and 9 bp of exon III along with their intervening intron sequences. Initiation of αcMHC translation normally starts further downstream within exon III. The HindIII site is the terminal restriction site in an 80 bp polylinker segment (NotI, KpnI, ApaI, SalI, XbaI, SacI, PstI, EcoRV, NcoI, HindIII) which was carried along with the αcMHC DNA.
The hRen cDNA/genomic chimera was generated by first cloning a HindIII to EcoRI fragment from the plasmid pRhR1100 (generous gift of Tim Reudelhuber, Clinical Research Institute of Montreal, Montreal, Canada) into pBluescript II SK- to form the plasmid phRen[I-V]. This fragment contains a HindIII linker placed at the -9 position relative to the translational start site for hRen. The presence of an EcoRI site within exon VI (first six bases) facilitated the construction. The hRen genomic segment was isolated from a previously described genomic clone (14) as an EcoRI to BglII fragment and cloned into the EcoRI and BamHI sites of phRen[I-V] to form the plasmid phRen[I-V-IX]. This chimeric hRen gene was then cloned as a HindIII to SpeI fragment into pMHC-1. All cloning junctions were confirmed by sequencing. The presence of the splice acceptor site at the 3' end of αcMHC intron 2 and of the splice donor site at the 5' end of hRen intron 6 was also confirmed by sequencing.
The αcMHC-hRen transgene was purified for microinjection by digestion with XhoI and SpeI followed by gel electrophoresis and purification of the transgene DNA away from the prokaryotic vector DNA. The transgene was purified using a SpinBind column (FMC BioProducts, Rockland, Maine) and the DNA concentration was estimated by agarose gel electrophoresis using standards of similar molecular weight. The purified transgene DNA was diluted to 2 μg/ml in 10 mmol/l Tris-HCl pH 7.5, 0.1 mmol/l EDTA made with embryo culture certified water
(Sigma) and the concentration was confirmed using a DNA Dipstick (Invitrogen,
San Diego, CA). All mice were fed standard mouse chow and water ad libitum.
Care of the mice used in the experiments met or exceeded the standards set
forth by the National Institutes of Health in their guidelines for the care and
use of experimental animals. All procedures were approved by the University
Animal Care and Use Committee at the University of Iowa.
Transgene DNA was microinjected into the male pronucleus of 1-cell fertilized mouse embryos derived from [C57BL/6J × SJL/J]F2 (B6SJL) as previously described (15,16). Transgenic founders were detected by PCR amplification of DNA isolated from tail biopsy samples, using primer set 4 (Table 1), as previously described (15). An 893 bp band was diagnostic of the presence of the transgene. Transgenic offspring were obtained by breeding founder transgenic mice to nontransgenic B6SJL/J or by backcross breeding to C57BL/6J mice. All mice were obtained from the Jackson Laboratory (Bar Harbor, Maine). Transgenic offspring were differentiated from their non-transgenic littermates by PCR using the same primer set.
Dot blots were performed by first denaturing 2.0 μg of genomic DNA isolated from tail biopsies with 0.4 N NaOH, neutralizing the solution and then applying the samples to nylon-supported nitrocellulose using a dot blot manifold. Blots were crosslinked with ultraviolet light and hybridized under standard conditions (17) with an [α-32P]dCTP random primer labeled complete hRen cDNA probe. Blots were washed stringently (0.1× SSC, 0.1% SDS at 65°C) and exposed to X-ray film overnight.
Total tissue RNAs were isolated by homogenization in guanidine isothiocyanate followed by phenol emulsion extraction at pH 4.0 using a modification of the method previously described (18,19). Homogenizations were scaled up to 2.5 ml to increase RNA yield and quality.
Twenty micrograms of total tissue RNA was separated on 1.5% agarose formaldehyde gels and transferred to nylon-supported nitrocellulose as previously described (17). Blots were hybridized to either a 5' hRen cDNA or 3' hRen cDNA probe (Fig. 1), each of which was labeled by generating a single-stranded antisense RNA. The 3' hRen probe has been described previously (14). The 5' cDNA probe was cloned as a HindIII to EcoRI fragment into pGEM-3 (Promega, Madison, WI) and was derived from exons I-V of the hRen cDNA. To ensure the specific detection of hRen transcripts, northern blots were treated with 1.0 μg/ml RNase A in 2× standard saline citrate for 15 min at room temperature. We have previously demonstrated that this procedure removes non-specific hybridization of single-stranded RNA probes (14).
For reverse transcriptase polymerase chain reaction (RT-PCR), 10 μg of total heart RNA was treated with 2.0 U RQ1 DNase (Promega) in 40 mM Tris-HCl pH 7.9, 10 mM NaCl, 6 mM MgCl2, 10 mM CaCl2 and 200 U RNasin RNase inhibitor (Promega) for 30 min at 37°C. The RNA was extracted twice with phenol-chloroform-isoamyl alcohol (25:24:1, v/v/v) and once with chloroform-isoamyl alcohol (24:1, v/v) followed by ethanol precipitation to inactivate the DNase prior to RT-PCR. RT-PCR was performed by mixing 1.0 μg of total heart RNA in reverse transcriptase buffer containing 1.0 mM nucleotide triphosphates, 10 mM dithiothreitol, 200 U RNasin, 200 ng random hexanucleotides and 200 U MMLV reverse transcriptase in a 20 μl volume. Reverse transcriptase was left out of control reactions. The reactions were incubated at 25°C for 10 min, 42°C for 1 h and 95°C for 5 min, and then were immediately placed on ice. One to five microliters of the RT reaction was used for PCR amplification. For PCR, the reactions were scaled to 100 μl containing 1× Taq buffer, 2.5 mM MgCl2, 200 μM nucleotide triphosphates, 20.0 pmol of forward and reverse primers (see Fig. 1) and 5.0 U Taq polymerase (Boehringer Mannheim or Perkin-Elmer). Amplification was carried out under oil in a Perkin-Elmer 480 or MJ Research PTC-100 thermal cycler using the following thermal profile: 94°C for 1 min, 55°C for 1 min and 72°C for 2 min for 30 cycles. The 94°C incubation was increased to 5 min for the first cycle and the extension was increased to 5 min for the last cycle. PCR products were visualized by ethidium bromide staining after agarose gel electrophoresis and were cloned into the pCRII vector using the TA cloning kit and methodology recommended by the vendor (Invitrogen). DNA sequencing was performed either using the Sequenase Kit (United States Biochemical) or by automated sequencing at the University of Iowa DNA Core Facility.
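For planning purposes, the programmed hold time of the PCR thermal profile given above can be tallied as follows. This is a minimal sketch, assuming only the step durations stated in the text; the function name `programmed_minutes` is hypothetical, and ramp times between temperatures (which depend on the instrument) are deliberately ignored.

```python
# Step durations in minutes, as stated in the text.
DENATURE_MIN = 1        # 94 C
ANNEAL_MIN = 1          # 55 C
EXTEND_MIN = 2          # 72 C
FIRST_DENATURE_MIN = 5  # extended 94 C step in the first cycle
LAST_EXTEND_MIN = 5     # extended 72 C step in the last cycle
CYCLES = 30


def programmed_minutes(cycles: int = CYCLES) -> int:
    """Sum the hold times of all cycles, excluding instrument ramp times."""
    total = 0
    for cycle in range(1, cycles + 1):
        denature = FIRST_DENATURE_MIN if cycle == 1 else DENATURE_MIN
        extend = LAST_EXTEND_MIN if cycle == cycles else EXTEND_MIN
        total += denature + ANNEAL_MIN + extend
    return total


print(programmed_minutes(), "min of programmed hold time")
```

The actual run time on either thermal cycler would be longer than this figure because of heating and cooling between steps.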
The primers for the PCR analysis were: M1-CAGAGATTTCTCCAACCC, M2-CACTGTGGTGCCTCGTTCCAG, R3-TGACACTGGTTCGTCCAATG, R4-ATAGCGGAGGGTAGTTCTG, R7-GGGTCATCCACCTTGCTC, R9-CTTTCGGATGAAGGTGGC.
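Basic properties of the primers listed above can be checked with a few lines of code. The sketch below is illustrative and was not part of the original analysis: the helper names are hypothetical, and the melting temperature uses the Wallace rule of thumb (2°C per A/T, 4°C per G/C), which is only a rough approximation for short oligonucleotides and is not the method used to choose the 55°C annealing temperature in the text.

```python
# Primer sequences exactly as listed in the text (5' to 3').
PRIMERS = {
    "M1": "CAGAGATTTCTCCAACCC",
    "M2": "CACTGTGGTGCCTCGTTCCAG",
    "R3": "TGACACTGGTTCGTCCAATG",
    "R4": "ATAGCGGAGGGTAGTTCTG",
    "R7": "GGGTCATCCACCTTGCTC",
    "R9": "CTTTCGGATGAAGGTGGC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature (degrees C) by the Wallace rule of thumb."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(name, len(seq), "nt",
          f"GC={gc_fraction(seq):.2f}",
          f"Tm~{wallace_tm(seq)}C",
          "revcomp:", reverse_complement(seq))
```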
We generated a construct consisting of the αcMHC promoter fused to the hRen gene in an effort to specifically target hRen production to the heart in order to examine the pathophysiological consequences of elevated heart-specific activity of the renin-angiotensin system. When designing this construct we were conscious of the need to include a highly tissue-specific and developmentally-regulated promoter to target expression to the desired tissue and cell types. We used a 5.6 kb segment of the αcMHC gene containing 4.5 kb of its 5' flanking DNA and 1071 bp of contiguous genomic DNA terminating at position 9 of exon III (Fig. 1). This segment has previously been shown to target myocardial-specific expression of a reporter gene in adult animals and to mimic the developmental expression of the endogenous αcMHC gene (13). The hRen gene segment consisted of a cDNA encoding exons I-V fused to a genomic sequence encoding exons VI-IX. This allowed us to reduce the size of the hRen gene, which spans 14 kb on human chromosome 1, to ~3 kb while retaining several spliceable introns and a viable 3' end for poly A addition (Fig. 1). We felt this strategy to be important since the presence of introns has been previously reported to be important for high level expression of transgenes in mice (2-4), and fusion transgenes containing endogenous introns in their normal position within the protein coding portion of the transgene are reproducibly expressed while similar constructs containing cDNAs are not (3,5). Moreover, we previously demonstrated that a hRen genomic construct including all exons, introns and 400 bp of 3' flanking DNA was expressed in a tissue-specific, cell-specific and developmentally regulated fashion in transgenic mice (14,20). This unique gene fusion resulted in the generation of a 909 bp internal exon (exon 3 of the construct, Fig. 1).
The first and arguably most critical step in the development of a novel transgenic model is the design of the transgene construct, as it is the construct which ultimately dictates the overall pattern of tissue-specific, cell-specific, developmentally-regulated and hormonally responsive expression observed in transgenic animals. In theory, transgene design simply requires a choice of which protein should be expressed and the pattern of expression desired. In practice, however, this is limited by the availability of promoters to target appropriate temporal and spatial expression of the transgene, and the need to not only generate a primary transgene transcript but also to process and transport a stable mature transgene mRNA from the nucleus to the cytoplasm. Although examples of appropriate expression of intronless transgenes in mice have been reported (22,23), it is well documented that transgenes which contain either native (3-5) or heterologous (2,3) introns are expressed more reliably than transgenes consisting merely of a cDNA. Therefore, one could reasonably postulate that mRNA splicing may be an important part of the maturation, transport and stability of an mRNA. Interestingly, Brinster et al. (4) demonstrated that the rate of transcription was significantly higher in intron-containing constructs than in those lacking introns, suggesting that the intron plays an additional role in regulating transcriptional activity. Indeed, numerous examples of enhancers and silencers that influence expression of transgenes in mice have been reported in introns (6,7). In addition, introns have been reported to align nucleosomes on a transcribed gene (24), direct nuclear matrix attachment (25), and cause enhanced transcript stability in response to spliceosome assembly (26).
In the experiments described in this report, we attempted to generate a construct which contains as many of the features of the native α-cMHC and hRen genes as possible while eliminating the cumbersome nature of a 14 kb section of genomic DNA containing several large introns and numerous six base restriction sites which can complicate cloning. In doing so, we generated a chimeric construct which retains the essential 5' flanking DNA and exon/intron regions necessary for appropriate function of the α-cMHC promoter, and reduced the 14 kb genomic clone to approximately 3 kb by fusing a cDNA encoding exons I-V with a genomic segment encoding the remainder of the protein and 3' flanking DNA containing the polyadenylation site. Although high level tissue-specific expression of the construct was evident in multiple independent transgenic lines, the mRNA was inappropriately spliced and therefore lacked a majority of the protein coding region.
Vertebrate internal exons (i.e. exons which are not attached to CAP or polyadenylation sites) average 137 bp and are rarely >300-400 bp in length (reviewed in 12). At 909 bp, the chimeric exon described in these studies is therefore well above the average size of internal vertebrate exons and may be outside the size range of exons which are efficiently recognized by the splicing machinery. It is likely that the skipping of this exon is due primarily to its size
because: (i) the acceptor and donor sites on the upstream and downstream
introns, respectively, are unaltered; and (ii) upstream and downstream exons,
which are well within the size range proposed by the model, are efficiently and
appropriately spliced together. It is unlikely that our observations result
from a defect in the transgene because no gross rearrangements in transgene
structure were detected by PCR in genomic DNA from transgenic founders and
offspring, and because a small proportion of the total transgene mRNA was
appropriately processed.
The mechanism of splice site recognition in eukaryotic genes remains poorly understood. This is largely due to the fact that sequences surrounding vertebrate splice sites are poorly conserved and that introns can vary in size from <100 bases to hundreds of kilobases. Although simple consensus sequences appear at the donor and acceptor termini of all introns, the selection of splice sites must be far more complex in order to explain the faithful generation of fully processed mRNAs from genes which can span great distances of the chromosome. Berget et al. (12) have proposed the exon definition model to explain splice site selection in
vertebrate genes. The model proposes that the exon is the unit of recognition
and that the splicing machinery recognizes a pair of splice sites (an acceptor
site at the 3' end of the upstream intron and a donor site at the 5' end of the downstream intron) surrounding an exon. One prediction
of this model is that the exon would be defined as a sequence of a limited size
between donor and acceptor sites. Mechanistically, the model proposes that
intron splice sites are identified after the splicing machinery, including U1
and U2 snRNPs and 5' and 3' splice site recognition factors, recognizes and binds the
intron/exon junction of internal exons. Intron definition then ensues through
the assembly of a splicing complex with the subsequent joining of upstream and
downstream exons. It remains possible that the limitation on exon size along
with the normal small size of most vertebrate exons reflects a maximum distance
in which splicing factors can interact across an exon, and may have arisen to
prevent the recognition of cryptic splice sites within internal exons which
would disrupt protein coding.
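The size prediction of the exon definition model can be illustrated with a deliberately simplified toy model. This is a sketch under stated assumptions, not a description of the splicing machinery: the names `Exon` and `mature_mrna` are hypothetical, the 400 nt cut-off is taken from the upper end of the range quoted in the text, exon names follow the abstract, and exons whose lengths are not given here are assumed to be recognized, consistent with the statement that the flanking exons are well within the proposed size range.

```python
from typing import List, NamedTuple, Optional


class Exon(NamedTuple):
    name: str
    length_nt: Optional[int]  # None: length not given in this text


# Assumption: upper end of the 300-400 nt range quoted in the text.
MAX_INTERNAL_EXON_NT = 400


def mature_mrna(exons: List[Exon],
                max_internal_nt: int = MAX_INTERNAL_EXON_NT) -> List[str]:
    """Return exon names retained when oversized internal exons are skipped.

    First and last exons are always kept, because the size limit in the
    model applies to internal exons only.
    """
    kept = []
    last_index = len(exons) - 1
    for index, exon in enumerate(exons):
        is_internal = 0 < index < last_index
        too_large = (exon.length_nt is not None
                     and exon.length_nt > max_internal_nt)
        if is_internal and too_large:
            continue  # exon is not defined, so its neighbours are joined
        kept.append(exon.name)
    return kept


# Exon order of the transgene, named as in the abstract.
TRANSGENE = [
    Exon("αcMHC exon I", None),
    Exon("αcMHC exon II", None),
    Exon("fusion exon", 909),
    Exon("hRen exon VII", None),
    Exon("hRen exon VIII", None),
    Exon("hRen exon IX", None),
]

print(" - ".join(mature_mrna(TRANSGENE)))
```

Under these assumptions the toy model joins αcMHC exon II directly to hRen exon VII, which matches the major transcript reported here; it cannot account for the small proportion of correctly spliced mRNA, since real exon recognition is not an all-or-nothing threshold.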
In conclusion, the results described in this report are consistent with the skipping of large internal exons as proposed by the exon definition model. Our results suggest that minimizing the size of internal chimeric exons should be considered in the design of transgenes. Unfortunately, the outcome of any particular experimental transgene design is still difficult to predict a priori.
We would like to thank Drs Andrew Russo and Scott Moye-Rowley for their critical review of the manuscript, and Julie Lang, Norma Sinclair and Lucy Robbins for their superb technical assistance.
This work was funded by grants from the NIH (HL48058-03) and American Heart Association. Curt D. Sigmund is an Established Investigator of the American Heart Association. Robin L. Davisson is funded by an institutional post-doctoral fellowship from the NIH (HL07121-20). Transgenic mice were generated and maintained at the University of Iowa Transgenic Animal Facility which is supported in part by the College of Medicine and
the Diabetes and Endocrinology Research Center. DNA sequencing was performed by
the University of Iowa DNA Core Facility.
+Present address: Department of Pharmacology and Toxicology, University of Oulu, Kajaanintie 52 D, 90220 Oulu, Finland


Whatever the mechanism, it is clear that introns should be considered a critical component when designing transgene constructs. The data reported to date suggest that elements which regulate expression of a gene, such as flanking DNA and promoters, function in concert with sequences present within the coding region and in introns at the level of chromatin, suggesting that the context in which introns are presented can be a crucial feature. Indeed, this is supported by experiments in which heterologous introns are able to rescue expression of an intronless transgene when placed between the promoter and cDNA but not when placed downstream of the cDNA (3). This would suggest that in the generation of chimeric constructs the most desirable design would be to retain as much of the normal genomic structure as possible.
REFERENCES

