
© 1995 Oxford University Press 2547-2561

Alternative poly(A) site selection in complex transcription units: means to an end?

Gretchen Edwalds-Gilbert, Kristen L. Veraldi and Christine Milcarek*

Department of Molecular Genetics and Biochemistry and the Graduate Program in Immunology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261-2072, USA

Received January 28, 1997; Revised and Accepted May 9, 1997

ABSTRACT

Many genes have been described and characterized which result in alternative polyadenylation site use at the 3'-end of their mRNAs based on the cellular environment. In this survey and summary article 95 genes are discussed in which alternative polyadenylation is a consequence of tandem arrays of poly(A) signals within a single 3'-untranslated region. An additional 31 genes are described in which polyadenylation at a promoter-proximal site competes with a splicing reaction to influence expression of multiple mRNAs. Some have a composite internal/terminal exon which can be differentially processed. Others contain alternative 3'-terminal exons, the first of which can be skipped in some cells. In some cases the mRNAs formed from these three classes of genes are differentially processed from the primary transcript during the cell cycle or in a tissue-specific or developmentally specific pattern. Immunoglobulin heavy chain genes have composite exons; regulated production of two different Ig mRNAs has been shown to involve B cell stage-specific changes in trans-acting factors involved in formation of the active polyadenylation complex. Changes in the activity of some of these same factors occur during viral infection and take-over of the cellular machinery, suggesting the potential applicability of at least some aspects of the Ig model. The differential expression of a number of genes that undergo alternative poly(A) site choice or polyadenylation/splicing competition could be regulated at the level of amounts and activities of either generic or tissue-specific polyadenylation factors and/or splicing factors.

INTRODUCTION

In the nuclei of eukaryotic cells, precursor RNAs made by RNA polymerase II undergo a series of post-transcriptional processing events to produce mature mRNAs, which are then exported to the cytoplasm. These modifications generally include methylation of the 2' hydroxyl group of the ribose sugar(s) near the co-transcriptionally added cap, splicing to remove introns and cleavage with subsequent polyadenylation (addition of ~200 adenosines) at the 3'-end; internal base editing and methylation of the 6 position of adenosines also occur with some RNAs. The precursor RNAs for most genes are not processed efficiently and instead are rapidly degraded in the nucleus; the bulk of the RNAs never reach the cytoplasm. Therefore, small changes in overall RNA processing efficiency in a particular cell, or in the effective strength of a particular splicing or polyadenylation site, can serve as an important control point for gene expression in a tissue- or developmental stage-specific manner. A functional polyadenylation signal is required for transcription termination by RNA polymerase II (1-3); transport of the message from nucleus to cytoplasm is dependent on polyadenylation and splicing (4-7) and these processes are apparently coupled through the C-terminal domain of RNA polymerase (8). The more efficient a poly(A) site is at processing in vitro, the more efficient it is at generating termination-competent RNA polymerase II elongation complexes and mature RNA (9). Poly(A) site strength can directly influence the amount of cytoplasmic RNA produced from a transcript (10); therefore, changing polyadenylation efficiencies can have a profound effect on the amount and nature of a gene product. In the cytoplasm, the poly(A) tail on the message plays a role in stability and translatability (11-13) and protects RNA from degradation by preventing association with the degradation machinery (14).
While initiation of RNA polymerase II transcripts is an important starting point in gene expression, the job is not done until the poly(A) tail is added and the mRNA is exported and translated in the cytoplasm. Transcription ends well only when the message ends well; therefore, polyadenylation is an important means to the end of the message.

Splicing factors can be categorized into three broad groups based on activity: constitutive factors, which are involved in the generic splicing reaction; positive regulators, which improve recognition of weak, alternatively used splice sites; and negative regulators, which antagonize the positive regulators. A recent review detailed a variety of splicing options which can be exercised by tissue-specific changes in the amounts of the various splicing factors as well as the sequences at the splice sites and their positioning within the pre-mRNA (15). Detailed mechanistic studies of the events surrounding the polyadenylation/cleavage reaction are just emerging. This review examines the large number of genes described to date in which the regulation of polyadenylation may play an important role.


Figure 1. The 3' portions of genes leading to potential alternative poly(A) site selection. Three motifs of exon arrangement can produce alternative poly(A) site selection depending on the cellular environment; they are diagrammed here for comparison. All black exons have typical 3' and 5' splice sites. Shaded or white exons are described below and in the text. (A) A large number of genes have multiple poly(A) sites within the 3'-UTR in the shaded box, arranged one behind the other, i.e. in tandem. Only two tandem sites, pA1 and pA2, are shown for simplicity, but more may be present. The site of the termination codon of the protein in the 3'-terminal exon is indicated by the vertical bar and the asterisk. Genes with this structure are listed in Table 1. Some genes have been shown or suggested to have a regulatory element (R.E.) between the poly(A) sites which can influence the stability or translatability of the longer mRNA (see Table 1). (B) Another set of genes have an exon (white in the diagram) which is a composite of 3' and 5' splice sites followed closely by a poly(A) site; these composite exons can serve as internal or terminal exons (i.e. in/terminal) based on circumstances; the potential 5' splice site and the termination codon of the protein in the composite 3'-terminal exon are indicated by the vertical bar and the asterisk. The alternatively processed mRNA encodes a different protein terminus (*') from use of 3'-UTR-2. Genes with a similar organization are listed in Table 2. As discussed in the text, in the Ig heavy chain genes the poly(A) site in the composite exon (pA1) encodes the C-terminus of the secreted form (*), while the 3'-terminal exon downstream contains the second poly(A) site (pA2) and encodes the C-terminus of the membrane-bound form of Ig (*'). (C) The genes described in Table 3 have two or more alternative 3'-terminal exons encoding different C-termini (* or *'). They are processed into different mRNAs either by using the first 3'-terminal exon and its poly(A) site (pA1) or by skipping over that exon entirely and using the second 3'-terminal exon with its poly(A) site (pA2) in 3'-UTR-2. As discussed in the text, calcitonin protein is encoded by mRNA using pA1, while CGRP is encoded by mRNAs which have pA2 at the 3'-end. Female flies use the pA1 site in the dsx gene, while males use pA2.

Three sequence elements determine the precise site of 3'-end cleavage and polyadenylation in mammalian pre-mRNAs. The highly conserved poly(A) signal, the hexanucleotide AAUAAA, is present in the 3'-untranslated region (3'-UTR) near the mature mRNA end in many genes (~80%), while AUUAAA is found less frequently. Other less conserved AU- or A-rich sequences have been observed near the 3'-end of mature mRNA in a smaller fraction of cases and function less well in vitro than AAUAAA (16). Cleavage occurs at the poly(A) addition site ~11-23 nt downstream of the hexanucleotide. The third element is a GU- or U-rich region, usually 10-30 bases downstream of the cleavage site (17-20). The AAUAAA or some variant of it, the downstream region and their relative positions define the approximate site at which cleavage will occur for most poly(A) sites (21); secondary structure can shift the site of poly(A) tail addition slightly (22). In some viral genes there is an additional element upstream of the hexanucleotide which can aid in efficient poly(A) site recognition (23-30); the U1 snRNP-specific protein U1A can activate SV40 late polyadenylation by interaction with a sequence element upstream of that poly(A) site and the polyadenylation factor CPSF (31,32). Upstream elements have been identified in a few eukaryotic genes, such as the complement C2 gene (33) and the immunoglobulin (Ig) [gamma]2a membrane-specific poly(A) site (34,35), although the mechanisms of their action and the factors involved are less clear.
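The positional rules in the preceding paragraph can be illustrated with a minimal Python sketch. It only scans a sequence for the two hexanucleotides named in the text (AAUAAA and AUUAAA) and reports the window ~11-23 nt downstream in which cleavage would be expected. The function name, the dictionary layout and the choice to measure the distance from the 3' end of the hexanucleotide are assumptions made for illustration; the sketch does not model the GU- or U-rich downstream element, upstream elements or secondary structure, and it is not a validated poly(A) site predictor.

```python
import re

# Hexanucleotide signals named in the text: canonical AAUAAA and the rarer AUUAAA.
SIGNALS = ("AAUAAA", "AUUAAA")


def find_candidate_poly_a_sites(sequence, min_gap=11, max_gap=23):
    """Return candidate poly(A) signals and their expected cleavage windows.

    `sequence` may be RNA or DNA (T is converted to U). Positions are
    0-based. The window is measured from the 3' end of the hexanucleotide,
    which is an assumption of this sketch; a window may extend past the end
    of a short input sequence.
    """
    rna = sequence.upper().replace("T", "U")
    hits = []
    for signal in SIGNALS:
        # A zero-width lookahead reports every occurrence, including overlapping ones.
        for match in re.finditer("(?=" + signal + ")", rna):
            signal_end = match.start() + len(signal)
            hits.append({
                "signal": signal,
                "position": match.start(),
                "cleavage_window": (signal_end + min_gap, signal_end + max_gap),
            })
    # Report hits in 5'-to-3' order so tandem sites read as proximal, then distal.
    hits.sort(key=lambda hit: hit["position"])
    return hits
```

Applied to a 3'-UTR with tandem signals, the returned list is ordered from the promoter-proximal to the distal site, which mirrors the pA1/pA2 arrangement of Figure 1A.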

The biochemistry of pre-mRNA cleavage and polyadenylation has been well characterized and reviewed recently (36-39). Protein factors required for accurate cleavage and polyadenylation have been isolated from HeLa cell nuclear extracts and calf thymus. These factors include: cleavage and polyadenylation specificity factor (CPSF) (40-42), cleavage stimulatory factor (CstF) (43), poly(A) polymerase (44-47), the phosphorylation and activity of which varies during the cell cycle (48), cleavage factors Im and IIm (49,50) and poly(A) binding protein II, which binds to the growing poly(A) tail (51). These factors interact to form a complex on the precursor RNA prior to the cleavage and polyadenylation reactions. The key factors are conserved between yeast and mammals; they are, in fact, better conserved than the cis-acting elements (reviewed in 52).

CPSF is a multisubunit factor that recognizes the poly(A) signal sequence AAUAAA (43,53,54) and is required for both cleavage and polyadenylation (40,42,43). CstF is required for the cleavage reaction (55) and stabilizes interaction of AAUAAA with CPSF via protein-protein interactions and protein-RNA interactions with the region downstream of the poly(A) site. Sites which interact strongly with CstF in vitro are strong sites in vivo (55,56). The 64 kDa subunit of human CstF contains an RNA-binding domain which can be crosslinked to RNAs containing GU- or U-rich regions downstream of the poly(A) site in the presence of CPSF, AAUAAA and the other subunits of CstF (57,58). The 50 kDa subunit of human CstF exhibits regions of identity and similarity with mammalian G protein [beta] subunits and has seven copies of a characteristic transducin repeat motif (59). The 77 kDa subunit of CstF bridges the 64 and 50 kDa subunits and contains the putative nuclear localization signal most likely responsible for transporting the 50:77:64 trimer into the nucleus, where CstF functions (60). The 77 kDa subunit of CstF shares extensive homology with the Drosophila modifier suppressor of forked, su(f); mutations of su(f) change the relative utilization of poly(A) sites in genes with inserted transposable elements. The strong homology suggests that human CstF may also regulate expression of specific poly(A) sites.

U1 snRNP and its A protein appear to play central roles in the coupling of splicing and polyadenylation (61). It is well established that U1 snRNP is involved in 5' splice site recognition and definition of internal exons (62). The A protein of U1 can interact with the 160 kDa subunit of CPSF and activate polyadenylation (32). The A protein of U1 can also bind to its own pre-mRNA and inhibit poly(A) polymerase by interaction with the C-terminus of the polymerase, thereby controlling U1A production (61,63). Furthermore, binding of U1 to 3'-terminal exons has been observed and a correlation between U1 binding and polyadenylation activity was established (64). These observations are consistent with an involvement of U1 in the definition of 3'-terminal exons and coordination of splicing and cleavage/polyadenylation. U1 snRNP therefore plays a pivotal role in precursor RNA processing and subtle changes in its amount, characteristics or factors with which it interacts could profoundly influence alternative poly(A) site choice.

GENES WITH TANDEM TERMINAL POLY(A) SITES, A LARGE AND GROWING FAMILY

Although the majority of eukaryotic gene transcription units possess a single polyadenylation signal, numerous examples of transcription units with multiple poly(A) sites, all within a single 3'-terminal exon, have been described over the past several years; we have compiled these in Table 1 and contrast them in Figure 1 A-C with the other types of genes whose messages could have multiple poly(A) sites. We have included in the tables only those genes for which there is solid evidence for more than one RNA species, i.e. Northern blots or nuclease protection assays; not included are a large number of other genes which have been sequenced and suggested, but not directly shown, to encode more than one mRNA because of several potential poly(A) signals in the 3'-end. Space limitations prohibit us from including a complete bibliography on each gene listed, so we have chosen to include either the first or the most comprehensive reference for each entry in the table.

Of the genes listed in Table 1 having multiple, tandem poly(A) sites in a 3'-terminal exon, many have not been examined in enough different tissues and cell types to determine if there is differential processing of the poly(A) sites. Use of these multiple sites may be regulated or may instead reflect random use of signals with varying inherent strengths. Polyadenylation may be important enough to nuclear processing and export to warrant two or more chances at recognition by the cellular machinery.

There are at least 33 genes listed in Table 1 which show changes in the distribution of 3'-ends of mRNAs produced based on the time in development, growth state of the cell or tissue in which they are expressed, with testes being a hotspot for differential poly(A) site use. These genes with differential expression are indicated within the Notable features column. Dihydrofolate reductase (DHFR) is an extreme example of multiple poly(A) sites, with seven spread over 5 kb of sequence. Transcription proceeds through all seven poly(A) sites and termination occurs 1 kb downstream of the last one (65), indicating that the multiple forms of mRNA arise by processing and not by transcription termination between some of the sites. S1 protection analyses of steady-state DHFR mRNAs from growing versus resting cells show a different distribution of 3'-ends in the two stages (66), although it is not known if these are a consequence of cell cycle-specific differences in stability or processing. The same question of nuclear processing versus differential cytoplasmic stability arises with some of the examples in the tables, although for others the experiments addressing this question have been done and the results noted.

Since the multiple forms of mRNA in Table 1 generally differ only at the 3'-end and not in the coding regions, it may not be obvious how differential poly(A) site use could influence protein expression. However, if the different forms of mRNA have different stabilities or translatability, then use of alternative poly(A) sites can increase or decrease the final amount of protein product per unit precursor RNA transcribed. An interesting example of regulation of a gene with tandem poly(A) sites, listed in Table 1, is the gene for eukaryotic initiation factor 2[alpha] (eIF-2[alpha]), a key factor for protein synthesis, in which there are multiple poly(A) sites in the 3'-UTR encoding two fairly common mRNAs of 1.6 and 4.2 kb (67). While both messages are on polyribosomes, the 1.6 kb message is less stable than the 4.2 kb mRNA species in T cells, but is more readily translated in vitro. Shifting to expression of the shorter message could increase the amount of protein produced. The ratios of the 1.6 to 4.2 kb species ranged from 1:1 in brain and skeletal muscle to 10:1 in placenta, liver and pancreas. T cells activated with ionomycin and PMA leave G0 and enter S phase; shortly following this treatment the 1.6 kb mRNA increased in abundance 11.5-fold, while the 4.2 kb mRNA increased only 4-fold, indicating that the poly(A) site leading to 1.6 kb mRNA is favored in S phase.
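The arithmetic behind the last sentence can be made explicit with a short Python sketch. If the 1.6 kb mRNA rises 11.5-fold while the 4.2 kb mRNA rises 4-fold, the ratio of short to long mRNA changes by 11.5/4, i.e. ~2.9-fold, whatever the starting ratio was. The function names are hypothetical, and the starting ratio of 1:1 used below is an illustrative assumption only; the ratio in resting T cells is not given in the text.

```python
def ratio_fold_change(short_fold, long_fold):
    """Factor by which the short:long mRNA ratio changes when the two
    species increase by `short_fold` and `long_fold`, respectively."""
    if short_fold <= 0 or long_fold <= 0:
        raise ValueError("fold changes must be positive")
    return short_fold / long_fold


def ratio_after_activation(start_ratio, short_fold, long_fold):
    """Short:long ratio after activation, given a (hypothetical) starting ratio."""
    return start_ratio * ratio_fold_change(short_fold, long_fold)


# Fold increases reported in the text for activated T cells.
shift = ratio_fold_change(11.5, 4.0)

# Hypothetical starting ratio of 1:1, chosen only to illustrate the calculation.
example_ratio = ratio_after_activation(1.0, 11.5, 4.0)
```

Because the starting ratio cancels out of the fold change, the shift toward the promoter-proximal site can be stated without knowing the absolute abundance of either mRNA.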

It was recognized many years ago that treatment of resting T cells with agents which caused them to proceed from G0 to S phase caused increases in polyadenylation enzyme activity (68-70) and that an increased rate of polyadenylation of mRNA was a rapid response to entry into S phase (71). The recent observation of changes in poly(A) polymerase activity during the cell cycle (48) and increases in CstF-64 amount following B cell stimulation (72) indicate that changing the site of poly(A) addition on a given transcript may be a response to the cellular environment through activation of the polyadenylation machinery. Increases in polyadenylation activity favor use of the first poly(A) site in the eIF-2[alpha] primary transcript, ultimately producing more protein per primary transcript, both from more efficient nuclear RNA processing and transport and from more efficient cytoplasmic translation of the 1.6 kb mRNA.

A third poly(A) site for the eIF-2[alpha] gene is used only in testes to produce a 1.7 kb mRNA, indicating that in the testes a different distribution of polyadenylation factors may be recognizing different cis-acting elements in the primary transcript. The overall pattern of processing in the eIF-2[alpha] gene is conserved between mouse and man and is presumably important for its regulation. As indicated in Table 1, seven genes clearly show a pattern of differential stability or translatability of the various mRNA products, including the cationic amino acid transporter gene, cyclic AMP-responsive element modulator, cyclooxygenase-2, eIF-2[alpha], histone H1°, splicing factor PR264/SC35 and vascular endothelial growth factor. Fourteen more genes have been suggested to contain potential regulatory elements between the poly(A) signals, but the mRNA half-lives were not measured directly. Several other examples where an instability element was postulated but the mRNAs were subsequently shown not to be differentially stable are also indicated in Table 1.

Table 1 . Genes with alternative tandem poly(A) sites in the 3'-UTR
Gene

 

Regulatory
elementa

Notable features
Where seen

 
Comments
Reference

 

23 kDa Transplantation antigen

 

Brain, retina

From the P198 gene, which is highly conserved [including the poly(A) sites]across mammalian species; two poly(A) sites 147
[alpha]-Galactosidase A

 

 

Mouse; three poly(A) sites, two mRNAs 148

Acetylcholinesterase

 

Muscle, brain

Mouse, human; two poly(A) sites; second site predominates in muscle, first site predominates in brain 149,150

Activin [beta]A subunit

 

TPA treatment

Human; eight possible poly(A) sites; treatment of HT1080 fibrosarcoma cells with TPA causes a shift over time to use of proximal poly(A) site 151,152

ADP ribosylation factor (ARF) 3 (ARF 4)

Testes

ARF 1 has two poly(A) sites conserved in human and rat.ARF 4 makes a short, testes-specific mRNA generated by alternative polyadenylation 154,155

Aldolase B

 

 

Mouse; one non-canonical and three canonical poly(A) sites, use of all four sites detectable in liver and kidney 156

Amphiglycan (syndecan 4, ryudocan)

 

Chondrocytes

At least two poly(A) signals; longer message is ubiquitous, shorter is tissue-specific; switch in poly(A) site use during chondrocyte differentiation 157,158

Amyloid protein

2, 4

 

Sequence between two poly(A) sites increases translation of the longer mRNA 159

Androgen receptor

 

 

Human; two poly(A) sites, the first is AUUAAA and the second is CAUAAA 160

Angiotensin converting enzyme (ACE)

 

Testes, pulmonary tissue

Rabbit; testes- versus pulmonary-specific forms 161

Ankyrin-1

 

Brain

Mouse; both poly(A) sites used in erythroid tissues, distal site used in cerebellum 162

Apolipoprotein B

 

Intestine

Putative cryptic poly(A) site improved by editing 163

Arylsulfatase A

 

 

Mutation of first poly(A) signal seen in arylsulfatase A pseudodeficiency 164

Axonin-1   Retina, brain Chicken; three poly(A) sites 165
[beta]-Tubulin

 

HSV infection

Changes in the ratio of the two forms occur during HSV infection 166, 167
[beta]2-Microglobulin     Murine; two poly(A) sites 168
[beta]3-Adrenergic receptor     Human; rat; two poly(A) sites 169
Band 7.2b gene

 

Many cell types

Human; integral membrane phosphoprotein; distal site predominates in all tissues, proximal site use is significant in lung, liver and kidney and minimal in spleen 170

Brain-derived neurotrophic factor (BDNF)

 

Heart, lung, brain

Rat; isoform production controlled by alternative splicing; multiple promoters used; two poly(A) sites, ratio of proximal: distal site use varies among heart, lung, cerebral cortex 171

Cationic amino acid transporter gene (cat-1)

1

Cell density

Rat; relative concentration of two mRNAs is regulated by cell density 172

c-Mos

3

 

Porcine; protooncogene whose expression is restricted to gonadal tissues in the pig; alternative polyadenylation may play a role in translation 173

CD40

3

Differentiation

murine; differential poly(A) site use during B lymphocyte activation 111

CD59 (membrane inhibitor of reactive lysis)

3

Many cell types

Human, complement regulatory protein; four possible poly(A) sites; use of two poly(A) sites varies in different cell lines 174,175

Chymotrypsin-like protease

 

 

Human chromosome 16q22.1; alternative polyadenylation creates transcription unit which overlaps with oppositely oriented gene 176

Clathrin heavy chain gene

3

Developmental changes

Mosquito; poly(A) site use differs between somatic cells and germ cells 177

Collagenase 3

 

 

Human; three mRNAs seen in mammary carcinoma cells 178

cAMP-responsive element modulator (CREM)

1

Testes

Follicle-stimulating hormone regulates CREM expression in testes by changing poly(A) site use, causing an increase in mRNA stability 179

Cyclin D1

 

Developmental changes

Human, mouse, zebrafish; two poly(A) sites; change in poly(A) site use during zebrafish embryonic development; one major and two minor forms found in HeLa and all hematopoetic cells tested 180-182

Table 1. (Continued)
Cyclooxygenase-1 (COX-1)  

 

Human; two poly(A) sites

183

Cyclooxygenase-2 (COX-2)

1

Dexamethasone treatment

Expression induced by cytokines; three poly(A) sites; dexamethasone treatment selectively destabilizes longer mRNA 184
Cytochrome P450 aromatase

 

Many cell types

Human; two poly(A) sites, [second poly(A) signal AUUAAA]; mouse,porcine, equine; two poly(A) sites, 2.5 kb mRNA predominant in ovaries 185-188
Cytochrome P450-linked ferredoxin  

 

Mouse; two poly(A) sites

189

Dihydrofolate reductase (DHFR)

 

Cell cycle

Seven poly(A) sites; promoter-proximal site used during growth stimulation 66, 190

Dipeptidyl peptidase IV (CD26)

 

 

Mouse; two poly(A) sites in exon 26, proximal poly(A) site predominates in all tissues examined 191

DNA polymerase [beta]

3

Testes; brain

Rat; the 1.4 kb transcript predominates in testes and has a poly(A) signal AAUGAA; 4 kb transcript predominates in brain 192

eIF-2[alpha] (translation initiation factor 2[alpha])

1, 4

Testes

Two poly(A) sites; different ratio in different tissues; the longer mRNA is more stable in activated T cells; the shorter mRNA has increased translatablilty; third poly(A) site used in testes 67

eIF-4E (translation initiation factor 4E)

 

Many cell types

Mouse; multiple poly(A) signals; 1.8 kb transcript predominates in mouse kidney and liver and in a pre-B cell line, S194. A 1.5 kb transcript is abundant in mouse thymus and in S194 cells. Minor mRNAs of 2.2 and 2.5 kb correspond to use of alternate poly(A) signals as well 193

eIF-5 (translation initiation factor 5)

 

Testes

Mammalian; proximal poly(A) site used predominantly in testes, distal site favored in other tissues examined 194

Excision repair gene ERCC6

 

Testes

Human; presumed helicase; two poly(A) signals, first is AUUAAA; shorter mRNA is primarily expressed in testes 195

Fanconi anemia group C (FACC)

3

 

Human; three poly(A) sites, longest transcript is most abundant and its poly(A) signal is AAUAAA; first two poly(A) signals are non-canonical; longest transcript contains a series of direct 35 bp repeats preceded by a 12 bp palindrome 196

Ferritin heavy chain

 

Many cell types

Human; two poly(A) signals; tissue-specific differences in ratio of use (brain, skeletal muscle versus placenta, liver, pancreas) 197

Fibroblast growth factor (int-2)

2

Retinoic acid treatment

Mouse; both mRNAs inducible in F9 cell line by treatment with retinoic acid 198,199

Basic fibroblast growth factor (bFGF) 2

Cell density

Use of two poly(A) sites varies with cell density 200

Fibroglycan (syndecan 2)  

 

Human; at least two functional poly(A) signals 201

FMR1     Fragile X gene; two poly(A) sites 202
G protein [gamma] subunit (D-G [gamma]1)

3

Many cell types

Drosophila; use of three different poly(A) sites is developmentally regulated and cell type specifc: 2.6 kb transcript found in head, 1.3 kb transcript found in body, 1.1 kb transcript more abundant in head than in body 203
Gastric capthesin E

 

 

Human aspartic protease; two poly(A) sites 204

GATA-2

 

 

Transcription factor; two poly(A) sites 205

Grg

 

 

Murine; related to the groucho transcript of the Drosophila Enhancer of split complex 206
Growth hormone receptor, avian  

 

Alternative poly(A) site in exon 5 generates short form, in the absence of alternative splicing, unlike mammalian counterpart 207
Heparan sulfate proteoglycan  

Liver, kidney

Rat; major cell surface heparan sulfate proteoglycan; three poly(A) sites used in most tissues, most proximal site used only in liver and kidney 208
Herpes simplex virus type 1 (HSV-1) UL24  

 

Increased polyadenylation at weak viral sites via effects on host cell CstF 64-kDa 136,140,142
High mobility group 1 protein (HMG1)  

 

Murine; three poly(A) sites

209

Histone H10

1

Butyrate treatment

Mouse; differentiation-specific histone H1; two mRNAs, first poly(A) signal is AUUAAA; minor 0.9 kb mRNA becomes more stable during butyrate-induced dedifferentiation, mRNAs equally stable after treatment with actinomycin D 210
Table 1. (Continued)
Huntington disease gene

 

Brain

Use of distal poly(A) site predominates in brain; most other tissues favor proximal poly(A) site 211
Integrin [alpha]5

 

 

Xenopus laevis; alternative polyadenylation occurs in the embryo 212
Interleukin-8 receptor [alpha]  

 

Human; two mRNAs equally abundant in neutrophils 213

Iron regulatory protein 2 (IRP2)

 

Intracellular iron levels

Human, rat; RNA binding protein whose affinity for its binding site is modulated by intracellular iron levels; increase in proximal poly(A) site use with reciprocal decrease in distal poly(A) site use in iron-depleted cells 214
Ketohexokinase (fructokinase)  

 

Human; two poly(A) sites; second is GAUAAA 215

Lamin B3

 

Testes

Mouse; germ cell (testes) specific RNA processing of lamin B2 generates lamin B3 216
Lipoprotein lipase

2

Many cell types

Human; longer transcript predominates in skeletal and cardiac muscle; adipose tissue produces both forms of mRNA; longer transcript translated more efficiently than short one 217
Long chain acyl-CoA dehydrogenase (ACAD 1)  

Many cell types

Mouse; two poly(A) sites

218

Manganese superoxide dismutase

 

Many cell types

Rat; five poly(A) sites; first two sites used in all tissues tested; proximal poly(A) site predominates in testes and liver, distal site used in heart, lung and kidney 219
Microtubule-associated protein 4 (MAP4)

 

Many cell types

Mouse; 3'-UTR well conserved between mouse and human; first two sites used in all tissues tested; third site used in muscle; fourth site used in testes, but first site predominates 220
Mitochondrial HMG-CoA synthase 3

 

Rat; two poly(A) sites: AUUAAA and AUUAUC 221

N-Formyl peptide receptor (FMLF-R)

 

Dibutryl cAMP treatment

Human; two-exon gene, at least two poly(A) sites; predominant use of proximal poly(A) site after treatment of HL60 human lymphoma cells with the differentiation agent dibutryl cAMP 222
NAD(P)H:quinone oxidoreductase

3

Mitomycin C treatment

Human colon cancer HCT 116 cells; two mRNAs; change in ratio after mitomycin C treatment 223
Non-muscle myosin heavy chain  

 

Human; two poly(A) sites

224

mal-1

 

 

Mouse; novel keratinocyte lipid-binding protein; tumor specific overexpression; two poly(A) sites, use of first one predominates 225
P-selectin glycoprotein ligand  

 

Human, chromosome 12q24; major mRNA species 2.5 kb, minor species 4 kb 226
Paramyosin

 

Developmental changes Drosophila; use of two poly(A) sites is developmentally regulated 227
Phosphofructokinase (PFK)

 

Developmental changes Drosophila; use of three poly(A) sites is developmentally regulated 228
Platelet-derived growth factor (PDGF)  

 

Three poly(A) signals

229,230

PR264/SC35

1

Many cell types

Human splicing factor; ratio of different forms varies among six different cell lines tested

231
rab2

2

Many cell types

Human Ras-related GTP binding protein; three potential poly(A) signals 232
RanGAP1

 

Testes

Human; activator of Ras-related nuclear GTPase Ran, shows testes-specific polyadenylation 233
Renal glutaminase

 

Many cell types

Rat; ratio of poly(A) site use varies in different cell lines 234

RHOA protooncogene

2

 

Human Ras-related GTP binding protein; found in breast cancer cell lines; three poly(A) sites

235
Senescence marker protein-30 (SMP-30)  

 

Rat; two poly(A) sites

236

set, putative oncogene associated with myeloid leukemogenesis  

Many cell types

Human, mouse; ratio of two mRNAs varies in different cell types and five cell lines tested; shorter mRNA predominates in liver and kidney 237

Soluble angiotensin binding protein 3

 

Porcine; two poly(A) sites, first is GAUAAA; longer transcript may be regulated by SINE element in 3'-UTR 238

Splicing factor 9G8

 

Many cell types

Human; two poly(A) sites; pre-mRNA also subjected to alternative splicing 239

Steel

3

 

Murine; encodes stem cell factor (SCF); distal poly(A) site used predominantly; 3'-UTR is 4.4 kb 240
Table 1. (continued)
Suppressor of forked su(f)     Drosophila; three mRNAs 241
Syndecan-1     Mouse; two poly(A) sites 242
Tissue inhibitor of metalloproteinases-2 (TIMP-2) 2   Human; two stable transcripts 243
Tissue inhibitor of metalloproteinases-3 (TIMP-3)   TPA treatment Murine; three transcripts of 2.3, 2.8 and 4.6 kb. 4.6 kb most abundant. All three transcripts induced in pre-neoplastic JB6 cells treated with TPA 244
Transforming growth factor alpha (TGF [alpha]) 3   Human; five possible poly(A) sites but only two mRNAs detected; use of distal poly(A) site (AAUGAA) predominates in most tissues 245
Triose phosphate isomerase 3 Testes Rat; 1.4 kb mRNA found in most tissues and in somatic cells of testes; its level increases after retinol treatment; the 1.5 kb species is detected only in haploid spermatids 246
Tryptophanyl-tRNA synthetase     Murine, human; two poly(A) sites, first is AAUCAA 247,248
Tubulin polycistronic pre-mRNA     Trypanosomes; transcription unit undergoes trans-splicing and alternative polyadenylation, which may be coupled in this system 249
Vascular endothelial growth factor (VEGF) 1 Hypoxia Rat; two poly(A) sites; regulation of poly(A) site use by hypoxia 250
WNT-5A     Human; expression in early embryogenesis 251
ZAKI-4 2 Many cell types Human thyroid hormone-responsive gene; two mRNAs, first poly(A) signal is AUUAAA; short mRNA predominates in heart and brain, trace amounts found in liver; long mRNA predominates in skeletal muscle; no messages detected in placenta, lung, kidney, pancreas 252
(a) Regulatory element key: 1, one mRNA more stable than another; 2, mRNA stability differences suggested from sequence but subsequently RNAs found to be equally stable; 3, mRNA differences in stability or translation suggested from sequence but not tested; 4, one mRNA found to be better translated than another, in vivo or in vitro.

GENERATION OF ALTERNATIVE 3'-ENDS BY COMPETITION BETWEEN POLYADENYLATION AND SPLICING: COMPOSITE VERSUS SKIPPED EXONS

Two other major classes of gene organization leading to the generation of alternative poly(A) sites on mRNA are illustrated in Figure 1B and C; the genes in each class are listed in Tables 2 and 3. The final protein products of both types of genes can differ at their C-termini depending on which processing pathway is followed. Exons are generally categorized as 5'-terminal, internal or 3'-terminal, the last carrying polyadenylation signals in the UTR. A number of genes listed in Table 2 contain composite exons in which the 5' splice site can be silent, causing the exon to behave as a 3'-terminal exon, or active, causing it to behave as an internal exon, depending on the tissue in which the gene is expressed; these we call composite, in/terminal exons. Genes like the immunoglobulin heavy chains have an exon serving either as the first 3'-terminal exon in one mRNA (use of pA1) or as an internal exon in a second mRNA which ends with a normal 3'-terminal exon found further downstream (use of pA2). The primary transcripts from other genes, like calcitonin/calcitonin gene-related peptide (listed in Table 3), are processed into two mRNAs either by using the first alternative 3'-terminal exon with its poly(A) site (pA1) or by skipping that exon entirely and splicing the second 3'-terminal exon into the transcript, using pA2 instead. The distance between the poly(A) sites in these two classes of genes can be quite large (>3 kb in Ig genes), and differential sites of transcription termination between the poly(A) sites could change the distribution of 3'-end use in mRNA. Levels of basal polyadenylation factors, splicing factors and termination factors could all contribute to cell type-specific mechanisms of 3'-end formation. Considerations of differential stability of mRNA, as discussed for the genes described in Table 1, also pertain to the genes in Tables 2 and 3.
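The two processing outcomes for each class can be sketched as a small data model. This is only an illustration of the logic described above: the names `Exon` and `mature_mrna`, the exon labels E1-E3 and the two example units are hypothetical and stand for no particular gene.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Exon:
    name: str
    kind: str                      # "internal", "composite" or "terminal"
    poly_a: Optional[str] = None   # label of the poly(A) site in this exon, if any


def mature_mrna(exons: List[Exon], use_pa1: bool) -> Tuple[List[str], str]:
    """Return (exon names, poly(A) site used) for a unit with sites pA1 and pA2."""
    included = []
    for exon in exons:
        if exon.poly_a == "pA1":
            if use_pa1:
                # The exon carrying pA1 acts as the 3'-terminal exon.
                included.append(exon.name)
                return included, "pA1"
            if exon.kind == "composite":
                # 5' splice site active: the upstream part of the exon is
                # spliced in as an internal exon.
                included.append(exon.name)
            # A first alternative 3'-terminal exon is skipped entirely.
            continue
        included.append(exon.name)
        if exon.poly_a is not None:
            return included, exon.poly_a
    raise ValueError("no poly(A) site reached")


# Hypothetical units: one with a composite exon, one with a skipped exon.
composite_unit = [Exon("E1", "internal"), Exon("E2", "composite", "pA1"),
                  Exon("E3", "terminal", "pA2")]
skipped_unit = [Exon("E1", "internal"), Exon("E2", "terminal", "pA1"),
                Exon("E3", "terminal", "pA2")]

for unit in (composite_unit, skipped_unit):
    for choice in (True, False):
        print(mature_mrna(unit, use_pa1=choice))
```

The only difference between the two classes in this sketch is whether the exon carrying pA1 is retained (composite) or dropped (skipped) when pA2 is used.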

COMPOSITE EXONS WHICH CAN SERVE AS INTERNAL OR 3'-TERMINAL ELEMENTS

To understand composite exon behavior, a number of studies have been done with synthetic constructs. The addition of an adenovirus 5' splice site to a 3'-terminal exon was shown to negatively affect polyadenylation at the adjacent poly(A) site in HeLa cells (73). A 5' splice site consensus sequence was necessary and sufficient to inhibit polyadenylation when inserted into a 3'-UTR of papillomavirus in BPV-1-transformed mouse cells (74). However, those 5' splice sites are quite strong; it is interesting to note that the 5' splice sites in most cellular composite exons are quite weak and may not bind tightly to all components of the splicing machinery. Furthermore, regions with limited sequence complementarity to the 5' splice site in a 3'-terminal exon were shown to have a positive effect on polyadenylation through an interaction with U1 snRNP, but when these sequences were mutated to more closely match the consensus, the positive effect was lost (64). The mechanism by which the choice is made to splice or polyadenylate a composite in/terminal exon probably varies with the balance of splicing and polyadenylation factors present in the tissue. This potential balancing act has parallels with tissue-specific variations in the amounts of positive and negative alternative splicing factors which can influence alternative splicing (reviewed in 15).

Table 2. Genes with multiple poly(A) sites in competition with splice sites: composite 'in/terminal' exons
Gene Notes on regulation References
(2'-5') Oligo A synthetase

Transcription induced by interferon-[beta]; distal poly(A) site favored after induction; proximal poly(A) site used predominantly during basal transcription 253,254

[beta]-Spectrin

Proximal poly(A) site used exclusively in erythroid cells; default pattern of pre-mRNA processing uses distal poly(A) site 255,256

C3b/C4b receptor (complement receptor type 1) Use of proximal poly(A) site yields secreted form of receptor; predominant membrane-bound receptor is generated by use of distal poly(A) site 257
Cek5

Chicken receptor protein-tyrosine kinase of the Eph subfamily; use of the proximal poly(A) site yields secreted form of kinase, whose expression is low relative to the full-length Cek5 receptor 258

Epidermal growth factor (EGF) receptor; human, chicken Proximal poly(A) site leads to production of secreted form of receptor, which can inhibit the activities of the membrane-bound receptor 259,260

exuperantia (exu)

Drosophila gene required for both oogenesis and spermatogenesis that undergoes sex-specific alternative pre-mRNA processing; tra-2 gene required for male-specific RNA processing 261,262

Fibrinogen [gamma]-chain

Rat pre-mRNA undergoes liver-specific choice of proximal poly(A) site; other cell types always use distal poly(A) site 263

Fibroblast growth factor (FGF) receptor

Secreted form of receptor generated by use of the proximal poly(A) site; membrane-bound forms are produced by use of distal poly(A) site; secreted form also binds FGF 264,265

GARS/AIRS/GART

Glycinamide ribonucleotide synthetase (GARS)/aminoimidazole ribonucleotide synthetase (AIRS)/glycinamide ribonucleotide formyltransferase (GART); enzyme required for purine synthesis; use of proximal site corresponds to production of the monofunctional enzyme; use of the distal site yields the trifunctional enzyme; all tissues examined favor distal poly(A) site 114,266
Glucocorticoid receptor

[beta] form of receptor produced by use of the proximal poly(A) site; more abundant [alpha] form uses the distal poly(A) site 267
HER2/neu receptor

Protein tyrosine kinase receptor in which membrane-bound form is produced from mRNA using the distal poly(A) site; use of proximal poly(A) site leads to shorter, intracellular form of the receptor; use of the proximal and distal poly(A) sites varies greatly in different tumor cell lines 268

Hepatocyte nuclear factor (HNF1/vHNF1)

Hepatocyte nuclear factor homeoprotein family important for liver-specific expression of a number of genes; poly(A) site choice and intron inclusion contribute to the generation of HNF1 isoforms, all of which contain different C-terminal domains, have distinct effects on transcription and can form homo- and heterodimers; mRNA levels for these isoforms vary in different tissue types and in some fetal versus adult tissues 269

Ig [alpha] heavy chain

Use of proximal poly(A) site produces mRNA encoding secreted form of antibody; use of the distal poly(A) site generates mRNA for membrane-bound antigen receptor; secretory-specific mRNA dominant in plasma cells whereas there are equal amounts of the two mRNAs in mature or memory B cells 90,99,270

Ig [epsilon] heavy chain Pattern of regulation similar to Ig [alpha] heavy chain pre-mRNAs 271

Ig [gamma] heavy chain Pattern of regulation similar to Ig [alpha] heavy chain pre-mRNAs 76,93,94,270

Ig [mu] heavy chain

Pattern of regulation similar to Ig [alpha] and to other Ig heavy chain pre-mRNAs; can also include transcription termination as a mechanism of proximal poly(A) site selection 75,270,272

Leukemia inhibitory factor receptor [alpha]-chain

Member of hemopoietin receptor family; murine gene produces a secreted [proximal poly(A) site] and membrane-bound form [distal poly(A) site], with increase in the secreted form during pregnancy 273

Nuclear factor I-B3

Distal poly(A) site favored in all tissues examined, proximal poly(A) site used in heart and skeletal muscle; protein encoded by the shorter mRNA acts as a transcriptional repressor 274

Plasma membrane Ca2+-ATPase isoform 3 Use of proximal poly(A) site specific to skeletal muscle and brain 275

Poly(A) polymerase

Component of polyadenylation complex; six isoforms generated via alternative splicing and polyadenylation; some isoforms found in all tissues examined, others show tissue-specific expression; use of one of three proximal poly(A) sites yields forms that contain the polymerase domain but not the serine/threonine-rich domain and nuclear localization signal (see also Table 3) 276

Sarco/endoplasmic reticulum Ca2+-ATPase (SERCA)

Five protein isoforms are generated from three different SERCA genes plus alternative processing events; regulation of expression is both developmental and tissue specific and is suggested to be at the level of splicing rather than polyadenylation; two SERCA2 protein isoforms are translated from four different mRNAs generated by tissue-dependent alternative processing, one of which is brain specific; SERCA2a protein is muscle specific, SERCA2b is found in non-muscle tissues and smooth muscle 112,113

Secretory PLA2 receptor

Receptor has similar structural organization to macrophage mannose receptor; acts as a mediator of inflammatory processes; secreted form of phospholipase A2 receptor found in human kidney; membrane-bound receptor is widely expressed, including in kidney 277

Thyroid hormone receptor [alpha] (c-erbA-1)

Proximal poly(A) site yields [alpha]1, which binds thyroid hormone; distal poly(A) site produces [alpha]2, which cannot bind thyroid hormone; ratio of two mRNAs varies in different tissues; [alpha]2 transcript overlaps with gene transcribed in opposite direction, Rev-ErbA[alpha] 115

Table 3. Genes with multiple alternative 3'-terminal exons: skipped exons
Gene Notes on regulation References
[alpha]-Tropomyosin

At least four poly(A) sites; proximal poly(A) site used in striated muscle and distal poly(A) site used in smooth muscle and fibroblasts; three of the poly(A) sites used in brain 278,279

Adenovirus major late transcription unit

Five poly(A) sites; the proximal poly(A) site, L1, used predominantly in early infection; L3 dominates late in infection 153

[beta]-Tropomyosin

Proximal poly(A) site used exclusively in skeletal muscle; other cell types use the distal poly(A) site; regulation may be at the level of splice site choice 280,281

Calcitonin/calcitonin gene-related peptide (CGRP)

Proximal poly(A) site used in most cell types, generating the mRNA for calcitonin; distal poly(A) site used exclusively in neuronal cells, leading to production of CGRP 119,120,125

doublesex (dsx)

Drosophila gene required for somatic sexual differentiation that undergoes sex-specific alternative pre-mRNA processing; tra-2 protein required for regulated RNA processing and acts through its binding site in the dsx pre-mRNA 127,282

Epidermal growth factor (EGF) receptor; rat

Proximal poly(A) site leads to production of secreted form of receptor, which can inhibit the activities of the membrane-bound receptor; differs from human and chicken isoforms (see Table 2) 259,283

FLT4 receptor tyrosine kinase

Ratio of the mRNAs using the proximal or distal poly(A) site varies in different cell lines 284

Neural cell adhesion molecule (NCAM) Ratio of the mRNAs produced varies in different cell types 285

Plasma [alpha](1,3)-fucosyltransferase (FUT6) Two poly(A) sites are used equally in liver; proximal poly(A) site favored in colon; distal poly(A) site used predominantly in kidney 286

Poly(A) polymerase

Component of polyadenylation complex; six isoforms generated via alternative splicing and polyadenylation; some isoforms found in all tissues examined, others show tissue-specific expression; use of one of three proximal poly(A) sites yields forms that contain the polymerase domain but not the serine/threonine-rich domain and nuclear localization signal (three exons also composite; see Table 2) 276

Unique human gene of unknown function

Spans over 230 kb in human chromosome 8p11-12; codes multiple proteins sharing RNA binding motifs 287


IMMUNOGLOBULIN GENES

The Ig heavy chain genes represent the best-studied examples of complex transcription units in which composite exons can switch between being internal or 3'-terminal; their differential expression during B cell development may provide insights into the expression and trans-acting factors operating on the processing of some of the other 19 members of the group listed in Table 2 . Ig [mu] heavy chains are expressed in pre-, immature and mature B cells and some plasma cells. The [alpha], [delta], [epsilon] and [gamma] heavy chains are expressed in memory and plasma cells. RNA from each of the five classes of immunoglobulin heavy chain genes ([alpha], [delta], [epsilon], [gamma] and [mu]) can be alternatively processed to produce two types of mRNAs, one encoding the membrane-bound receptor for antigen on the surface of mature and memory B cells, the other encoding the secreted form of the Ig protein (75 -77 ). Polyadenylation at the secretory-specific poly(A) site and splicing in of the membrane-specific exons to the composite in/terminal exon are two mutually exclusive events. During the development of B cells there is a regulated shift from production of the membrane- to the secretory-specific form of Ig mRNA and protein; the secretory-specific forms predominate in terminally differentiated plasma cells (reviewed in 78 ,79 ) and can exceed the membrane form by 100:1. The total amount of cytoplasmic Ig mRNA also increases by 30- to 100-fold in plasma cells. Differences in mRNA stability alone cannot account for the shift to secretory-specific mRNA production, for while there is an increase in the half-life of Ig mRNA following differentiation to the plasma cell stage (80 -84 ), this increase occurs equally with both the secretory- and membrane- specific species (83 ,84 ). 
However, the 5-fold increased transcription of the Ig locus coupled with a more efficient conversion of the primary transcript to mature secretory-specific mRNA by increased polyadenylation would contribute significantly to both the abundance increase and the shift towards secretory species.
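The arithmetic behind this argument can be sketched as follows. The calculation rests on assumptions that are not in the text: that the fold-changes combine multiplicatively, and that the secretory:membrane ratio starts at 1:1 in B cells (as Table 4 indicates) and reaches 100:1 in plasma cells. The function names are hypothetical.

```python
def residual_fold(total_fold: float, transcription_fold: float = 5.0) -> float:
    """Fold-increase in total Ig mRNA left unexplained by transcription alone."""
    return total_fold / transcription_fold


def species_fold(total_fold: float, ratio_before: float, ratio_after: float):
    """Fold-change of secretory and membrane mRNA; ratios are sec:mb."""
    sec_before = ratio_before / (1 + ratio_before)
    sec_after = ratio_after / (1 + ratio_after)
    sec_fold = total_fold * sec_after / sec_before
    mb_fold = total_fold * (1 - sec_after) / (1 - sec_before)
    return sec_fold, mb_fold


for total in (30.0, 100.0):
    sec, mb = species_fold(total, ratio_before=1.0, ratio_after=100.0)
    print(f"total {total:.0f}-fold: residual {residual_fold(total):.0f}-fold, "
          f"secretory {sec:.1f}-fold, membrane {mb:.2f}-fold")
```

Under these assumptions, a 5-fold rise in transcription leaves a 6- to 20-fold increase to be explained by processing efficiency and the shared gain in stability, while the membrane-specific species changes comparatively little.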

For the [mu] heavy chain gene, the site of transcription termination shifts from downstream of the membrane exons in B cells to a region between the secretory poly(A) site and the membrane exons in plasma cells (85 -88 ). In some plasma cells the membrane exons are not even transcribed and secretory-specific mRNA results. In contrast, there is no change in the site of transcription termination for Ig [alpha] and [gamma] heavy chains; here termination always occurs at approximately the same location, ~1 kb downstream of the last membrane exon (35 ,89 ,90 ). Therefore, changes in the site of transcription termination play a role in the expression of Ig [mu] but not Ig [gamma] or [alpha] secretory-specific mRNA. This difference may be the result of the unique location of the Ig [mu] heavy chain exons, which lie only 9 kb upstream of the [delta] constant region exons; both sets of exons are expressed in a common precursor at the mature B cell stage (91 ). The extent of [delta] gene transcription is also regulated by differential termination and polyadenylation, which involves sequence elements both within the [mu] membrane poly(A) site and a segment between the [mu] and [delta] coding sequences (92 ).

RNA processing events play a major role in determining the final amounts and the ratios of the two forms of Ig mRNA. Early experiments demonstrated that during B cell differentiation use of alternative cleavage/polyadenylation sites modulates the production of the two mRNAs from an Ig [gamma] gene (93 ,94 ) and from the Ig [mu] gene (85 ,86 ,95 -97 ). In later studies, an increase in the efficiency of polyadenylation at Ig secretory-specific poly(A) sites was seen in plasma cells versus mature and memory B cells for [mu] and [alpha] (98 ,99 ) and [gamma] sites (100 ,101 ), as measured by the relative use of tandem poly(A) sites in a 3'-terminal UTR in vivo. Transfection experiments have failed to identify cis-acting sequences within the immunoglobulin [gamma] gene responsible for the observed regulation of poly(A) site choice (102 ) or the [mu] splicing versus poly(A) choice (103 ). Attempts to determine which is the default pathway in non-B cells, Ig secretory or membrane mRNA production, have given different answers based on the Ig heavy chain gene, with a study of the IgG gene indicating that the membrane processing pattern occurs predominantly in non-lymphoid cells (104 ), while transfection of a hybrid SV40/IgM gene into a variety of non-lymphoid cells indicated that the secretory processing pattern was the default (103 ). Neither study determined the potential differences in Ig transcription termination sites or mRNA stabilities in non-lymphoid cells which might influence the interpretations. While there is a balance between polyadenylation and splicing which shifts in B cell development (79 ), there is no measurable change in efficiency of splicing either between B cell stages or in comparison with several different non-B cell lines (95 ,98 ,104 -106 ). Therefore, the differential expression of Ig heavy chain genes must primarily be the result of changes in the trans-acting factors responsible for polyadenylation.

MECHANISMS OF POLY(A) SITE CHOICE IN B CELLS

The relative levels of polyadenylation, splicing and transcription termination factors might be expected to play a role in the modulated expression of the two forms of mRNA from the Ig gene. When the gene for the 64 kDa subunit of CstF, driven by an actin promoter, was over-expressed 10-fold in a chicken B cell line an 8-fold shift toward the use of the promoter-proximal, secretory-specific poly(A) site in the endogenous Ig [mu] gene was seen (72 ). Increasing the amount of a limiting component of the CstF complex increased the amount of the complex in the nucleus, thereby increasing polyadenylation efficiency by mass action. This same study also showed that there was an increase in the amount of the 64 kDa protein when resting mouse splenic B cells (mature B cells) were stimulated by lipopolysaccharide treatment to grow and secrete Ig (summarized in Table 4 ). Therefore, at least during the transition from the resting B cell to the growing lymphoblast, an increase in 64 kDa CstF protein can play a role in increasing Ig secretory mRNA expression. However, comparisons of continuously growing cell lines which accurately represent Ig expression at various B cell stages have shown that the shift to production of the secretory-specific form of mRNA can increase from 30- to 100-fold in the absence of a change in the level of 64 kDa protein in the nucleus (107 ,108 ) or in the whole cell (unpublished observation). Therefore, a mechanism other than an increase in the amount of the 64 kDa subunit of CstF must be operative in plasma cells and tumor lines derived from them, to shift RNA processing towards production of secretory-specific forms of Ig mRNA. This issue was recently discussed (39 ,72 ).

The binding activity but not the amount of several constitutive factors required for cleavage and polyadenylation increases in continuously growing plasma cells producing large amounts of secretory-specific Ig mRNA (see Table 4 ). There is as much as an 8-fold increase in binding to input substrates of the 64 kDa subunit of CstF and the 100 kDa subunit of CPSF, two constitutive polyadenylation factors, in myeloma/plasma cell nuclear extracts as compared with lymphoma (early or memory B cell) extracts (107 ,108 ). These increases in binding occur regardless of the sequence of the polyadenylation-competent substrates as long as the substrates contain both an AAUAAA and a downstream element. Another activity was described in early/memory B cell extracts which seems to selectively destabilize complexes formed on weak poly(A) sites such as the immunoglobulin secretory-specific site (109 ). The activity of this factor on the dissociation of RNA-protein complexes formed on the membrane poly(A) site is less than that seen on the secretory-specific site.
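The substrate requirement stated above, an AAUAAA plus a downstream element, can be sketched as a simple sequence scan. The window (10-40 nt past the hexamer), the 6 nt element length and the threshold of five G/U residues are illustrative assumptions, not values from the studies cited; the function names are hypothetical.

```python
import re


def has_downstream_element(rna: str, hexamer_end: int, start: int = 10,
                           stop: int = 40, length: int = 6, min_gu: int = 5) -> bool:
    """True if any `length`-nt window starting `start`..`stop` nt past the
    hexamer contains at least `min_gu` G or U residues (assumed criterion)."""
    region = rna[hexamer_end + start: hexamer_end + stop + length]
    for i in range(len(region) - length + 1):
        window = region[i:i + length]
        if sum(base in "GU" for base in window) >= min_gu:
            return True
    return False


def candidate_poly_a_sites(rna: str, hexamer: str = "AAUAAA"):
    """Return (hexamer position, downstream element found) for each hexamer."""
    rna = rna.upper().replace("T", "U")
    sites = []
    # A lookahead pattern lets overlapping hexamers be reported as well.
    for match in re.finditer(f"(?={hexamer})", rna):
        end = match.start() + len(hexamer)
        sites.append((match.start(), has_downstream_element(rna, end)))
    return sites


# Synthetic example: the first hexamer is followed by a GU-rich stretch,
# the second is too close to the 3' end to have one.
example = "CCCC" + "AAUAAA" + "C" * 15 + "UGUGUU" + "CCCC" + "AAUAAA" + "CCCCCC"
print(candidate_poly_a_sites(example))
```

Passing a different `hexamer` (for example AUUAAA, one of the variant signals listed in Table 1) scans for non-canonical signals in the same way.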

Induction of a novel 28-32 kDa nuclear RNA binding factor in mouse splenic B cells was found to correlate with production of the secretory form of IgM heavy chain (110 ). Treatment of cells with both lipopolysaccharide and anti-[mu] antibodies, which allows for growth but not secretion, caused inhibition of Ig secretory-specific mRNA production but did not decrease induction of the 28-32 kDa RNA binding protein. Instead, another RNA binding protein of 50-55 kDa was produced; this protein binds to both secretory- and membrane-specific [mu] poly(A) sites in vitro. These proteins have not been identified further, but may represent positive and negative regulators of the secretory-specific polyadenylation complex, based on their binding specificities.

A postulated activator of polyadenylation/cleavage in plasma cells (107 ,108 ), which could act on any weak poly(A) site, together with the loss of a distinct inhibitor of the Ig secretory-specific poly(A) site, as postulated (109 ), could stabilize the polyadenylation complex formed at the secretory polyadenylation/ cleavage site; this would allow the weak secretory site to be used to the exclusion of splicing in the composite exon in plasma cells. In early/memory B cells the secretory-specific poly(A) site cannot effectively compete for polyadenylation factors and the composite in/terminal exon functions as an internal exon; the membrane poly(A) site is then used by default.
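One way to picture this competition is as two first-order reactions acting on the same composite exon. This is a deliberately simplified sketch and not a model taken from the studies cited: the rate constants are arbitrary units, the splicing rate is held constant, and `secretory_fraction` is a hypothetical name.

```python
def secretory_fraction(k_polya: float, k_splice: float) -> float:
    """Fraction of transcripts cleaved at the secretory poly(A) site when
    polyadenylation and splicing compete as first-order processes."""
    if k_polya < 0 or k_splice < 0 or k_polya + k_splice == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return k_polya / (k_polya + k_splice)


# Hold the splicing rate constant and scale only the effective
# polyadenylation rate at the secretory site.
for fold in (1, 2, 4, 8):
    fraction = secretory_fraction(k_polya=float(fold), k_splice=1.0)
    print(f"{fold}x polyadenylation rate -> {fraction:.3f} secretory")
```

In this toy model the sec:mb ratio equals k_polya/k_splice, so it scales linearly with the effective polyadenylation rate while splicing efficiency stays unchanged.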

Table 4. Changes in CstF-64 accompanying alternative poly(A) site selection
Cell status                          Gene expression (a)   CstF-64 protein amount   CstF-64 RNA binding activity
Resting B cell                       Ig sec = mb           Low                      ?
Activated/growing B cells            Ig sec = mb           High                     Low
Secreting/growing plasma cells       Ig sec >> mb          High                     High
HeLa early in adenovirus infection   Adeno L1 > L3         High                     High
HeLa late in adenovirus infection    Adeno L1 < L3         High                     Low
HeLa late in HSV infection           Late HSV              High                     Weak viral sites activated via ICP27
(a) sec, secretory-specific form; mb, membrane bound-specific form.

Activation of weak polyadenylation sites in plasma cells occurs with a variety of transfected sequences (102 ). In addition, examination of early/memory stage B cells shows that they tend to accumulate more mRNA in the nucleus than do plasma cells. The effect seems more pronounced for secretory-specific Ig mRNA than for some other cellular RNAs (84 ), perhaps as a consequence of its extremely weak poly(A) site. The shifts in CstF-64 activity described above might increase the cytoplasmic abundance of a group of endogenous transcripts with weak poly(A) sites as well as shift poly(A) site location in complex transcription units like those described in Tables 1 and 2 and perhaps, but less likely, those listed in Table 3 . CD40, described in Table 1 , is a gene with two poly(A) sites whose relative use changes during B cell development (111 ) and might represent a member of the group of endogenous transcripts co-regulated with Ig secretory-specific mRNA through alternative poly(A) site choice. Before B cells are activated both mRNAs are produced; after activation the shorter, potentially more stable mRNA is the predominant species. The change in the factor(s) influencing use of the secretory-specific Ig poly(A) site could therefore have a broader effect on differentiation in B cells by influencing the expression of many mature mRNAs through poly(A) site selection.

OTHER GENES WITH A COMPOSITE EXON

A gene listed in Table 2 that was not influenced by changes in the levels or activities of polyadenylation factors in B cells is the gene for the Ca2+-transport ATPases of the sarcoplasmic or endoplasmic reticulum (SERCAs). Tissue-specific alternative 3'-end processing of SERCA2 pre-mRNA gives rise to two distinct protein isoforms (2a and 2b) which differ in their C-terminal portions (112 ). SERCA2a is found in cardiac, smooth and slow twitch skeletal muscle, while SERCA2b is found in smooth muscle and non-muscle tissues. No change in expression pattern was seen when B cells representing different stages of development were transfected with SERCA2 constructs (113 ), indicating that expression of the SERCA2 alternatively processed forms may result from tissue-specific alternative splicing instead of regulated polyadenylation or that the factors which are changed in B cells are not the ones necessary to influence SERCA pre-mRNA polyadenylation.

The gene encoding GARS/AIRS/GART (Table 2 ) seems to make both the mRNA products regardless of the tissue type (114 ) or B cell stage (L.Souan, Masters Thesis, University of Pittsburgh). Therefore, the changes in processing factors which are able to alter the processing fate of some exons may not operate on all similarly organized genes. The reasons for this remain unclear.

One of the mammalian thyroid hormone receptors is encoded by the erbA[alpha] gene, which can give rise to two mRNAs by the composite, in/terminal exon mode (Table 2 ). These two mRNAs give rise to receptor isoforms with antagonistic functions. The levels of the two mRNAs vary in different tissues and at different developmental stages (115 ,116 ). The Rev-erbA[alpha] gene is encoded on the DNA strand opposite erbA[alpha] and produces an antisense RNA with complementarity to the 3'-end of the mRNA using the second poly(A) site (117 ). When expression of erbA[alpha] was examined in B cells representing different stages in development it was shown to vary; however, the relative levels of the two forms of erbA[alpha] mRNA depended not on B cell stage but rather on the amount of Rev-erbA being expressed (118 ). This unusual mechanism for regulating alternative 3'-ends is common in viruses, but much less so in eukaryotic genes.

THE SKIPPED EXON GENES

The genes listed in Table 3 and diagramed in Figure 1 C can encode two or more mRNAs by using classical 3'-terminal exons which are arranged so that the first can be skipped over. The regulated expression of these genes may be sensitive not only to the levels of general splicing and polyadenylation factors but also to gene-specific splicing factors which can facilitate either the inclusive (dsx) or the skip-over (CGRP) splice. Calcitonin and the calcitonin gene-related peptide (CGRP) are produced from a single gene by alternative splicing or polyadenylation; the common exons 1-3 are spliced during processing for both, but inclusion of exon 4 in the final mRNA results in polyadenylation at a site in its 3'-UTR (pA1, Fig. 1 C) to produce calcitonin. To produce CGRP the processing reaction skips over exon 4 but splices exon 3 to exon 5; this is followed by exon 5 to exon 6 joining and polyadenylation after exon 6 (pA2, Fig. 1 C; 119 ,120 ). Studies of mice with a calcitonin/CGRP transgene showed that calcitonin-specific inclusion and polyadenylation of exon 4 occurs in a variety of tissues, while CGRP expression (skipping) is limited almost exclusively to neuronal cells (121 ), suggesting that the calcitonin pattern is the default pathway and that neuronal cells must enhance the exon 3 to exon 5 splice. Neither differential mRNA half-lives nor changes in transcription termination sites can account for the tissue-specific differences in calcitonin/CGRP expression (reviewed in 22 ).

Inclusion of exon 4, with its weak splice site and poly(A) site, to generate calcitonin mRNA in HeLa cells was shown to require an enhancer sequence located within the intron downstream of the poly(A) site of exon 4 distinct from the typical downstream GU- or U-rich elements (123 ). The intron enhancer activates cleavage and polyadenylation of precursor RNAs containing the calcitonin poly(A) site or heterologous poly(A) sites in exon 4 at a distance of several hundred nucleotides from the AAUAAA (124 ). The enhancer can work with a heterologous gene and contains (Py)nCAGGUAAGAC, a so-called `zero length' exon, composed of adjacent 3' and 5' splice site consensus elements preceded by a pyrimidine tract; the zero length exon can bind U1 snRNP, alternative splicing factor/splicing factor 2 (ASF/SF2) and pyrimidine tract binding protein. The enhancer of exon 4 inclusion activates polyadenylation and cleavage through binding of known splicing factors.
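The 'zero length' exon motif can be located in a sequence with a short pattern search. This is a minimal sketch: the text gives the motif as (Py)nCAGGUAAGAC without specifying n, so the minimum tract length of six pyrimidines is an assumption, and `find_zero_length_exons` is a hypothetical name.

```python
import re


def find_zero_length_exons(rna: str, min_pyrimidines: int = 6):
    """Return (start, matched sequence) for each pyrimidine tract followed
    directly by CAGGUAAGAC. `min_pyrimidines` is an assumed lower bound for n."""
    rna = rna.upper().replace("T", "U")
    pattern = re.compile(rf"[CU]{{{min_pyrimidines},}}CAGGUAAGAC")
    return [(m.start(), m.group()) for m in pattern.finditer(rna)]


# Synthetic examples: an 8 nt pyrimidine tract matches, a 3 nt tract does not.
print(find_zero_length_exons("GGA" + "CUCUUCCU" + "CAGGUAAGAC" + "GGG"))
print(find_zero_length_exons("ACUCAGGUAAGAC"))
```

Lowering `min_pyrimidines` relaxes the tract requirement; the CAGGUAAGAC core is matched exactly, so degenerate variants of the splice site consensus are not reported.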

When the calcitonin/CGRP gene was transfected into a B cell line in which the amount of Ig secretory- and membrane-specific species was about equal, an accumulation of large amounts of partially processed nuclear species was seen. This was originally interpreted as indicating that the machinery necessary to splice exon 3 and exon 5 was missing but failed to explain why exon 4 was not used (125 ). A possible interpretation of these older results with B cell transfections is that the intron 4 enhancer was not active in those cells because of low precursor RNA binding activity of CstF-64 which also influences Ig processing or because of low levels of other unspecified RNA processing factors which interact with the intron enhancer.

The doublesex (dsx) gene in Drosophila (Table 3 and Fig. 1C) shows skipping of exon 4 with splicing of exons 1, 2, 3, 5 and 6 in males but splicing of exons 1, 2, 3 and 4 in females (126). Polyadenylation occurs after exon 6 (pA2) in males and after exon 4 (pA1) in females. The tra and tra-2 gene products are required for female-specific processing of dsx pre-mRNA, with the male pattern representing the default pathway in the absence of these two genes. Binding of the tra-2 protein product to a region within the female-specific exon is required not only to activate splicing at the weak splice sites in exon 4, but also independently for female-specific polyadenylation at exon 4 (127). Highly cooperative interactions between domains of tra and tra-2 and serine/arginine-rich proteins result in formation of a multiprotein complex on the female-specific exon to positively enhance its splicing and hence the choice of the poly(A) site at the end of exon 4 (128). The mechanism of enhancement of polyadenylation has not been elucidated; however, the observations that a mutation in a 3' splice site can negatively affect polyadenylation (129) and that an intron enhancer that binds splicing factors can stimulate polyadenylation (124) argue that at least part of the activation by tra-2 of dsx exon 4 polyadenylation is through an interaction between the splicing and polyadenylation machinery. Enhancement of both female-specific splicing and polyadenylation is therefore important for regulated expression of this gene.

VIRAL SYSTEMS AND CstF-64 POLYADENYLATION/ CLEAVAGE FACTOR

The adenovirus major late transcription unit can be alternatively processed during the viral infectious cycle (see Table 1 ). Polyadenylation has been shown to regulate L1 versus L3 mRNA production in adenovirus infection (24 ,130 -134 ). The promoter-proximal L1 poly(A) site is weaker than the promoter-distal L3 poly(A) site (135 ), analogous to the Ig secretory- and membrane-specific poly(A) site arrangement. The switch in adenovirus, however, is to predominant use of the stronger downstream L3 poly(A) site late in infection. The regulation of poly(A) site use in adenovirus shows many similarities to that of the Ig transcription unit in cultured cells (see Table 4 ); there is a change in binding of the 64 kDa subunit of CstF to poly(A) sites with no change in the amount of 64 kDa protein (134 ). Late in adenovirus infection the activity of binding of the 64 kDa protein to poly(A) sites decreases, suggesting a decrease in overall polyadenylation efficiency (132 ); as a consequence, the stronger, promoter-distal poly(A) site is favored. In late stage/plasma cells the activity of binding of the 64 kDa subunit of CstF increases, implying a general increase in polyadenylation efficiency; consequently, the weaker, promoter-proximal poly(A) site is favored. The change in CstF activity during adenovirus infection indicates that the Ig gene model for poly(A) site choice may have broader relevance.

Polyadenylation efficiency changes throughout the course of herpes simplex virus type 1 (HSV-1) infection (136-143). Several HSV genes themselves contain multiple poly(A) sites (see Table 1). HSV-1 gene expression is temporally regulated during lytic infection, with immediate early gene products being produced directly after infection in the absence of de novo protein synthesis. Immediate early gene expression is required to produce the early viral proteins which turn on viral DNA synthesis and transcription of the late gene products. ICP27 (also known as IE63) is a nuclear phosphoprotein required for viral replication and for the switch from early to late gene expression (139-142). ICP27 functions post-transcriptionally to activate cleavage and polyadenylation of late, weaker viral poly(A) sites and to inhibit splicing of host cell pre-mRNA (139). Activation of the late, weak poly(A) sites is due to increased binding of the 64 kDa subunit of CstF to these sites in the presence of ICP27 (138). Strong poly(A) sites, defined by efficient cleavage in uninfected HeLa nuclear extracts, are not affected by the presence of ICP27. A direct interaction between ICP27 and the 64 kDa subunit of CstF has not been demonstrated and thus the precise mechanism by which ICP27 increases cleavage and polyadenylation at late viral sites is not known. The HSV-1 system, however, is another example in which changes in binding of the general polyadenylation factor CstF play a role in a regulated switch in poly(A) site use.

The organization of the retrovirus HIV-1 with flanking long terminal repeats (LTRs), each of which contains a poly(A) site, requires the polyadenylation machinery to ignore the 5' poly(A) site close to the promoter and process only the 3' poly(A) site far downstream. Other retroviruses, like Rous sarcoma virus and T cell leukemia virus-1, have a transcription start site between the first AAUAAA and its downstream element, thereby avoiding the problem. In HIV-1, the U3 element upstream of the 3' poly(A) site in the transcribed RNA has been shown to enhance processing in vitro and in vivo (144), a special example of the upstream elements mentioned earlier. In addition, the major splice donor site inhibits the adjacent 5'-LTR poly(A) signals (145). Therefore, selective use of the 3' poly(A) site in HIV-1 results from a combination of enhancement of the active 3' site and repression of the 5' site.

SUMMARY AND CONCLUSIONS

Changes in the level or activity of the 64 kDa subunit of polyadenylation factor CstF can influence expression of viral and Ig heavy chain genes by changing the processing efficiency of weak poly(A) sites. A large number of other genes have multiple poly(A) sites, the use of which may vary in a differentiation or developmentally regulated fashion. The relative strengths of the poly(A) sites of many of these complex transcription units have yet to be determined and even less is known about potential positive and negative regulators of the cleavage/polyadenylation reaction. The evidence emerging from experiments in yeast suggests that other modifying factors influence the constitutive cleavage/polyadenylation machinery (146) beyond the well-established U1 snRNP protein A. Therefore, it is likely that there are tissue-specific levels of expression of basal polyadenylation/cleavage factors, as well as of modulators of the constitutive polyadenylation factors, in higher eukaryotes. If the levels of the constitutive cleavage/polyadenylation factors or modulators of them vary from tissue to tissue and throughout development, then differential use of multiple poly(A) sites can be achieved, providing 'a means to an end' in complex transcription units. Tissue-specific variations in splicing factors can also tip the balance with some genes. Having the mRNA end well is a challenge to the cell. Characterizing RNA processing modulators and their interactions with constitutive polyadenylation and splicing factors to regulate alternative pre-mRNA processing remains a challenge to investigators in this field.

ACKNOWLEDGEMENTS

This work was supported by grant GM50145 to C.M. K.L.V. is a member of the MD/PhD Program at the University of Pittsburgh. We thank Drs J. Cohen, S. Phillips and numerous colleagues for comments on the manuscript and useful discussions.

REFERENCES

1 Falck-Pedersen,E., Logan,J., Shenk,T. and Darnell,J.E. (1985) Cell, 40, 897-905.

2 Whitelaw,E. and Proudfoot,N. (1986) EMBO J., 5, 2915-2922.

3 Connelly,S. and Manley,J.L. (1988) Genes Dev., 2, 440-452.

4 Wickens,M. and Stephenson,P. (1984) Science, 226, 1045-1051.

5 Eckner,R., Ellmeier,W. and Birnstiel,M.L. (1991) EMBO J., 10, 3513-3522.

6 Huang,Y. and Carmichael,G.G. (1996) Mol. Cell. Biol., 16, 1534-1542.

7 Huang,Y. and Carmichael,G.G. (1996) Mol. Cell. Biol., 16, 6046-6054.

8 McCracken,S., Fong,N., Yankulov,K., Ballantyne,S., Pan,G., Greenblatt,J., Patterson,S.D., Wickens,M. and Bentley,D.L. (1997) Nature, 385, 357-361.

9 Edwalds-Gilbert,G., Prescott,J. and Falck-Pedersen,E. (1993) Mol. Cell. Biol., 13, 3472-3480.

10 Denome,R.M. and Cole,C.N. (1988) Mol. Cell. Biol., 8, 4829-4839.

11 Curtis,D., Lehmann,R. and Zamore,P.D. (1995) Cell, 81, 171-178.

12 Ross,J. (1995) Microbiol. Rev., 59, 423-450.

13 Sachs,A. and Wahle,E. (1993) J. Biol. Chem., 268, 22955-22958.

14 Ford,L.P., Bagga,P.S. and Wilusz,J. (1997) Mol. Cell. Biol., 17, 398-406.

15 Chabot,B. (1996) Trends Genet., 12, 472-478.

16 Sheets,M.D., Ogg,S.C. and Wickens,M.P. (1990) Nucleic Acids Res., 18, 5799-5805.

17 Gil,A. and Proudfoot,N.J. (1984) Nature, 312, 473-474.

18 Hart,R.P., McDevitt,M.A., Ali,H. and Nevins,J.R. (1985) Mol. Cell. Biol., 5, 2975-2983.

19 McDevitt,M.A., Hart,R.P., Wong,W.W. and Nevins,J.R. (1986) EMBO J., 5, 2907-2913.

20 Sadofsky,M. and Alwine,J.C. (1984) Mol. Cell. Biol., 4, 1460-1468.

21 Chen,F., MacDonald,C.C. and Wilusz,J. (1995) Nucleic Acids Res., 23, 2614-2620.

22 Brown,P.H., Tiley,L.S. and Cullen,B.R. (1991) Genes Dev., 5, 1277-1284.

23 Carswell,S. and Alwine,J.C. (1989) Mol. Cell. Biol., 9, 4248-4258.

24 DeZazzo,J.D. and Imperiale,M.J. (1989) Mol. Cell. Biol., 9, 4951-4961.

25 Gilmartin,G.M., Fleming,E.S. and Oetjen,J. (1993) EMBO J., 12, 4419-4428.

26 Prescott,J.C. and Falck-Pedersen,E. (1994) Mol. Cell. Biol., 14, 4682-4693.

27 Russnak,R. and Ganem,D. (1990) Genes Dev., 4, 764-776.

28 Valsamakis,A., Zeichner,S., Carswell,S. and Alwine,J.C. (1991) Proc. Natl. Acad. Sci. USA, 88, 2108-2112.

29 Miller,J.T. and Stoltzfus,C.M. (1992) J. Virol., 66, 4242-4251.

30 Kurkulos,M., Weinberg,J.M., Pepling,M.E. and Mount,S.M. (1991) Proc. Natl. Acad. Sci. USA, 88, 3038-3042.

31 Lutz,C.S. and Alwine,J.C. (1994) Genes Dev., 8, 576-586.

32 Lutz,C., Murthy,K., Schek,N., O'Conner,J., Manley,J. and Alwine,J. (1996) Genes Dev., 10, 325-337.

33 Moreira,A., Wollerton,M., Monks,J. and Proudfoot,N.J. (1995) EMBO J., 14, 3809-3819.

34 Hall,B.L. and Milcarek,C. (1989) Mol. Immunol., 26, 819-826.

35 Flaspohler,J.A., Boczkowski,D., Hall,B.L. and Milcarek,C. (1995) J. Biol. Chem., 270, 11903-11.

36 Wahle,E. and Keller,W. (1992) Annu. Rev. Biochem., 61, 419-440.

37 Keller,W. (1995) Cell, 81, 829-832.

38 Manley,J.L. (1995) Curr. Opin. Genet. Dev., 5, 222-228.

39 Proudfoot,N. (1996) Cell, 87, 779-781.

40 Bienroth,S., Wahle,E., Suter-Crazzolara,C. and Keller,W. (1991) J. Biol. Chem., 266, 19768-19776.

41 Gilmartin,G. and Nevins,J. (1989) Genes Dev., 3, 2180-2189.

42 Murthy,K. and Manley,J. (1992) J. Biol. Chem., 267, 14804-14811.

43 Gilmartin,G. and Nevins,J. (1991) Mol. Cell. Biol., 11, 2432-2438.

44 Christofori,G. and Keller,W. (1989) Mol. Cell. Biol., 9, 193-203.

45 Martin,G. and Keller,W. (1996) EMBO J., 15, 2593-2603.

46 Raabe,T., Bollum,F.J. and Manley,J.L. (1991) Nature, 353, 229-234.

47 Takagaki,Y., Ryner,L. and Manley,J. (1988) Cell, 52, 731-742.

48 Colgan,D.F., Murthy,K.G.K., Prives,C. and Manley,J.L. (1996) Nature, 384, 282-285.

49 Christofori,G. and Keller,W. (1988) Cell, 54, 875-889.

50 Ruegsegger,U., Beyer,K. and Keller,W. (1996) J. Biol. Chem., 271, 6107-6113.

51 Wahle,E. (1991) Cell, 66, 759-768.

52 Manley,J.L. and Takagaki,Y. (1996) Science, 274, 1481-1482.

53 Keller,W., Bienroth,S., Lang,K.M. and Christofori,G. (1991) EMBO J., 10, 4241-4249.

54 Jenny,A., Hauri,H. and Keller,W. (1994) Mol. Cell. Biol., 14, 8183-8190.

55 MacDonald,C.C., Wilusz,J. and Shenk,T. (1994) Mol. Cell. Biol., 14, 6647-6654.

56 Weiss,E.A., Gilmartin,G.M. and Nevins,J.R. (1991) EMBO J., 10, 215-219.

57 Wilusz,J. and Shenk,T. (1990) Mol. Cell. Biol., 10, 6397-6407.

58 Takagaki,Y., MacDonald,C.C., Shenk,T. and Manley,J.L. (1992) Proc. Natl. Acad. Sci. USA, 89, 1403-1407.

59 Takagaki,Y. and Manley,J.L. (1992) J. Biol. Chem., 267, 23471-23474.

60 Takagaki,Y. and Manley,J. (1994) Nature, 372, 471-474.

61 Gunderson,S.I., Vagner,S., Polycarpou-Schwartz,M. and Mattaj,I.W. (1997) Genes Dev., 11, 761-773.

62 Berget,S. (1995) J. Biol. Chem., 270, 2411-2414.

63 Gunderson,S., Beyer,K., Martin,G., Keller,W., Boelens,W. and Mattaj,I. (1994) Cell, 76, 531-541.

64 Wassarman,K.M. and Steitz,J.A. (1993) Genes Dev., 7, 647-659.

65 Frayne,E.G., Leys,E.J., Crouse,G.F., Hook,A.G. and Kellems,R.E. (1984) Mol. Ce