ABSTRACT
The epidermal growth factor receptor (EGFR) is encoded by the c-erbB1 proto-oncogene and plays an important role in the control of cell growth and
differentiation. To study the potential growth regulatory role of soluble EGF receptors, we have isolated cDNA clones encoding a truncated, secreted form of the human EGFR. The 5' sequence of this cDNA is identical to the EGFR transcript encoding the full-length receptor through exon 10. The unique 3' sequence encodes two additional amino acid residues before encountering an in-frame stop codon, a poly(A) addition site and a poly(A)+ tail. Sequence comparison with genomic DNA sequences demonstrates that this
alternative transcript arises by read-through of a splice donor site. As a result, this transcript encodes a
portion of the extracellular ligand-binding domain, but lacks the transmembrane domain and the intracellular
tyrosine kinase catalytic domain present in the EGFR. Conditioned medium from transfected fibroblast cells contains a 60 kDa protein that is specifically
immunoprecipitated by an EGFR monoclonal antibody. These findings demonstrate
that alternative processing of the human EGFR transcript produces a secreted
product composed of only the extracellular ligand-binding domain.
The epidermal growth factor receptor (EGFR) plays an important role in the control of cell growth and differentiation. Understanding the function of this receptor in tumorigenesis is of great interest because the
overexpression of the EGFR in human carcinomas is frequently associated with a poor prognosis. The EGFR is encoded by the c-erbB1 proto-oncogene (1, 2) and is structurally related to three receptor tyrosine kinases, known as ErbB2/Neu (3), ErbB3 (4, 5) and ErbB4 (6). These receptors are encoded by distinct genes and together make up the c-erbB family of proto-oncogenes.
The EGFR includes three functional domains: an extracellular ligand-binding domain, a transmembrane domain, and a cytoplasmic tyrosine kinase
domain. The extracellular domain can be further divided into four subdomains (I-IV), including two cysteine-rich regions (II and IV) and two regions (I and III) involved in ligand binding (7, 8).
The 170 kDa human EGFR is encoded by two major transcripts of 5.8 and 10.5 kb (2). Additional alternatively spliced transcripts of approximately 2.6-2.7 kb have been identified in normal chicken and rat tissues; these transcripts encode secreted, truncated receptors containing only the extracellular ligand-binding domain (9, 10). Furthermore, soluble EGF receptors occasionally arise from aberrant transcripts, as exemplified by the epidermoid carcinoma cell line A431 (2). In A431 cells, the EGFR gene is amplified and rearranged, and a 2.8 kb transcript results from a translocation between the 5' region of the EGFR gene and an unidentified region of genomic DNA (2, 11, 12).
Soluble truncated receptors lacking their transmembrane and cytoplasmic domains have also been reported for ErbB2 and ErbB3 (13, 14). Moreover, many transmembrane growth factor and cytokine receptors have analogous soluble, ligand-binding forms that are detectable in the culture supernatants of cell lines and in biological fluids such as serum and urine (15). The widespread distribution of soluble receptors suggests that these molecules may have important physiological roles.
Our laboratory was involved in the initial discovery and characterization of the soluble truncated form of avian ErbB1 (9), which was subsequently demonstrated to have growth inhibitory potential in vitro (16). To study the potential growth regulatory role of soluble EGF receptors in human carcinomas, we have isolated cDNA clones encoding a truncated, secreted form of the human EGF receptor.
The EGFR cDNA clone, pXER, was provided by G. Gill (17, 18). The following monoclonal antibodies, which specifically recognize the extracellular domain of the EGFR, were used: R1 (Amersham RPN.513), LA1 and LA22 (Upstate Biotechnology Inc. 05-101 and 05-104), and 528 and 225 (Oncogene Sciences, Ab-1 and Ab-5).
A 1.9 kb cDNA probe corresponding to the ligand-binding domain (LBD) of the EGFR was synthesized by the polymerase chain reaction (PCR) from pXER. The forward primer was 5'-TCGGGGAGCAGCGATGCGAC-3', corresponding to bp 174-193. The reverse primer was 5'-CCATTCGTTGGACAGCCTTC-3', representing bp 1986-2105. Nucleotide numbering is according to Ullrich et al. (2) unless stated otherwise. Amplification was performed for 35 cycles (94°C 1 min, 65°C 1 min, 72°C 3 min) with a final extension at 72°C for 10 min. A 768 bp EcoRI fragment from pXER was gel purified and used as the intracellular kinase domain (KD) probe. The LBD and KD probes were radiolabeled with [α-32P]dCTP using a random primer DNA labeling kit (Gibco BRL) according to the manufacturer's instructions.
A human placenta cDNA library (Clontech, catalog # HL1144x) was screened for clones encoding only the ligand-binding domain of the EGFR. Duplicate nitrocellulose filters containing 640 000 recombinant phage were hybridized separately with radiolabeled LBD or KD probes. The hybridizations were performed in a solution containing 6× SSC, 5× Denhardt's, 7.5% dextran sulfate, 0.5% N-lauryl sarcosine, and 100 µg/ml salmon sperm DNA at 65°C. Filters were washed in 0.1× SSC and 0.1% N-lauryl sarcosine at 65°C and were exposed for 24-72 h at -80°C with an intensifying screen. Plasmid DNA was released from the pλDR2 vector by site-specific recombination using the Cre-lox system (19). Clones were sequenced on both strands using the Taq DyeDeoxy cycle sequencing kit and the Applied Biosystems model 373A automated DNA sequencer. In GC-rich regions of the templates, 1 µl dimethyl sulfoxide was added to the sequencing reactions.
Intron 10 of the EGFR was amplified by PCR from human genomic DNA. The forward primer was EX10F: 5'-TGACTCCTTCACACATACTC-3', corresponding to bp 1320-1339 in exon 10. The reverse primer was EX11R: 5'-TTCTCAAAGGCATGGAGGTC-3', representing bp 1432-1451 in exon 11. Human DNA (50 ng) was mixed with 20 pmol of each primer, 100 µM of each deoxynucleotide, 2.5 U Taq polymerase (Boehringer Mannheim Biochemicals), and 5 µl of 10× buffer (supplied with the Taq) in a total volume of 50 µl. Amplification was performed for 35 cycles (94°C 1 min, 65°C 1 min, 72°C 2 min) with a final extension at 72°C for 10 min. The PCR product was ligated into a TA-cloning vector (Invitrogen). Plasmid DNA was isolated from two independent colonies and was sequenced as above. In addition, PCR primers were designed from the flanking exon and divergent sequences in clones 281, 721, 713, 711, and 152. The oligonucleotide sequences are available upon request.
Total cellular RNA was isolated from the human placental cell line 3A-Sub-E (ATCC#: CRL 1584) by the guanidine isothiocyanate extraction procedure (20). Prior to the reverse transcription reaction, the RNA was treated with RNase-free DNase I (Boehringer Mannheim Biochemicals) and extracted twice with an equal mixture of phenol and chloroform. RNA (1 µg) was heat denatured at 90°C for 5 min and then reverse transcribed in a 20 µl reaction mixture (1× AMV reaction buffer, 1 mM each dNTP, 10 mM dithiothreitol, 20 U RNasin, 10 U AMV reverse transcriptase) using 0.1 µg of oligo-dT primer for 1 cycle of 24°C 10 min, 42°C 50 min, 99°C 5 min, and 4°C 5 min. Taq DNA polymerase and 10 pmol each of the forward primer EX10F and one of the reverse primers, EX11R or P161R (5'-CCAAGGGAACAGGAAATATG-3'), were added to a final volume of 100 µl. Amplification was performed as described above. Products were analyzed after electrophoresis on a 5% polyacrylamide gel and staining with ethidium bromide.
The quail fibroblast cell line QT6 (21) was maintained in Dulbecco's modified Eagle's medium (DMEM, Biowhittaker) containing 4.5 g of glucose per liter and supplemented with 5% fetal calf serum (FCS) and 1% chick serum. Cells were transfected transiently with 15 µg of the expression vector pDR2 containing cDNA 161 (pDR161) by the calcium phosphate precipitation technique as described previously (22).
Transfected cells from two 100 mm plates were pooled and replated in 6-well plates approximately 48 h post-transfection. The following day, cells were rinsed once in phosphate-buffered saline (PBS) and labeled in methionine-free DMEM supplemented with 5% dialyzed FCS and 150 µCi/ml of [35S]methionine (Promix, Amersham) at 37°C for 12 h. Conditioned medium from labeled cells was collected and centrifuged briefly to remove loose cells and debris; phenylmethylsulfonyl fluoride (PMSF) and aprotinin were then added to final concentrations of 1 mM and 50 µg/ml, respectively. Cell monolayers were lysed and immunoprecipitated by the addition of 1-5 µg of monoclonal antibody as described previously (23). Samples were resuspended in 2× Laemmli sample buffer (125 mM Tris-HCl, pH 6.8, 4% SDS, 20% glycerol, 10% 2-mercaptoethanol, 2 mM EDTA, 0.04% bromophenol blue), boiled for 5 min and separated by 10% SDS-PAGE. Gels were stained with Coomassie blue, treated with EnHance (Dupont) and dried before an overnight exposure to film.
We and others have observed a 1.8 kb transcript in human placental RNA that hybridizes exclusively to an EGFR extracellular domain probe (2, 24, and data not shown). These results suggested that alternative transcripts encoding only the extracellular ligand-binding domain of the human EGFR might exist. Therefore, we used differential hybridization to screen an oligo-dT primed human placental cDNA library for clones that were positive for a ligand-binding domain (LBD) specific probe, but negative for a kinase domain (KD) probe. Eleven clones hybridizing exclusively to the LBD probe were purified from approximately 6 × 10^5 plaques (Fig. 1).
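The differential screen described above amounts to a set difference: plaques that hybridize to the LBD probe, minus those that also hybridize to the KD probe. The following is a minimal sketch under that reading; the function name and the plaque identifiers are hypothetical placeholders and are not data from this study.

```python
def lbd_only_plaques(lbd_positive, kd_positive):
    """Return plaque IDs that hybridized to the LBD probe but not the KD probe."""
    return sorted(set(lbd_positive) - set(kd_positive))


# Hypothetical plaque IDs scored on duplicate filters (illustration only).
lbd_hits = ["p01", "p02", "p03", "p04"]
kd_hits = ["p02", "p04", "p05"]

print(lbd_only_plaques(lbd_hits, kd_hits))
```

Plaques positive for both probes are expected to carry full-length or near full-length EGFR sequences and are discarded by the subtraction.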
Five clones contained sequences identical to the EGFR coding region through exon 10 (Fig. 1). Sequence analysis revealed that clone 161 contained a 1593 bp insert comprising 244 bp of 5' untranslated region and a reading frame encoding 381 amino acids. Clones 161 and 763 were nearly identical in sequence except that clone 763 contained one additional nucleotide (C) at the 5' end and had a much longer poly(A)+ tail, apparently added 13 bp upstream of the cleavage site in clone 161. The unique sequence at the 3' ends of clones 161, 763, 801, 681, and 281 exhibited an in-frame termination codon (TGA) and an AATAAA sequence, followed by a poly(A)+ tail. The divergent region begins with the sequence GTTTG, which contains the highly conserved GT dinucleotide and a G at the +5 position of the consensus splice donor site. Thus, it appears that the truncated transcript fails to splice at the 3' end of exon 10 and reads through the intron for 117 bp to an alternative poly(A) addition site.
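The splice donor argument above can be illustrated with a small check. This is a sketch only and was not part of the original analysis: the function name `looks_like_splice_donor` is hypothetical, and it tests just the two features named in the text (GT at intron positions +1/+2 and G at +5), not the full donor consensus.

```python
def looks_like_splice_donor(intron_start: str) -> bool:
    """Return True if the first intron bases show GT at +1/+2 and G at +5."""
    seq = intron_start.upper()
    if len(seq) < 5:
        return False
    # Python indices 0-1 are intron positions +1/+2; index 4 is position +5.
    return seq[:2] == "GT" and seq[4] == "G"


# GTTTG is the start of the divergent sequence reported in the text.
print(looks_like_splice_donor("GTTTG"))
```

A positive result only indicates that the divergence point is compatible with an unused donor site; the genomic comparison is what establishes read-through.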
Six clones contained sequence insertions at various locations (Fig. 1). Sequence analysis of these clones revealed the presence of consensus splice donor or acceptor sites located between the unique sequence and sequence identical to the EGFR cDNA (Table 1). Comparison of these putative intron sequences with GenBank revealed no similarity to any known sequences. In contrast, the unique 3' sequence in clone 152 did not diverge at a predicted exon-intron junction. Sequence comparison of this peculiar 40 bp with GenBank sequences revealed complete identity to the chorionic somatomammotropin hormone (CSH-1) gene, which is expressed at very high levels in placenta (25). Similarly, clone 701 apparently resulted from a rearrangement between the sequence 5'-CCTTTGAG-3' located at nucleotides 1442-1449 in the extracellular region and at 4062-4069 in the 3' untranslated region of the 5.8 kb EGFR cDNA (2), deleting the intervening region.
Table 1
Clones 711 and 721 were apparently oligo-dT primed from a poly(A) sequence associated with an Alu repetitive element in intron 15, while clones 713 and 714 also contain a stretch of 15 As at their 3' ends. Clones 714, 721, 711, and 713 do not contain an AATAAA sequence upstream of their poly(A)+ sequences, suggesting that they were oligo-dT primed from internal poly(A) tracts present in the intervening sequences. In addition, none of these clones contained open reading frames that extended into the intron sequences. We conclude that these clones are artifacts of the cloning process, presumably derived from incompletely processed transcripts that were reverse-transcribed during the construction of the cDNA library. Therefore, they do not represent additional alternatively processed transcripts.
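One of the criteria used above, the presence or absence of an AATAAA signal upstream of the 3' poly(A) stretch, can be sketched as a simple string test. The function name, the 40 nt search window, and both toy sequences are assumptions made for illustration; none of them comes from the clones described here.

```python
def has_upstream_polya_signal(cdna: str, window: int = 40) -> bool:
    """Check for AATAAA within `window` nt upstream of the 3' poly(A) stretch.

    Limitation: trailing A residues are stripped wholesale, so a signal that
    directly abuts the poly(A) stretch would be missed.
    """
    seq = cdna.upper()
    body = seq.rstrip("A")  # remove the 3' run of A residues
    return "AATAAA" in body[-window:]


# Toy sequences (hypothetical): one with a signal, one resembling internal priming.
with_signal = "GCTAGCTTGACCAATAAAGTCTGATTGCCTG" + "A" * 15
internal_priming = "GCTAGCTTGACCGGTCTGATTGCCTG" + "A" * 15

print(has_upstream_polya_signal(with_signal))
print(has_upstream_polya_signal(internal_priming))
```

A missing signal is suggestive rather than conclusive, which is why the text combines it with the absence of an extended open reading frame.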
The sequence of cDNA 161 corresponding to the 5' untranslated region and exons 1-10 was compared with published sequences of the EGFR cDNA. Overall, the sequences were nearly identical; nucleotide differences were observed at 15 locations, 14 of which occurred in the G-C rich 5' untranslated region (Fig. 2). It is unclear whether any of these base pair differences might affect promoter function. They do not appear to disrupt transcription factor Sp1 binding sites, nor are they located near mapped transcription start sites (26-28). One nucleotide difference unique to cDNA 161 occurred in codon 134, a C-to-T change in the third position, which did not change the encoded amino acid (data not shown). These sequence variations may have resulted from cDNA cloning artifacts or difficulties in sequencing G-C rich regions, or they may represent sequence polymorphisms.
To determine whether the divergent cDNA sequences were contiguous in the genome with the flanking exons, human DNA was amplified by PCR using primers specific for the flanking exon and the unique sequences. Primers specific for exon 10 (EX10F) and the 3' sequence of the truncated EGFR transcript (P161R) amplified a 159 bp product from both human DNA and cDNA 161 (data not shown). These results were consistent with the read-through of a 5' splice donor site as the source of the unique sequence. To further confirm that the 3' sequence of cDNA 161 was derived from intron 10, we amplified genomic DNA with primers specific for exons 10 and 11. The 962 bp PCR product was cloned and sequenced. Comparison of intron 10 and cDNA 161 sequences showed complete identity up to the poly(A) addition site, verifying the read-through of the 5' splice site as the origin of the novel sequence (Fig. 3).
Our initial attempts to use a 3' specific probe from cDNA 161 did not reveal a detectable signal on northern blot analysis because of the relatively low abundance of the 1.8 kb transcript, as well as the difficulty in generating a small probe (~70 bp) with high specific activity. To determine whether the transcript represented by cDNA 161 is expressed in human placenta, we developed an RNA-based PCR assay. RNA from a human placental cell line was reverse transcribed with an oligo-dT primer, and the first strand cDNA was amplified using primers specific for EGFR exons 10 and 11 or primers specific for exon 10 and the 3' unique sequence of cDNA 161. Specific products of the predicted sizes (132 bp and 159 bp, respectively) were obtained, while no products were observed when the reverse transcriptase was omitted (Fig. 3C).
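The predicted size of the exon 10-exon 11 product follows directly from the primer coordinates given for EX10F (bp 1320-1339) and EX11R (bp 1432-1451). The short sketch below shows that arithmetic; the helper name `amplicon_length` is hypothetical, and the calculation assumes both coordinates refer to the same spliced cDNA numbering, so it does not apply to genomic templates that include intron 10.

```python
def amplicon_length(forward_start: int, reverse_end: int) -> int:
    """Length in bp of a PCR product spanning forward_start..reverse_end, inclusive."""
    if reverse_end < forward_start:
        raise ValueError("reverse_end must not precede forward_start")
    return reverse_end - forward_start + 1


# EX10F begins at bp 1320 and EX11R ends at bp 1451 (spliced cDNA numbering).
print(amplicon_length(1320, 1451))
# LBD probe: forward primer begins at bp 174, reverse primer ends at bp 2105.
print(amplicon_length(174, 2105))
```

The first value matches the 132 bp RT-PCR product, and the second is consistent with the 1.9 kb LBD probe.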
The amino acid sequence deduced from cDNA 161 predicted a 381 amino acid protein with a calculated molecular mass of 44 661 daltons. The first 24 amino acids constitute a signal peptide; following cleavage by signal peptidases, the predicted molecular mass of this protein would be 42 396 daltons. The sequence encodes subdomains I and II and a portion of subdomain III of the extracellular ligand-binding domain of the EGFR, and also retains six of 12 potential N-linked glycosylation sites (Asn-X-Ser/Thr). The final two residues, leucine and serine, are unique to this molecule and are followed by an in-frame termination codon (TGA), nine nucleotides downstream of the point of divergence with the EGFR cDNA. As a result, the predicted product lacks the transmembrane domain and the intracellular tyrosine kinase catalytic domain present in the EGFR. We have named this product soluble ErbB1 (ErbB1-S) because it is structurally related to the avian c-erbB1 soluble product and because it is not yet known whether this truncated receptor is able to bind EGF.
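Potential N-linked glycosylation sites of the Asn-X-Ser/Thr type can be located with a simple pattern scan. The sketch below is an illustration only: the function name and the toy peptide are hypothetical, and excluding proline at the X position is a common convention assumed here rather than something stated in the text.

```python
import re


def n_glyc_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn in Asn-X-Ser/Thr motifs (X is not Pro)."""
    seq = protein.upper()
    # The lookahead lets overlapping motifs each be reported.
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", seq)]


# Toy peptide (hypothetical), not the ErbB1-S sequence.
print(n_glyc_sequons("MNGSANPTLNKTQ"))
```

Such a scan only flags candidate sites; whether a given site is actually glycosylated has to be determined experimentally.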
Because clone 161 contains a signal peptide, but lacks the transmembrane domain, we predicted that the protein encoded by this cDNA should be secreted. To test this hypothesis, a quail fibroblast cell line, QT6, was transfected with the pDR2 mammalian expression vector containing clone 161 (pDR161) under the control of the Rous sarcoma virus LTR (19). Cells and conditioned media were subsequently analyzed for expression of this truncated protein by immunoprecipitation with monoclonal antibodies directed against the extracellular domain of the EGFR. Immunoprecipitation of mock-transfected cells failed to reveal a specific EGFR-related protein in either cell lysates or conditioned media, while a 115 kDa soluble, truncated EGFR was immunoprecipitated from the media of control A431 cells (Fig. 4A). As predicted, immunoprecipitation of conditioned media from transfected cells revealed a heterogeneous 55-65 kDa species that was specifically recognized by the EGFR monoclonal antibody R1 (31).
In this study, we report the molecular cloning of a cDNA encoding a truncated, secreted form of the human EGFR. This cDNA clone contains a 1.6 kb insert and, assuming a poly(A) tail of ~200 bp, is of a size consistent with a 1.8 kb transcript from the human EGFR gene. We further demonstrate that this transcript is expressed in human placenta and that it arises by the read-through of a splice donor site and the use of an alternative poly(A) addition signal located in intron 10. In addition, translation of this cDNA in transfected fibroblasts produces a secreted 60 kDa protein that can be immunoprecipitated with a monoclonal antibody specific for the extracellular domain of the EGFR.
It is not known what factors might be involved in the generation of variant EGFR transcripts. Conceivably, proteins may be involved in promoting or inhibiting either the splicing or the cleavage-polyadenylation reactions. Another possibility is that the variant transcripts may initiate at different transcription start sites. The human EGFR promoter region does not possess typical TATA or CAAT boxes, and RNA transcription has been shown previously to initiate at multiple sites (27, 28). The 5' sequence of cDNA 161 extends 244 nucleotides upstream of the translation start site, suggesting that this transcript was initiated from the major in vivo start site located at position -255 (27). Thus, the selective use of polyadenylation signals in the expression of ErbB1-S does not appear to be associated with the differential use of transcription start sites.
Human ErbB1-S contains subdomains I and II and the amino-terminal half of subdomain III of the EGFR extracellular domain (Fig. 5A) and resembles the secreted receptor encoded by the avian 2.6 kb c-erbB1 transcript (Fig. 5B). However, human ErbB1-S is structurally distinct from the truncated EGFR produced in normal rats and in the human A431 carcinoma cell line. Both the rat and the A431 truncated receptors contain subdomain IV and diverge from the full-length receptor five amino acids upstream of the transmembrane domain.
We gratefully acknowledge A. Lampland for assistance with the RT-PCR assay. This work was supported by NIH grants CA09441 and CA68747 to
J.L.R. and CA57534 to N.J.M. and by the Mayo Foundation.
REFERENCES




