ABSTRACT
Transcription factors are modular entities built up of discrete domains, some devoted to DNA binding and others permitting transcriptional modulation. The structure of DNA binding domains has been thoroughly investigated and structural classes clearly defined. In sharp contrast, the structural constraints put on transactivating regions, if any, are mostly unknown. Our investigations focus on ERM, a eukaryotic transcription factor of the ETS family. We have previously shown that ERM harbours two transactivating domains (TADs) with distinct functional features: AD1 lies in the first 72 amino acids of ERM, while AD2 sits in the last 62. Here we show that AD1 is a bona fide acidic TAD, for it activated transcription in yeast cells, while AD2 did not. AD1 contains a 20 amino acid stretch predicted to form an [alpha]-helix that is found unchanged in the related PEA3 and ER81 transcription factors. Circular dichroism analysis revealed that a 32 amino acid peptide encompassing this region is unstructured in water but folds into a helix when the hydrophobic solvent trifluoroethanol is added. The isolated helix was sufficient to activate transcription and mutations predicted to disrupt it dramatically affected AD1-driven transactivation, whereas mutations decreasing its acidity had more gentle effects. A phenylalanine residue within the helix was particularly sensitive to mutations. Finally, we observed that ERM bound TAFII60 via AD1 and bound TBP and TAFII40, presumably via other activation domains.
The ETS family is a set of eukaryotic genes encoding transcription factors with a conserved DNA binding region, named the ETS domain (1 ,2 ). ETS genes are present throughout the metazoans (3 ) and play key developmental roles. For instance, pointed and yan control eye development in Drosophila (4 -6 ), while in the mouse B and T lymphocytes fail to develop in the absence of a functional ets1 (7 ).
Besides their physiological roles, genes of the ETS family can also be involved in oncogenesis. Indeed, v-ets is one of the oncogenes responsible for the transforming properties of E26 chicken leukaemia virus (8 ,9 ). Some other ETS genes, including fli (10 ), pu1 (11 ) and tel (12 ), are also proto-oncogenes, for their misregulation can lead to cell transformation (see 13 for a review).
Regulation of the activity of ETS transcription factors has recently become clearer and some of them have now been placed in well-defined signal transduction pathways. A series of genetic and biochemical investigations has, for instance, identified the Pointed (4 ,14 ), Ets1 and Ets2 (15 ) proteins as downstream effectors of the Ras/Raf/ERK pathway. Another ETS gene product, Elk, associates with SRF to mediate serum induction of c-fos (see 16 for a review). Elk can be activated by Ras through the ERK kinases, but is also targeted by Rac and CDC42 via the JNK kinases (17 ,18 ). In contrast to the wealth of information regarding the way ETS genes are activated or inactivated, little is known about how they achieve their physiological function, i.e. how they modulate transcription.
We have focused our interest on the PEA3 subgroup of the ETS family. To date it contains three genes with high sequence similarity: PEA3 (19 ), ER81 (20 ) and ERM (21 ).
The members of the PEA3 group appear to be involved in different aspects of cancer. ER81 is found fused to the EWS protein as a result of a (7;22)(p22;q12) chromosomal translocation in some cases of Ewing's sarcomas (22 ) and thus is a bona fide proto-oncogene. Several lines of evidence point to PEA3 as a factor in the metastatic process; in both neu-induced mammary tumours (23 ) and squamous cell carcinoma cell lines (24 ) its expression correlates with metastatic potential. Furthermore, it can stimulate transcription of several genes encoding matrix metalloproteinases (25 ) and its expression is sufficient to turn a non-invasive breast cancer cell line into an invasive one (26 ).
We had previously engaged in a study of the transactivation properties of ERM and found that it had two transactivating domains (TADs): AD1 (formerly [alpha]) lies in the first 72 amino acids of ERM and AD2 in the last 62 (27 ). The two TADs are functionally dissimilar and exhibit a synergistic effect on transcription.
AD1 (also described in 28 ) seems to have been conserved within the PEA3 group. Indeed, the homologous regions of ER81 and PEA3 can also activate transcription (29 ; J.H.Chen, personal communication). This functional conservation, presumably reflecting the physiological importance of AD1, prompted us to investigate its characteristics further.
Here we show that AD1 has functional features of the `acidic' class of TADs, which, for example, contains the TADs from herpes virus VP16 protein and the yeast regulators GAL4 and GCN4. However, we demonstrate that, unlike the VP16 TAD, which seems to be unstructured (30 ,31 ), or those of GAL4 and GCN4, which could adopt a [beta]-sheet conformation (32 ,33 ), AD1 depends for function on a 20 amino acid [alpha]-helix. This helix, fully conserved in ERM, ER81 and PEA3, requires a proper protein environment for folding and function and constitutes a single exon, raising the possibility that it has been inserted into an ERM/ER81/PEA3 precursor gene by exon shuffling. In addition, we show that ERM binds TAFII60 via AD1 and that other regions of ERM bind TBP and TAFII40.
To ease construction of GAL4(1-147) chimeras we engineered a new vector we call pGAP. It has a pSG5 backbone and promoter, an IRES sequence to increase mRNA translatability, the GAL4(1-147) coding sequence and an extended cloning linker.
pSG424AD1F47P was obtained by PCR of pSG424AD1 (formerly pSG424[alpha]; 27 ) with mutant primers changing a TTT codon to CCT. AD1, AD1F47P and AD2 were recloned from pSG424 into pGAP424 as EcoRI-XbaI restriction fragments. The helix within AD1 of ERM was cloned by PCR. Mutagenesis of pGAPAD1 was performed using the Chameleon double-strand mutagenesis kit (Stratagene). The unrelated linker peptide QSRAATAVELGTRSY was introduced into the GAL4-helix construct by site-directed mutagenesis, which removed the first stop codon.
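The TTT to CCT codon change mentioned above can be checked against the standard genetic code. The following is a minimal Python sketch that uses only the phenylalanine and proline entries of that code; the table and function names are illustrative assumptions and not part of the original protocol.

```python
# Partial standard genetic code: only the codons relevant to the F47P change.
CODON_TABLE = {
    "TTT": "F",  # phenylalanine
    "TTC": "F",
    "CCT": "P",  # proline
    "CCC": "P",
    "CCA": "P",
    "CCG": "P",
}


def translate_codon(codon: str) -> str:
    """Return the one-letter amino acid for a DNA codon in the partial table."""
    return CODON_TABLE[codon.upper()]


def count_differences(a: str, b: str) -> int:
    """Count the nucleotide positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


wild_type, mutant = "TTT", "CCT"
print(translate_codon(wild_type), "->", translate_codon(mutant))
print("nucleotide changes:", count_differences(wild_type, mutant))
```

The sketch shows that the substitution requires two nucleotide changes within the codon, which is why mutant primers rather than a single-base change are needed.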
ERM and its mutant derivatives were cloned into pSV (a gift of Guillaume Adelmant; see 34 ), an expression vector containing a HA tag at the N-terminus. ERM was excised from pSG424ERM (27 ) and inserted into pSV as an EcoRI-XbaI partial restriction fragment. The F47P mutation was introduced into ERM by site-directed mutagenesis. The C-terminal activation domain AD2 was removed from ERM or ERMF47P by PCR, yielding pSVERM[Delta]AD2 and pSVERMF47P[Delta]AD2.
For expression in yeast cells AD1, AD1F47P and AD2 were fused to GAL4(1-147) in vector pGBT9 as EcoRI-PstI (for the first two constructs) or EcoRI-BamHI fragments (for the last).
The 5*GAL4 Luc plasmid containing five GAL4 binding sites upstream of the luciferase gene was a kind gift of Dr Jay Gralla (pG5TATA Luc in 35 ). The ICAM(-176/-44) Luc (27 ) and 3*TORU Luc (36 ) plasmids have been described.
All PCR and mutagenesis products were sequenced on an automated sequencer (Applied Biosystems 373A).
For in vitro interaction assays plasmids encoding the different basal machinery components were used. pXJ41-TBPh encoding TBP, pXJ41-hTAFII20 encoding TAFII20 and pSK(-)-TAFII30 encoding TAFII31 were kindly provided by Dr P.Chambon. p85-KS encoding TAFII85 was a gift from Dr Y.Nakatani. pT[beta]STOP-TAF40 encoding TAFII40 and pT[beta]-dTAF60 encoding TAFII60 were from Dr R.Tjian.
RK13, HeLa and HC11 cells were maintained in DMEM supplemented with 10% fetal bovine serum at 37°C in a water-saturated atmosphere with 5% CO2.
Sub-confluent cells were transfected in 6-well plates with lipofectamine (Boehringer Mannheim) as previously described (27 ). They were harvested 24 h after transfection and luciferase activity was measured using a Berthold LB 9501 luminometer. An RSVLacZ plasmid was co-transfected and [beta]-galactosidase activity was assessed (Tropix Galacto-Light kit) to normalize data for transfection efficiency.
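The normalization step can be illustrated with a short Python sketch: each luciferase reading is divided by the [beta]-galactosidase reading from the same well, and the normalized value is then expressed relative to a control transfection. The function names and the raw readings are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
def normalized_activity(luciferase: float, beta_gal: float) -> float:
    """Correct a luciferase reading for transfection efficiency."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase reading must be positive")
    return luciferase / beta_gal


def fold_activation(sample: float, control: float) -> float:
    """Express a normalized activity relative to the control transfection."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return sample / control


# Hypothetical raw readings in arbitrary light units (not measured values).
control = normalized_activity(luciferase=1_000.0, beta_gal=50.0)
effector = normalized_activity(luciferase=60_000.0, beta_gal=100.0)
print(fold_activation(effector, control))
```

With these made-up readings the raw luciferase signals differ 60-fold, but the corrected activation is lower because the effector well was transfected twice as efficiently.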
Metabolic labelling of transiently transfected cells was performed 24 h after transfection, for 3 h in medium containing 100 µCi/ml [35S]Met/Cys. Immunoprecipitation was as described (37 ).
Strain Y187, which has an integrated LacZ gene downstream of GAL4 binding sites, was transformed by the PEG/LiAc method. Transformants were cultured in dropout medium lacking the appropriate amino acids and disrupted by shearing with beads as in Silverman et al. (38 ); [beta]-galactosidase activity in the extract was measured using the Tropix Galacto-Light kit and normalized to the protein concentration in the extract. Protein extracts used to perform immunoblots were also prepared as in Silverman et al. (38 ).
Structure predictions were performed with the PHD program (39 ,40 ), available at predictprotein@embl-heidelberg.de.
Circular dichroic spectra were recorded on a Jobin-Yvon CD6 Dichrograph in a 1 mm path length quartz cell with a thermostated cell holder. The acidic domain peptide had the sequence GYDSEELFQDLSQLQETWLAEAQVPDNDEQFVPD. It was dissolved in aqueous solution with NaOH at pH 10.6, at a concentration of 1.4 × 10⁻⁵ M.
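The acidic character of this peptide can be read directly from its sequence. The Python sketch below counts acidic (D, E) and basic (K, R, H) residues in the sequence given above; the helper name and the residue groupings are illustrative assumptions, and the count ignores the peptide termini and any partial ionization of other side chains.

```python
# Peptide sequence as given in the text.
PEPTIDE = "GYDSEELFQDLSQLQETWLAEAQVPDNDEQFVPD"

ACIDIC = set("DE")   # aspartate, glutamate
BASIC = set("KRH")   # lysine, arginine, histidine


def count_residues(sequence: str, residues: set) -> int:
    """Count how many positions in the sequence belong to the residue set."""
    return sum(aa in residues for aa in sequence)


acidic = count_residues(PEPTIDE, ACIDIC)
basic = count_residues(PEPTIDE, BASIC)
print("acidic residues:", acidic)
print("basic residues:", basic)
```

The absence of basic side chains, together with the number of aspartate and glutamate residues, is consistent with the description of this region as an acidic domain.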
The effect of TFE titration was studied between 0 and 90% TFE (TFE volume/total volume) in aqueous alkaline solution at 20°C.
CD was expressed as mean residue ellipticity [[theta]] in deg·cm²/dmol. [[theta]] is related to the dichroic increment [Delta][epsilon] as follows: [[theta]] = (3300 × [Delta][epsilon])/n, where n is the number of residues.
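The conversion stated above can be written as a small Python function. This is a sketch under the assumption that [Delta][epsilon] is the dichroic increment of the whole peptide, so that dividing by the number of residues gives a per-residue value; the function name and the example inputs are hypothetical.

```python
def mean_residue_ellipticity(delta_epsilon: float, n_residues: int) -> float:
    """Convert a dichroic increment into mean residue ellipticity.

    Implements [theta] = (3300 * delta_epsilon) / n as given in the text.
    """
    if n_residues <= 0:
        raise ValueError("number of residues must be positive")
    return 3300.0 * delta_epsilon / n_residues


# Hypothetical dichroic increment, chosen only to illustrate the arithmetic.
print(mean_residue_ellipticity(delta_epsilon=-32.0, n_residues=32))
```

Negative values around 208 and 222 nm are the usual signature of helical content, so a more negative result at those wavelengths on TFE addition would indicate helix formation.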
ERM12-510 and ERM12-510F47P coding sequences were subcloned in-frame into the pGEX vector (Pharmacia) and GST fusion proteins were expressed in Escherichia coli. GST and fusion proteins were immobilized on glutathione-Sepharose beads, washed in binding buffer (20 mM HEPES, pH 7.6, 75 mM NaCl, 2.5 mM MgCl2, 0.1 mM EDTA, 0.05% Triton X-100, 0.5 mM DTT, 0.1 mM PMSF) and incubated with [35S]methionine-labelled, in vitro translated proteins in binding buffer. Proteins were allowed to interact at 4°C for 60 min. Bound proteins were washed four times with binding buffer, eluted with SDS sample buffer and subjected to SDS-PAGE.
The helix we detected appeared as an autonomous structural unit. We tried to determine whether it was also functionally autonomous and constituted a TAD on its own. To that end the helical region of the acidic domain (amino acids 43-62) was fused to the GAL4(1-147) DNA binding domain, which itself has little transcriptional modulation activity (44 ). Transient co-transfections were then performed in mammalian cells to monitor the effect of this chimera on expression of a reporter gene bearing GAL4 operators in its promoter. For comparison purposes we performed the same experiments with chimeras including the full-size 72 amino acid AD1 or the 32 amino acid acidic domain (see Fig. 3 A).
As shown in Figure 3 B, the GAL4-helix fusion was found to activate transcription, albeit with a 3-fold lower efficiency than the complete acidic domain, itself being about 4-fold less efficient than the entire AD1 (a result shown in 27 and reproduced here for comparison). Note that the activity of GAL4-helix is nevertheless undoubtedly higher than that of the control GAL4(1-147).
We thought that the helix could merely require additional protein length on its C-terminal side for appropriate folding and thus added a linker sequence to our construct to reconstitute a peptide with a length similar to that of the acidic domain. This did not, however, increase the transactivating potency of the helix (compare the activity of GAL4-helix with that of GAL4-helix-linker). We thus conclude that the helix is an autonomous activation domain that cooperates with neighbouring sequences for maximum activity. As will be shown below, these cooperating sequences are not themselves capable of transactivation.
Though the helix was sufficient for activity, there was no proof that helical structure in the acidic domain was necessary for transactivation. We thus engineered in the GAL4-AD1 construct a set of mutations chosen to disturb or not the helix (Fig. 4 A) and assessed the activity of the mutants as described above.
As shown in Figure 4 B, both mutations predicted to impair formation of the [alpha]-helix (F47P and L50P) had very severe effects on GAL4-AD1 transactivation levels, rendering them barely superior to that of the control GAL4(1-147). In contrast, an equivalent F -> P mutation introduced at residue 69, which lies outside the predicted helix, and a mutation replacing acidic residues but compatible with helical structure, E44S/D49A, had quite moderate consequences: these mutant TADs are only one-third less active than their wild-type counterpart.
The F47L mutation, though predicted to spare the helix, had an effect as dramatic as helix-breaking mutations. F47 could nevertheless be replaced by a W residue. It thus appears that an aromatic residue within the helix is crucial, a situation reminiscent of that observed in RelA (45 ,46 ; see Discussion for further details).
The transcriptional defects in the mutant proteins were observed at all plasmid doses tested (not shown) and are not, for instance, due to intense self-squelching effects. We also verified the wild-type and mutant proteins to be expressed at comparable levels by immunoprecipitation (not shown).
We thus conclude that the presence of the helix is necessary for activation by GAL4-AD1 and also that regions of AD1 outside the helix do not on their own activate transcription.
AD1 of ERM does not lie in its native environment in the GAL4 chimera and the fusion could render it artificially sensitive to the introduced mutations. We thus tested whether one of the point mutations characterized above, F47P, had a comparable effect when inserted into the native ERM molecule (Fig. 5 A).
We have recently found that transcription from the ICAM-1 (intercellular adhesion molecule 1) gene promoter can be stimulated by ERM (de Launoit et al., submitted for publication). The (-176/-44) fragment of this promoter bears only one high affinity ETS site and was used to drive expression of the luciferase reporter gene. Transient co-transfections were performed in the HC11 mouse mammary cell line. Under these conditions we found ERM to drive a 35-fold increase in reporter gene activity over the control vector, while the potency of ERMF47P was ~4-fold lower than that of ERM (Fig. 5 B). We have previously shown that ERM contains a second TAD, AD2, in its C-terminal tail (27 ). To determine whether the residual activity of ERMF47P could be attributed to AD2, we removed AD2 from ERMF47P and found the corresponding construct ERMF47P[Delta]AD2 to be a very poor transactivator, with ~10% of the activity of ERM. Note that deletion of AD2 from ERM is not in itself sufficient to deprive it of transactivating ability (see lane ERM[Delta]AD2). These effects were not dose dependent (not shown).
Mutations in transactivating modules can sometimes be partly compensated for by intermolecular cooperation when several mutated molecules are simultaneously recruited to the promoter (47 ). To test whether that applied to mutations in AD1 in the context of the full-size ERM, we assayed the activity of ERM and its derivatives on the 3*TORU Luc reporter plasmid, which harbours an artificial promoter with three ETS binding sites, in rabbit RK13 cells as previously described (27 ). Under these conditions transfection of ERM induced a 35-fold increase in reporter gene activity, while the effect of ERMF47P was ~6-fold lower (Fig. 5 B). Activity of ERMF47P[Delta]AD2 was hardly detectable.
We believe that the F47P mutation selectively impairs transcriptional activation and not another functional feature of ERM, for we verified that the mutant proteins: (i) are expressed at a level similar to that of the wild-type (immunoprecipitations not shown); (ii) are correctly targeted to the nucleus (immunofluorescence not shown); (iii) have the same in vitro DNA binding capacity as the wild-type counterpart (EMSA not shown).
We thus conclude that a point mutation in the [alpha]-helix in AD1 of ERM is indeed sufficient to cripple its transactivation potential. This effect is neither promoter- nor cell type-specific and cannot be overcome by intermolecular cooperation.
A hallmark of TADs of the acidic class, as opposed to the `glutamine-rich' ones for instance, seems to be their potential to activate transcription in yeast cells. Sequence inspection could lead us to surmise that both AD1 and AD2 were TADs of the acidic type. To learn if this was functionally true, we tested whether GAL4-AD1 and GAL4-AD2 chimeras could increase expression of a LacZ gene containing GAL4 operators in its promoter in the yeast strain Y187, which lacks endogenous GAL4 (Fig. 6 ). We found that this was clearly the case for AD1. As observed in mammalian cells, introducing the F47P mutation into GAL4-AD1 deprived it of most of its transactivating ability. This supports the widely accepted hypothesis that the mechanisms of transcriptional activation have been well conserved from yeast to mammalian cells (see for instance 48 ,49 ).
In contrast, AD2 proved not to achieve transcriptional stimulation under these conditions (Fig. 6 ) and is not functionally an acidic TAD. This was not due to poor expression, as all chimeric proteins were verified to be similarly expressed by immunoblotting (not shown).
This result furthers our previous observation that the two ERM TADs have dissimilar mechanisms of action, possibly explaining their synergistic effect.
We then sought to identify targets of the ERM activation domains within the transcription machinery. To this end we transcribed and translated different cDNAs encoding components of the TFIID complex in vitro and assayed their ability to interact with bacterially expressed ERM or derivatives thereof.
TAFII20, TAFII31 and TAFII85 failed to interact with ERM (Fig. 7 ).
TBP, TAFII40 and TAFII60 formed complexes with GST-ERM, but not with the GST moiety alone. We performed similar experiments with GST-ERM F47P, which bears a mutation crippling AD1. This single amino acid change left the interactions between ERM and TBP or TAFII40 unchanged, but abolished contacts between ERM and TAFII60.
We therefore conclude that AD1 directly contacts TAFII60. The ERM-TBP and ERM-TAFII40 interactions might be carried out via AD1 and not be affected by the F47P mutation or, more likely, involve other regions of ERM.
A region within amino acids 43-62 of ERM has the potential to structure into an [alpha]-helix (Figs 1 and 2 ) and the results of our mutagenesis studies clearly show that helical potential in this region is required for AD1 activity, negative charge being of lesser importance (Figs 4 and 5 ). Yet a peptide containing the potential helix is unstructured in water. Two interpretations could reconcile this finding with our model of a helical activation domain.
The first surmises that protein sequences present in native ERM, but absent from the peptide used in CD spectroscopy, are required for the helix to fold into its structure. Nevertheless, this is hardly compatible with the observation that the same peptide, when tethered to the GAL4 DNA binding domain, activates transcription, unless one assumes that the GAL4 protein itself induces structure in the helix.
Alternatively, it could be that the helix acquires structure by an `induced fit' mechanism. The transcriptional activator VP16 has been shown to induce conformational changes in TFIIB upon contact (50 ); it is thus conceivable that, conversely, the activating helix folds when reaching its target. An example supporting this hypothesis comes from studies on RelA. RelA indeed contains a helical acidic TAD and a 32 amino acid peptide spanning this region is unstructured in water but folds when TFE is added (46 ). NMR showed this activating region also to be unstructured in water in the context of a 123 amino acid RelA fragment (46 ). Besides, a RelA C-terminal activation domain-derived peptide in water, though unstructured, can interact with its targets (45 ). Although more detailed studies would be necessary to ascertain this point, we believe a comparable situation occurs in AD1 of ERM: the helix may be unstructured at rest but fold upon meeting its interaction partner. Further support for this theory derives from a structural study of the acidic p53 activation domain. A peptide derived from this activation domain is mainly unstructured in solution, but turns into a helix upon contact with MDM2 (51 ).
The helix contained within exon 4 of ERM likely constitutes the active core of AD1. Indeed, a point mutation at amino acid 47 totally abrogates transactivation by AD1, implying that regions other than the helix are not on their own sufficient for transactivation (Figs 4 and 5 ). Nevertheless, it appears in the context of the GAL4 chimera that removing either the 1-42 or the 63-72 region notably reduces helix-driven transactivation (Fig. 3 ). How then could it be that these segments act as potentiators while being themselves inactive? Possibly the helix, when directly tethered to GAL4, is constrained in a non-optimal situation and the neighbouring regions merely act as inert linkers providing flexibility. We judge this unlikely, though, for a peptide as short as 15 amino acids has already been shown to activate transcription when tethered to GAL4(1-147) (52 ). Also, more conclusively, the sequences C-terminal of the helix could not be replaced by a peptide chain of similar length without a clear activity decrease (Fig. 3 ).
A recent report has provided a striking illustration that secondary structure does not only depend on amino acid composition, but also on neighbouring sequences (53 ). We thus favour the hypothesis that the regions flanking the helix provide a propitious environment for helix formation. This may in particular explain the high degree of sequence conservation between ERM, ER81 and PEA3 in the 63-72 region of the acidic domain and in the 30-41 region between ERM and ER81.
ETS family genes share a well-conserved DNA binding domain, but in contrast have evolved a variety of TADs (see 54 -58 for illustrations). A striking example is found in Ets1 and Ets2, which display 95% amino acid identity in the 85 amino acid ETS domain and, at least in vitro, bind to the same target sequences (59 ) but have unrelated TADs (54 ). Knocking out ets1 in the lymphocyte lineage has a dramatic effect and unambiguously argues that ets1 and ets2 are not redundant (7 ). Their specificity of action may stem from different features: differences in expression pattern, in cofactor recruitment or in transactivation characteristics. It is possible that when bound to the same DNA sequence in the same cell Ets1 and Ets2 have distinct transcriptional effects. Indeed, different TADs have distinct cell type and promoter element specificities (60 -62 ). Acquisition of unrelated TADs by different ETS genes thus may permit regulation of target genes with dissimilar promoter contexts in diverse cell types.
Also, one can notice that many transcription factors, including most of the ETS family members, possess several TADs. The acquisition of additional TADs within the same molecule may be beneficial, for these domains may cooperate to yield higher induction of target gene expression (63 ,64 ).
We have identified the functional core of ERM AD1 as an [alpha]-helix that is encoded by a single exon, ERM exon 4. The remaining part of the acidic domain, also highly conserved in ER81 and PEA3, positively regulates the activity of the helix and is also encoded by a single exon, exon 5 of ERM (41 ). The intron between exons 4 and 5 is only 128 bp long. As exons 3 and 6 of ERM have compatible reading frames and are separated by a particularly large intron, we view it as possible that the exon 4-intron-exon 5 block became inserted into the PEA3/ERM/ER81 progenitor by an exon shuffling mechanism. Notably, the exon 4-intron-exon 5 module is flanked by phase 1 introns, as are most of the modules building up mosaic proteins (65 ). No region homologous to the acidic domain is found in the closest PEA3 relatives, elk and GABP[alpha] (see 66 for a phylogeny of the ETS family); the insertion event must therefore have taken place after the progenitors of these genes split. We think that acquisition of AD1 by the PEA3 group members exemplifies the way ETS proteins have become increasingly diverse. This diversity, along with evolution of various promoter and enhancer elements, now allows them to regulate developmental events as distant as haematopoiesis (67 ) and skeletal development (68 ).
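The reading-frame argument behind exon shuffling can be made explicit with a short Python sketch. It assumes the usual convention that an intron's phase (0, 1 or 2) is the number of nucleotides of the interrupted codon that lie upstream of the intron, and that a module inserts into an intron of the recipient gene; the function names and the example phases other than the phase 1 introns mentioned above are illustrative.

```python
from typing import Tuple


def is_symmetric_module(phases: Tuple[int, int]) -> bool:
    """A module is symmetric when its two flanking introns share one phase."""
    upstream, downstream = phases
    return upstream == downstream


def insertion_preserves_frame(module_phases: Tuple[int, int],
                              recipient_intron_phase: int) -> bool:
    """Check whether inserting the module into an intron keeps the frame.

    The frame is kept only if the module is symmetric and its flanking
    phase matches the phase of the intron that receives it.
    """
    return (is_symmetric_module(module_phases)
            and module_phases[0] == recipient_intron_phase)


# Module flanked by phase 1 introns, inserted into a phase 1 intron.
print(insertion_preserves_frame((1, 1), recipient_intron_phase=1))
# An asymmetric module would shift the downstream reading frame.
print(insertion_preserves_frame((1, 2), recipient_intron_phase=1))
```

Under this convention a module flanked by two phase 1 introns contributes a whole number of codons, which is why such modules can be inserted without disrupting the downstream exons.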
Many mechanisms of transcriptional activation have been documented. Some of them involve interactions between the AD and components of the basal transcription complex TFIID, comprising TBP and the TAFs (69 ). Here we show that AD1 contacts TAFII60. We think that this interaction is of functional relevance, for a mutation that renders AD1 inactive also abrogates its ability to target TAFII60. Other ADs contacting TAFII60 include those of Bicoid and p53 (70 ,71 ). Interestingly, the latter contains an [alpha]-helical stretch with critical aromatic residues (51 ) and therefore might be partially related to AD1 of ERM.
We found that ERM also binds TBP and TAFII40. These interactions are not abrogated by the F47P mutation that inactivates AD1. Thus we think they are carried out by different regions of the protein and, most likely, by AD2.
The ability of AD1 and AD2 to target different components of the pre-initiation complex could explain the synergistic effect they exert on transcription (27 ).
Dr Jay Gralla provided the 5*GAL4 Luc plasmid used in this work. We thank Guy Lippens, Patrick Martin, Hélène Pelczar, Anne Chotteau and other members of our group for discussions. Claire Montpellier, Isabelle Brun, Jean Philippe Basuyaux, Elisabeth Ferreira and Brigitte Quatannens offered friendly advice and Laurent Pouilly efficient assistance. J.H.Chen shared unpublished results. J.P.Levillain kindly initiated our collaboration with the Institut Gustave Roussy. We feel particularly indebted to Guillaume Adelmant for his invaluable support. This work was funded in part by grants from the Centre National de la Recherche Scientifique, the Association pour la Recherche sur le Cancer, the Ligue contre le Cancer, the FEGEFLUC and the Institut Pasteur de Lille.
REFERENCES
