DDBJ/EMBL/GenBank accession no. Y12670
ABSTRACT
The leptin receptor (OB-R) is a single membrane-spanning protein that mediates the weight-regulatory effects of leptin (OB protein). Several mRNA splice variants have been described which either encode OB-R proteins with cytoplasmic domains of different length or the OB-R and B219/OBR variants, which have different 5'-untranslated regions. Here we report evidence for the synthesis of a human mRNA splice variant of the OB-R gene that potentially encodes a novel protein, leptin receptor gene-related protein (OB-RGRP), which displays no sequence similarity to the leptin receptor itself. This OB-RGRP transcript contains the first two OB-R gene 5'-untranslated exons, but then is alternatively spliced to two novel exons which were mapped to a yeast artificial chromosome containing the leptin receptor gene. First identified by analysis of a large human expressed sequence tag database, the OB-RGRP transcript has now also been found in human and mouse tissues by the use of PCR. Preliminary experiments suggest that OB-RGRP and the OB-R variants share similar patterns of expression that are distinct from that of the B219/OBR variant. OB-RGRP is highly homologous to putative open reading frames in both yeast and Caenorhabditis elegans, suggesting a phylogenetically conserved role for this novel protein.
Leptin and the leptin receptor have recently been reported to play key roles in the regulation of body weight of rodents. The obese phenotype of ob/ob mice was thus shown to result from a single mutation in the ob gene (1), which codes for leptin. The leptin receptor (OB-R) is encoded by a gene found to be defective in obese db/db mice and in fa/fa Zucker and fak/fak Koletsky rats (2-5). The OB-R is a single membrane-spanning receptor homologous to members of the class I cytokine receptor family (6,7). Two 5'-untranslated regions (5'-UTRs) and several 3'-alternative splice variants encoding OB-R with cytoplasmic domains of different length have been described in mouse, rat and human (2,3,6,8-10). Two major isoforms, B219/OBR and OB-R, differ in their 5'-UTR and in their expression patterns (10).
A single transcription unit may serve to generate more than one protein. For instance, several isoforms can be derived from a single gene locus by alternative pre-mRNA splicing (11). The use of alternative promoters or polyadenylation sites may also generate proteins with different N- or C-terminal regions.
Leaky reading at the first AUG during initiation of translation has been described as another potential mechanism to generate different gene products. Initiation at the first or second AUG generates either long and short isoforms or unrelated proteins when the AUGs are in different, overlapping reading frames (12).
We report here that alternative splicing in the OB-R gene may generate either the OB-R transcripts or another transcript containing the 5'-UTR of OB-R in which an alternative AUG initiation codon starts a distinct open reading frame (ORF). This newly identified human and murine OB-R mRNA encodes a putative 14 kDa protein, named OB receptor gene-related protein (OB-RGRP), which is homologous to yeast and Caenorhabditis elegans putative ORFs. Genomic organization and cDNA comparison show that the OB-RGRP gene shares its promoter and two exons with the OB-R gene. The OB-RGRP amino acid sequence is, however, entirely different from that of OB-R.
The double alternative utilization of exons and promoter in this manner has not, until now, been reported for the mammalian genome. The fact that we have cloned similar cDNAs from a mouse library shows that this feature is conserved in humans and rodents. This may suggest that there is a requirement for a coordinate expression of OB-R and OB-RGRP to elicit the full physiological response to leptin in vivo.
Premade Northern blots were obtained from Clontech Laboratories Inc. and prehybridized at 42°C for 6 h in a hybridization cocktail containing 50% formamide, 5× SSPE, 10× Denhardt's solution, 2.0% (w/v) SDS and 100 µg/ml sheared salmon sperm DNA. The blots were hybridized for 16 h with a [32P]dCTP-labelled DNA probe generated by PCR and corresponding to nucleotides 29-979 of huOB-RGRP (see Fig. 1). Northern blots were rinsed twice at room temperature with 2× SSC, 0.05% SDS and twice at 50°C for 20 min in 0.1× SSC, 0.1% SDS. Overnight autoradiography was performed using Biomax X-ray film (Kodak).
Total cellular RNA from HeLa cells, from a panel of hematopoietic cell lines and from immortalized brown adipocytes (13) was assayed by RT-PCR. For adipose cell line differentiation, cells were cultivated for 3 days in medium containing 6% fetal calf serum to obtain confluence, then cells were refed with ITT medium supplemented with 0.1 µM dexamethasone, 850 nM insulin, 1 nM triiodothyronine, 1 µM pioglitazone and IBMX (0.25 mM for 4 days) for 15-21 days. Reverse transcription was performed on 1 µg mRNA with SuperScript™ II reverse transcriptase (Gibco BRL) using random hexamers in a 50 µl reaction. The primer sequences for P1-P4 are 5'-AAGGCCGCAGGCTCCCCCATT-3', 5'-AGCAGCCGCGGCCCCAGTTC-3', 5'-TGACAAGTTAAACGCAGTTATCACAT-3' and 5'-TCTCTGCCTTCGGTCGAGTTG-3' respectively. The concentrations of the four primers were as follows: P1, 500 nM; P2, 250 nM; P3, 500 nM; P4, 100 nM. The 50 µl PCR reaction contained 10 µl first-strand cDNA, 200 µM each dNTP and 0.3 U Taq polymerase (Promega). The PCR profile was 94°C for 3 min; 34 cycles of 94°C for 20 s, 62°C for 30 s and 72°C for 30 s; and one cycle of 72°C for 4 min. To measure the ratio of the PCR products, quantification on ethidium bromide stained agarose gels was performed for several independent experiments using the Adobe Photoshop program.
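As a convenience for anyone re-using these primers, the following minimal Python sketch records the four sequences and the cycling profile stated above and derives a few simple properties (length, GC fraction, reverse complement, total programmed hold time). The function and variable names are illustrative assumptions, not part of the original protocol, and the hold time ignores thermocycler ramping.

```python
# Primer sequences P1-P4 exactly as listed in the text (5'->3').
PRIMERS = {
    "P1": "AAGGCCGCAGGCTCCCCCATT",
    "P2": "AGCAGCCGCGGCCCCAGTTC",
    "P3": "TGACAAGTTAAACGCAGTTATCACAT",
    "P4": "TCTCTGCCTTCGGTCGAGTTG",
}

# Hold steps of the stated PCR profile:
# (label, temperature in degrees C, seconds per hold, number of repeats).
PCR_PROFILE = [
    ("initial denaturation", 94, 180, 1),
    ("denaturation", 94, 20, 34),
    ("annealing", 62, 30, 34),
    ("extension", 72, 30, 34),
    ("final extension", 72, 240, 1),
]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def total_hold_seconds(profile) -> int:
    """Sum of programmed hold times; ramping between steps is not included."""
    return sum(seconds * repeats for _, _, seconds, repeats in profile)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"GC={gc_fraction(seq):.2f}", reverse_complement(seq))
    print("programmed hold time (s):", total_hold_seconds(PCR_PROFILE))
```

Which primers are sense or antisense is not stated in this section, so the sketch reports the reverse complement of every primer rather than assuming an orientation.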
Long template PCRs were performed using the Expand™ Long Template 2 PCR system (Boehringer Mannheim) with human placental DNA. Several PCRs were performed using sense or antisense primers for the four OB-RGRP exons.
PCR products were precipitated with 0.3 M NaCl and 2.5 vol ethanol, resuspended in water and directly sequenced with both primers. DNA sequencing was performed on an ABI 377 DNA sequencer using the Taq cycle sequencing kit (Applied Biosystems) and dye-terminator sequencing reactions.
The accelerating pace of high-throughput sequencing has led to the production of large expressed sequence tag (EST) databases, both public and proprietary, that are powerful resources for gene discovery. Direct sequence homology searching of EST databases has a significant advantage relative to primer-directed PCR cloning, as the results are inclusive rather than selective. We have used the human leptin receptor cDNA sequence to search both a private (Incyte Pharmaceuticals) and a public (Washington University/Merck) database for matching EST sequences.
A number of EST sequences were identified that exactly matched the published leptin receptor sequence from +12 to +173 (6) and then abruptly diverged in sequence. For human mRNA, different RT-PCRs were performed to confirm the assembly of the EST sequence (data not shown). Assembly of the EST matching sequences and RT-PCR product sequences revealed a consensus alternative transcript (accession no. Y12670) which contains an AUG at nucleotides 71-73 flanked by Kozak consensus sequences. Translation from this initiation site would yield a polypeptide of 131 amino acid residues with a molecular mass of 14.255 kDa (p14 OB-RGRP, leptin receptor gene-related protein or OB-RGRP) (Fig. 1A). Surprisingly, the AUG of OB-RGRP is present in the 5'-UTR of the OB-R transcript. The putative translation from this AUG ends 4 nt upstream of the AUG of OB-R and yields a truncated OB-RGRP polypeptide of 36 amino acid residues, molecular mass 3.652 kDa (Fig. 1B). The predicted strength of these two AUG signals, calculated according to the score described in Iida et al. (14), showed that both scores are compatible with a strong initiation of translation (4.6 and 5.2 for the AUG of OB-RGRP and OB-R respectively).
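The ORF arithmetic in this paragraph can be checked with a short Python sketch. It assumes the standard genetic code, 1-based nucleotide numbering and an uninterrupted reading frame. The helper names, the toy sequence and the rule of thumb of about 110 Da per residue are illustrative assumptions and are not taken from the accession record.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate_from(seq: str, start: int):
    """Translate from a 1-based start position until the first in-frame stop.

    Returns (protein, position of the last nucleotide of the stop codon),
    or (protein, None) if no in-frame stop codon is reached.
    """
    seq = seq.upper().replace("U", "T")
    protein = []
    for i in range(start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            return "".join(protein), i + 3
        protein.append(aa)
    return "".join(protein), None


def orf_span(start: int, n_residues: int):
    """Return (first coding nt, last coding nt, last nt of the stop codon)."""
    last_coding = start + 3 * n_residues - 1
    return start, last_coding, last_coding + 3


def approx_mass_kda(n_residues: int, mean_residue_da: float = 110.0) -> float:
    """Rough protein mass estimate; the 110 Da mean residue mass is an assumption."""
    return n_residues * mean_residue_da / 1000.0


if __name__ == "__main__":
    # Hypothetical toy transcript: 2 nt leader, ATG GCT TAA, 2 nt trailer.
    print(translate_from("GGATGGCTTAACC", 3))
    # AUG at nucleotides 71-73 and 131 residues, as stated in the text.
    print(orf_span(71, 131))
    print(approx_mass_kda(131), approx_mass_kda(36))
```

Under these assumptions a 131-residue ORF starting at nucleotide 71 would occupy 393 coding nucleotides, and the rough mass estimates for 131 and 36 residues fall in the same range as the values quoted above.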
Sense and antisense primers from the OB-RGRP cDNA were used to amplify genomic DNA from human placental DNA by long template PCR. The restriction map and partial sequence of the amplified product were determined to elucidate the genomic organization and the exon/intron boundaries of the OB-RGRP gene (Fig. 2). The sequence which is common to OB-RGRP and OB-R mRNAs corresponds to the first two exons of OB-R (Thompson, D.B., Ossowski,V., Sutherland,J., Apel,W. and Biesterdfeld,J., accession numbers U59246-U59248). The second exon of the OB-R gene may be spliced alternatively to generate either OB-RGRP or the OB-R transcripts. Long template PCR, effective for a 10 kb amplification, fails to amplify the DNA between the OB-RGRP exons and either OB-R exon 3 or B219/OBR exon 1, suggesting that the distance between the OB-RGRP and downstream OB-R exons is probably >10 kb. This is not unexpected, since the OB-R gene has been shown to span >100 kb (4).
The tissue distribution of OB-RGRP was analysed on poly(A)+ mRNA Northern blots of several human tissues. The OB-RGRP mRNA appears as a band between the 1.3 and 2.4 kb markers (Fig. 3) and is detected in heart, placenta, lung, liver, skeletal muscle, kidney and pancreas. Heart and placenta express OB-RGRP at the highest levels, whereas brain and kidney express it at the lowest levels. These data suggest that expression of OB-RGRP is relatively widespread.
The sequence alignment of mouse and human OB-RGRP (Fig. 5) reveals differences at six positions: 42, 43, 49, 87, 92 and 119. Two of these are conservative substitutions.
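A comparison of this kind can be sketched in a few lines of Python. The two short peptides below are hypothetical placeholders, not the OB-RGRP sequences, and the residue groups used to call a substitution conservative are one simple convention chosen for illustration; the criterion used by the authors is not stated.

```python
# One simple grouping of amino acids by side-chain character (an assumption).
CONSERVATIVE_GROUPS = ("ILVM", "FYW", "KRH", "DE", "ST", "NQ")


def differing_positions(seq_a: str, seq_b: str):
    """Return the 1-based positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1) if a != b]


def is_conservative(a: str, b: str) -> bool:
    """True if both residues fall in the same group of CONSERVATIVE_GROUPS."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    # Hypothetical toy peptides standing in for the aligned orthologues.
    human = "MAGIKALISL"
    mouse = "MAGVKALTSL"
    for pos in differing_positions(human, mouse):
        a, b = human[pos - 1], mouse[pos - 1]
        kind = "conservative" if is_conservative(a, b) else "non-conservative"
        print(pos, a, b, kind)
```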
Searches for amino acid sequence similarity (BLASTP and TFASTA programs) between translated OB-RGRP and various databases yielded no matches in the primate, rodent or vertebrate protein databases except for human and rat OB-R 5'-UTRs (6,9). However, significant homologies were observed with putative ORFs identified in C.elegans (18) and Saccharomyces cerevisiae (Fig. 5). The best match was found with C.elegans C30B5.2 and extends over nearly the entire length of both predicted proteins. Two domains are highly homologous. Domain 1 starts from amino acid +9 of OB-RGRP to +27 (corresponding to +24 to +42 of C30B5.2), with 14/19 identical residues (74% identity) and three non-polar conserved residues, yielding an overall homology of 90% for this domain, which is highly hydrophobic and contains no charged amino acids. Domain 2, from residues +65 to +88 of OB-RGRP, contains 17/25 identical and five non-polar conserved amino acids yielding a combined homology of 92%. OB-RGRP also has a significant but weaker match with yeast ORF YJR044c. Interestingly, all three proteins begin and end at approximately the same positions.
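The percentages quoted for the two domains follow from simple ratios, sketched below in Python. The function names are illustrative. For Domain 2 the sketch evaluates both the stated denominator of 25 and the 24 positions spanned by residues +65 to +88 counted inclusively; which alignment length underlies the reported 92% is an assumption either way.

```python
def span_length(first: int, last: int) -> int:
    """Number of positions in an inclusive residue range."""
    return last - first + 1


def domain_stats(identical: int, conserved: int, length: int):
    """Return (percent identity, percent identity plus conserved) over an alignment length."""
    identity = 100.0 * identical / length
    combined = 100.0 * (identical + conserved) / length
    return identity, combined


if __name__ == "__main__":
    # Domain 1: residues +9 to +27, 14 identical and 3 conserved residues.
    print(span_length(9, 27), domain_stats(14, 3, 19))
    # Domain 2: 17 identical and 5 conserved residues over the stated 25 positions ...
    print(domain_stats(17, 5, 25))
    # ... and over the inclusive span +65 to +88.
    print(span_length(65, 88), domain_stats(17, 5, span_length(65, 88)))
```

For Domain 1 the ratios come out at roughly 74% and just under 90%, in line with the text. For Domain 2 the combined figure is 88% over 25 positions and about 92% over 24 positions.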
We describe here the translation of two putative unrelated proteins from alternatively spliced mRNAs transcribed under the control of the same promoter. One of these proteins is the leptin receptor. The other is a new 131 residue protein, OB-RGRP, found in both man and mouse. Only six residues distinguish OB-RGRP in the two species. This protein is quite homologous to putative ORFs in yeast and C.elegans. Two domains of ~20 residues are each 90% homologous between huOB-RGRP and C.elegans C30B5.2. The predicted ORF of the C30B5.2 gene is split by two small introns (17) and the relative position of these introns is identical in the huOB-RGRP gene. These data strongly suggest that the OB-RGRP cDNA encodes a protein well conserved in mammals and related to an ancestral gene retained in an invertebrate and in a lower eukaryote.
The genomic organization of the 5'-part of the OB-R gene and the exon/intron sequences are in agreement with transcription in the same direction of the first two exons to yield OB-RGRP and OB-R mRNA (Figs 1 and 2). As in a common 3'-alternative splicing mechanism (11), the polyadenylation signal recognition in OB-RGRP exon 4 may leak and introns 2 and 3 may remain unspliced to trigger transcription through the OB-R gene and maturation of the OB-R transcript. Separation of the strong pyrimidine tract associated with the branch site from the acceptor splice site of OB-RGRP exon 3 (Fig. 2) is consistent with intron 2 being difficult to excise, as already described in the sequence upstream of exon 3 of α-tropomyosin and exon 7 of β-tropomyosin, in which a negative regulatory element lies just upstream of the acceptor splice site (18,19).
Formation of the B219/OB-R transcript is likely to be due to the use of an alternative promoter, as proposed earlier (10). OB-RGRP, OB-R and B219/OB-R expression, examined by RT-PCR, supports such a genomic organization. Indeed, in the cells we examined the percentage of immature transcripts initiated from a single promoter from either OB-RGRP or OB-R is nearly constant, as shown by only slight variation in the ratio of specific PCR products (Fig. 3). This suggests that there is no strong post-transcriptional regulation. In contrast, the B219/OB-R transcript shows a distinct pattern of expression. Unlike OB-RGRP and OB-R, B219/OB-R is expressed in haematopoietic K562 cells (10) and is induced in PAZ-6 adipocytes as they differentiate, suggesting different transcriptional regulation of the B219/OB-R and OB-R promoters.
There are few examples of dual utilization of a single promoter generating unrelated proteins in eukaryotic genes. Overlapping genes have been shown, however, to be controlled by a common promoter with transcription in opposite directions, as occurs for example at the complex mouse surfeit locus, in which the surf1 and surf2 genes are transcribed in opposite directions from a common 73 bp promoter (20). Transcription of genes may diverge by specific use of initiation sites of transcription in opposite directions (21). Another example is provided by the calcitonin/calcitonin gene-related peptide (CGRP) locus, which, by alternative splicing and polyadenylation (22,23) followed by N-terminal proteolytic cleavage, yields two unrelated products with different functions. In this case the promoter, the first three exons and the initiation codon are identical and tissue-specific post-transcriptional regulation occurs to yield expression of these proteins in different cell types (23). Recently, alternative utilization of two promoters and two reading frames within the second exon has been described for the INK4a gene (24). This process yields two polypeptides that are entirely different in their amino acid sequences. The resulting proteins nevertheless have similar biological functions in cell growth arrest in mammalian fibroblasts. Concerning OB-RGRP and OB-R transcripts, classic alternative splicing and polyadenylation generate two transcripts with different ORFs. The initiation codon for the second protein is present in the 5'-UTR of the alternative transcript.
Such a genomic organization, allowing transcription of both OB-RGRP and OB-R mRNA from the same promoter with little or no cell-specific post-transcriptional regulation, may yield coordinated OB-R and OB-RGRP synthesis. Indeed, OB-R may be synthesized from the transcript containing two AUGs by leaky scanning of the first AUG, as observed in several virus genes and in cDNAs with leader sequences with several AUGs (12). The selective pressure to conserve this unusual overlapping gene organization in mouse, human and probably rat suggests a functional importance for interdependent regulation of expression of these genes. Coordinated expression of OB-RGRP and OB-R may be necessary to maintain a constant basal expression of the OB-RGRP and OB-R proteins. Indeed, Northern blot analysis of various human tissues (Fig. 3) reveals that expression of OB-RGRP is as widespread as that of OB-R (10). In addition, the B219/OB-R transcript may allow cell- or differentiation-specific variation of the OB-RGRP and OB-R protein expression ratios.
In the OB-RGRP protein sequence several stretches of hydrophobic residues suggest the possible presence of transmembrane domains. Proximal to the transmembrane domain of several members of the cytokine receptor family, including the leptin receptor (6), one finds a Pro-X-Pro sequence preceded by a cluster of hydrophobic residues, called box 1 (25,26). Substitution of the two Pro by Ser residues results in loss of tyrosine phosphorylation of JAK2 induced by the activated receptor (25). In the OB-RGRP protein a similar box 1 (Pro46-Ile-Pro48) is observed which is conserved in the various species studied so far (Fig. 5). It is noteworthy that the full-length leptin receptor has been shown to modulate the JAK/STAT pathway, as do the interleukin 6-type cytokine receptors (27), whereas the short form expressed in db/db mice is unable to activate this signalling pathway (28). Box 1 is present in both OB-R forms, suggesting that this motif may not be sufficient for JAK/STAT activation. However, under leptin stimulation both OB-R forms are able to induce mRNA expression of immediate early genes (29).
It has been shown recently that leptin can homodimerize the OB-R receptor extracellular domains (30,31). It is tempting to suggest that OB-RGRP could encode an accessory protein, involved in leptin signalling.
We thank Drs Tarik Issad (Paris), Cindy Gerhardt (Paris), Mark Plumb (Oxford, UK) and Gary Zweiger (Incyte Pharmaceuticals) for critical discussions and Dominique Part for technical help. This work was mainly supported by the Centre National de la Recherche Scientifique, l'Institut National de la Santé et de la Recherche Médicale and the Ministry for Science, Education and Research. We also thank the Ligue Nationale contre le Cancer, the Fondation pour la Recherche Médicale Française, the Association pour la Recherche contre le Cancer and the European Union (Human Capital and Mobility contracts MIEC CHRX-CT 94-0490 and ENBST CHRX-CT 94-0689) for financial support.
*To whom correspondence should be addressed at: Institut Cochin de Génétique Moléculaire, Laboratoire d'ImmunoPharmacologie Moléculaire, CNRS UPR 0415, 22 rue Méchain, 75014 Paris, France. Tel: +33 1 40 51 64 08; Fax: +33 1 40 51 72 10; Email: bailleul@icgm.cochin.inserm.fr
REFERENCES


