Use of an engineered ribozyme to produce a circular human exon
Svetlana Mikheeva, Mariam Hakim-Zargar, Daphne Carlson and Kevin Jarrell*
Department of Pharmacology and Experimental Therapeutics, Boston University Medical Center, 80 East Concord Street, Boston, MA 02118, USA
Received July 11, 1997; Revised and Accepted November 3, 1997
ABSTRACT
We report the use of an engineered ribozyme to produce a circular human exon in vitro. Specifically, we have designed a derivative of a yeast self-splicing group II intron that is able to catalyze the formation of a circular exon encoding the first kringle domain (K1) of the human tissue plasminogen activator protein. We show that the circular K1 exon is formed with high fidelity in vitro. Furthermore, the system is designed such that the circular exon that is produced consists entirely of human exon sequence. Thus, our results demonstrate that all yeast exon sequences are dispensable for group II intron catalyzed inverse splicing. This is the first demonstration that an engineered ribozyme can be used to create a circular exon containing only human sequences, linked together at a precise desired ligation point. We expect these results to be generalizable, so that similar ribozymes can be designed to precisely create circular derivatives of any nucleotide sequence.
Self-splicing group II introns catalyze various splicing reactions in vitro. These reactions are shown in Figure 1 . Figure 1 A depicts a standard cis-splicing reaction in which a group II intron [intervening sequence (IVS) 1-6], flanked by a 5' exon (E5) and a 3' exon (E3), excises itself from a transcript by a two-step transesterification reaction (1 ,2 ). Figure 1 B depicts a trans-splicing reaction (3 ), in which an E5 attached to part of a group II intron (IVS 1-3) joins with an E3 attached to another part of the intron (IVS 5,6). Figure 1 C depicts an inverse splicing reaction, in which a downstream E5 is spliced to an upstream E3, so that a circular exon molecule and a Y-branched intron, IVS(Y), molecule are produced (4 ). Note that the RNA shown in line 1 of Figure 1 C can participate both in the depicted unimolecular inverse splicing reaction and in bi-molecular trans-splicing reactions that create chains of E3E5 concatemers, flanked by group II intron sequences (not shown). When inverse splicing templates are incubated under splicing conditions, both inverse splicing and trans-splicing products are observed (unpublished results). In addition, the intron is known to also catalyze two particular hydrolysis reactions (5 ) (not shown). In the first hydrolysis reaction, step one of splicing occurs by hydrolysis rather than by transesterification; the products of that hydrolysis reaction are (IVS 5,6)E3,E5 plus IVS 1-3. In the second hydrolysis reaction, which is referred to as `spliced exon reopening' (SER), IVS(Y) cleaves the excised exon circle precisely at the splice point. SER yields a linear version of the excised exons (E3,E5).
We utilized four different plasmid constructs in this study: pJD20, pY1, pY8 and pY9. The pJD20 plasmid encodes the full-length aI5[gamma] intron with its flanking 5' and 3' exons (3). The pY1 plasmid is also known as pINV1 and encodes an inverse splicing substrate that splices to yield an exon circle and a Y-branched intron (4). Plasmids pY8 and pY9 encode inverse splicing substrates with exons that are composed entirely of human exon sequences. Plasmid pY8 was constructed in three steps. First, single-stranded pY1 DNA was used as a template for site-directed mutagenesis; the site-directed mutagenesis experiment was conducted as previously described (3). Three primers were used simultaneously to: insert a KpnI site immediately upstream of E3; change the six nucleotide sequence that is immediately 5' of IBS1 from 5'-GTGGGA to 5'-CTCGAG (an XhoI site); and change the EBS1 site from 5'-GGAAATG to 5'-TCCCTCA; the resulting plasmid is referred to as pK.Y.X(EBS1K1). Next, the K1.Kpn and K1.Xho primers (5'-ACGGGTACCACCAGGGCCACGTGCTACGAGG and 5'-GTAGCAGTCACTGTTCTCGAGTCCCTCAGAGCAGGC, respectively) were used in PCR to amplify the region of pKS+t-PA (13) that encodes the first kringle domain (K1); the enzymes KpnI and XhoI were then used to construct a new plasmid in which the yeast exon sequences of pK.Y.X(EBS1K1) were replaced with the K1 PCR product. The resulting plasmid is referred to as K.K1.X(EBS1K1). Finally, single-stranded DNA was isolated from K.K1.X(EBS1K1) and site-directed mutagenesis was used to delete the KpnI and XhoI sites and simultaneously to precisely fuse intron domains 5 and 6 to the first base of the K1 exon, and to precisely fuse the last base of the K1 exon to intron domains 1-3; the resulting plasmid is referred to as pY8. To construct pY9, single-stranded DNA was isolated from pY8 and site-directed mutagenesis was used to change the EBS2 sequence from 5'-CACCAC to 5'-CAGGCA; the resulting plasmid is referred to as pY9.
Plasmids pJD20, pY1, pY8 and pY9 were cut with HindIII and transcribed in vitro with T7 RNA polymerase (Pharmacia or Stratagene) at 40°C. Transcription of pJD20 yields the WT RNA, while transcription of pY1, pY8 and pY9 yields the PY1, PY8 and PY9 RNAs, respectively. Unless otherwise stated, WT and PY1 were labeled by incorporation of [[alpha]-32P]UTP (600 Ci/mmol; New England Nuclear) while PY8 and PY9 were labeled by incorporation of [[alpha]-32P]GTP (600 Ci/mmol; New England Nuclear). Standard 100 µl transcription reactions contained 30 µCi of labeled UTP, 0.4 mM unlabeled UTP and 0.5 mM unlabeled CTP, GTP and ATP; or alternatively 30 µCi of labeled GTP, 0.4 mM unlabeled GTP and 0.5 mM unlabeled CTP, UTP and ATP.
Prior to splicing, full-length RNA transcripts were purified from an acrylamide gel. The splicing reactions were carried out at 45°C in buffers of 40 mM Tris-HCl (pH 7.6), 100 mM MgCl2, along with either 1.5 M (NH4)2SO4, 1.5 M NH4Cl, 1.5 M NaCl or 1.5 M KCl.
The reaction rates (Table 1) were determined by quantitation of time course data that were similar to the data shown in Figure 4. Radiolabeled RNA samples were fractionated on a 4% acrylamide gel. Data were collected from the dried gel using a phosphorimager. For each time point, the amount of radioactive precursor and the amount of each major radioactive product were measured. The product fraction was calculated by dividing the amount of product by the total amount of radioactive material (i.e., precursor plus products). Each time course experiment was performed at least three times. The data were analyzed using KaleidaGraph (Abelbeck Software) with both equation 1 (18) and equation 2 (19),
where Fact, fraction active; k, rate coefficient; FA, fraction of RNA in population A; kA, rate coefficient for population A; FB, fraction of RNA in population B; kB, rate coefficient for population B.
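The two fitting equations themselves are not reproduced in this text. A plausible reconstruction, assuming the standard single-exponential form for a partially active population (equation 1) and the standard double-exponential form for two populations (equation 2), is sketched below. The symbol F(t) for the product fraction at time t is introduced here for illustration and is not taken from the source; the exact forms should be checked against references 18 and 19 before they are relied on.

```latex
% Requires the amsmath package.
\begin{align}
F(t) &= F_{act}\,\bigl(1 - e^{-kt}\bigr) \tag{1}\\
F(t) &= F_{A}\,\bigl(1 - e^{-k_{A}t}\bigr) + F_{B}\,\bigl(1 - e^{-k_{B}t}\bigr) \tag{2}
\end{align}
```

Under these assumed forms, equation 1 plateaus at Fact, whereas equation 2 plateaus at FA + FB and approaches that plateau in two phases.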
Unlabeled RNA was made by in vitro transcription in the absence of radiolabeled nucleotide triphosphates. Unincorporated nucleotides were removed using a G-25 spun column, followed by ethanol precipitation. RNA was dissolved in 25 µl 1× calf intestinal phosphatase buffer (Boehringer Mannheim) and treated with 1 U phosphatase (Boehringer Mannheim) for 30 min at 37°C. The enzyme was removed by phenol/chloroform extraction and the RNA was concentrated by precipitation with ethanol. RNA was dissolved in 1× one-phor-all buffer (Pharmacia) and treated with polynucleotide kinase (Pharmacia) in the presence of 50 µCi [[gamma]-32P]ATP (3000 Ci/mmol; New England Nuclear); the mixture was incubated for 2 h at 37°C. Unincorporated nucleotides were removed using a G-25 spun column, followed by ethanol precipitation. The labeled RNA was purified from an acrylamide gel.
Unlabeled RNA was made and purified as described above. After precipitation, the RNA was dissolved in 20 µl of buffer that contained 50 mM Tris-HCl (pH 7.6), 15 mM MgCl2, 3.3 mM DTT, 50 µM ATP, 10% DMSO, 100 µCi [5'-32P]pCp (3000 Ci/mmol; New England Nuclear) and 14 U T4 RNA ligase (Pharmacia); the mixture was incubated for 10 h at 4°C. Unincorporated nucleotides were removed using a G-25 spun column, followed by ethanol precipitation. The labeled RNA was purified from an acrylamide gel.
The anti-sense probe that was used for RNAase protection was generated by in vitro transcription of plasmid pK1-ANTI. To generate the pK1-ANTI plasmid, a 168 bp region of pY9 was amplified by PCR using the I5-29 primer (5'-TATTATTTATGATAACTTTCAGACC-3') and the K1.Cir.2 primer (5'-GGCCAGACGCCATCAGGCTG-3'). The product was cloned into the pCR 2.1 vector (Invitrogen). A plasmid with the insert oriented such that transcription with T7 polymerase yields the antisense RNA was identified; that plasmid is referred to as pK1-ANTI. To generate the radiolabeled probe, pK1-ANTI was cut with SpeI and labeled RNA was made by random incorporation of [[alpha]-32P]UTP during the in vitro transcription reaction. The RNAase protection experiment was performed as described elsewhere (20 ).
A reverse transcription/polymerase chain reaction experiment was done to characterize the circular K1 exon. Reverse transcription was primed with the K1.Cir.1 primer (5'-GCCAACGCGCTGCTGTTCCAG-3') and PCR was primed with both the K1.Cir.1 and the K1.Cir.2 (5'-GGCCAGACGCCATCAGGCTG-3') primers.
DNA sequencing was done using Sequenase (US Biochemicals) according to the manufacturer's instructions. The other experiments were conducted as described elsewhere: debranching (21), primer extension (20).
We constructed a plasmid (pY8, see Materials and Methods) from which an RNA (PY8) composed of the K1 exon flanked by group II intron sequences (Fig. 2A, line 3) could be produced by in vitro transcription with T7 RNA polymerase. In the pY8 plasmid, the yeast exon sequences were precisely replaced by human exon sequences (i.e., with the K1 exon). In addition, the EBS1 sequence of the intron was mutated to make it complementary to the IBS1 sequence of the inserted K1 exon. We also constructed the plasmid pY9, which encodes the PY9 RNA (Fig. 2A, line 4). PY9 is identical to PY8 except that the EBS2 sequence of the PY9 intron was mutated to make it complementary to the IBS2 sequence of the K1 exon.
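The complementarity between the mutated EBS1 and the end of the K1 exon can be checked directly from the sequences given in Materials and Methods. The sketch below is illustrative only: it assumes that the K1 sequence lying immediately 5' of the XhoI site in the K1.Xho primer corresponds to the 3' end of the K1 exon, and that IBS1 is the 7 nt block at that end. Both are inferences from the construction scheme rather than statements made in the text, and the function names are hypothetical.

```python
# Sequences copied from Materials and Methods (DNA alphabet, written 5'->3').
EBS1_MUTANT = "TCCCTCA"  # new EBS1 introduced into pY8/pY9
K1_XHO_PRIMER = "GTAGCAGTCACTGTTCTCGAGTCCCTCAGAGCAGGC"  # antisense PCR primer
XHOI_SITE = "CTCGAG"

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def exon_end_from_primer(primer: str, site: str) -> str:
    """Sense-strand sequence lying immediately 5' of the restriction site.

    Assumption: the primer is antisense, so its reverse complement reads
    as exon sequence followed by the restriction site.
    """
    sense = reverse_complement(primer)
    return sense[: sense.index(site)]


exon_end = exon_end_from_primer(K1_XHO_PRIMER, XHOI_SITE)  # "GCCTGCTCTGAGGGA"
candidate_ibs1 = reverse_complement(EBS1_MUTANT)           # "TGAGGGA"
is_match = exon_end.endswith(candidate_ibs1)               # True
```

The last 7 nt of the primer-derived exon end are the exact reverse complement of the new EBS1, which is consistent with the design goal stated above.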
The effects of several different reaction conditions on the competing reactions (inverse splicing, trans-splicing and hydrolysis) of PY8 and PY9 were analyzed. It is known that in vitro splicing by group II introns requires magnesium (1). Studies have shown that cis-splicing occurs in 100 mM MgCl2, indicating that this buffer provides sufficient salts for reaction (5). Cis-splicing is stimulated, however, by the addition of salts of monovalent cations to the 100 mM MgCl2 buffer. Such monovalent cations are therefore typically employed in group II intron splicing reactions.
As shown in Figure 3 A, each substrate was incubated in 100 mM MgCl2 buffer containing either (i) no monovalent cations (lanes 8 and 14); (ii) 1.5 M KCl (lanes 12 and 18); (iii) 1.5 M (NH4)2SO4 (lanes 9 and 15); (iv) 1.5 M NH4Cl (lanes 10 and 16); or (v) 1.5 M NaCl (lanes 11 and 17). The products that were observed are depicted schematically in Figure 3 B and are indicated next to lane 18 in Figure 3 A. Figure 3 A also includes control reactions that show the products obtained in cis-splicing (lanes 1-3) and inverse splicing (lanes 4-6) reactions using the wild-type yeast constructs.
Time course experiments were performed to identify the most useful reaction conditions for production of circular exons. Representative data from these time course experiments are presented in Figure 4 A [PY9 in 1.5 M (NH4)2SO4] and Figure 4 B (PY9 in 1.5 M NH4Cl); the results are summarized in Table 1 .
Two different methods have been reported for calculating the rates of group II intron splicing reactions. The first method assumes that only a fraction of the precursor is active (18 ,24 ). According to this method, time course data are analyzed using equation 1 (see Materials and Methods), and two parameters are determined: (i) the fraction of active molecules (Fact); and (ii) the rate coefficient (k). The second method assumes that two populations of precursor RNA molecules are present, a fast reacting and a slow reacting population (19 ). In this method, the time course data are analyzed using equation 2 (see Materials and Methods), and four parameters are determined: (i) FA, fraction of molecules in the fast population; (ii) FB, fraction of molecules in the slow population; (iii) kA, rate coefficient for population A; and (iv) kB, rate coefficient for population B. We analyzed our data using both of these methods and report the complete set of calculated values (Table 1 ). We find that our data are best fit to equation 2, and therefore have based our conclusions on the values that were calculated using the second method.
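The difference between the two analysis methods can be illustrated with a short sketch. It assumes the standard exponential forms for the two models (the equations are not reproduced in this text, so these forms are an assumption), uses a crude grid search as a stand-in for the least-squares fitting performed in KaleidaGraph, and runs on a hypothetical, noise-free time course rather than on data from Table 1. All function names and parameter values are illustrative.

```python
import math
from itertools import product
from typing import Callable, Sequence, Tuple


def eq1(t: float, f_act: float, k: float) -> float:
    """Method 1 (assumed form): a fraction f_act reacts with rate k."""
    return f_act * (1.0 - math.exp(-k * t))


def eq2(t: float, f_a: float, k_a: float, f_b: float, k_b: float) -> float:
    """Method 2 (assumed form): fast (f_a, k_a) plus slow (f_b, k_b) populations."""
    return f_a * (1.0 - math.exp(-k_a * t)) + f_b * (1.0 - math.exp(-k_b * t))


def ssr(model: Callable[..., float], params: Sequence[float],
        times: Sequence[float], observed: Sequence[float]) -> float:
    """Sum of squared residuals between a model and observed product fractions."""
    return sum((model(t, *params) - y) ** 2 for t, y in zip(times, observed))


def grid_fit_eq1(times: Sequence[float], observed: Sequence[float],
                 steps: int = 50) -> Tuple[float, float]:
    """Crude grid search over (f_act, k); not a real least-squares optimizer."""
    f_grid = [i / steps for i in range(1, steps + 1)]                 # 0.02 .. 1.0
    k_grid = [10 ** (-4 + 4 * i / steps) for i in range(steps + 1)]   # 1e-4 .. 1 per min
    return min(product(f_grid, k_grid),
               key=lambda p: ssr(eq1, p, times, observed))


# Hypothetical biphasic time course (minutes), generated from eq2 itself.
TRUE_PARAMS = (0.4, 0.05, 0.5, 0.002)
times = [0, 5, 10, 20, 40, 80, 160, 320]
observed = [eq2(t, *TRUE_PARAMS) for t in times]

best_eq1 = grid_fit_eq1(times, observed)
ssr_eq1 = ssr(eq1, best_eq1, times, observed)     # residual left by the one-population model
ssr_eq2 = ssr(eq2, TRUE_PARAMS, times, observed)  # zero by construction
```

When the underlying time course is biphasic, the one-population model leaves a nonzero residual that the two-population model does not, which is the kind of comparison that underlies a statement that data are best fit by equation 2.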
The rate data reported in Table 1 indicate that NH4Cl reaction conditions are likely to be most useful for production of large quantities of circular exons. Although the circular exon is most abundant when PY9 or PY8 is incubated in (NH4)2SO4 buffer, both precursors splice slowly in that buffer. Thus, useful amounts of circular exons can be generated most rapidly using the NH4Cl buffer.
Under all three reaction conditions, PY8 and PY9 splice significantly more slowly than does the WT cis-splicing RNA. Relative to the WT RNA, FA, kA and kB are all reduced. By contrast, the yeast inverse-splicing RNA, PY1, and the WT RNA splice with similar kinetics. We are currently investigating the reason why the PY8 and PY9 RNAs splice slowly.
The reaction kinetics for PY8 and PY9 under KCl conditions are similar to those observed under NH4Cl conditions but, as discussed above, incubation in KCl buffer yields very little Y-branched intron and circular exon.
The data presented in Table 1 reveal an additional interesting aspect of the inverse splicing reactions we are studying: the strength of the EBS2-IBS2 interaction appears to have little or no effect on the efficiency of inverse splicing. That is, both PY8, which has a weak (-2.3 kcal) EBS2-IBS2 pairing, and PY9, which has a strong (-12.2 kcal) pairing, splice at about the same rate under all of the reaction conditions that were tested (Table 1). These results are consistent with previous studies that showed that the EBS1-IBS1 interaction is essential for cis-splicing, while the EBS2-IBS2 interaction is important, but not essential, for cis-splicing (25).
The validity of these observations and conclusions, of course, depends on the accuracy of identification of the reaction intermediates and products. We confirmed the identities using several different approaches, described below.
As a preliminary matter, note that radiolabeled splicing substrates of the aI5[gamma] group II intron are typically produced by random incorporation of [[alpha]-32P]UTP during in vitro transcription. Since intron aI5[gamma] and its natural flanking exons are relatively A+T rich, [[alpha]-32P]UTP-labeling effectively labels both the intron and the exon portions of the precursor, so that all splicing products are readily detectable after the reaction. The human K1 exon, by contrast, is very GC rich. Accordingly, K1-containing splicing precursors labeled with randomly incorporated [[alpha]-32P]UTP are only weakly labeled in their exon sequences; and exon-containing intermediates and products are barely detectable (see, for example, Fig. 5 A, lane 4). Therefore, in order to efficiently detect intermediates and products, PY8 and PY9 were labeled by random incorporation of [[alpha]-32P]GTP. When either PY8 (data not shown) or PY9 (Fig. 5 A, lane 2) is labeled with GTP, the exon containing RNAs are readily detected.
Figure 5. Product characterization. (A) Splicing of 5' versus 3' end-labeled PY9. Each sample contained 200 000 c.p.m. of radiolabeled PY9 RNA. The first pair of samples (lanes 1 and 2) was randomly labeled with [[alpha]-32P]GTP, the second pair was randomly labeled with [[alpha]-32P]UTP, the third pair was 5' end labeled, and the fourth pair was 3' end labeled. The odd numbered samples were not incubated prior to fractionation. Even numbered samples were incubated for 4 h, at 45°C, in 1.5 M (NH4)2SO4 buffer prior to fractionation. (B) IVS(Y) from PY1, PY8 and PY9 is sensitive to debranching. The five RNAs analyzed in this experiment were: IVS(Y) from PY1 (lanes 1 and 2), IVS(Y) from PY8 (lanes 3 and 4), IVS(Y) from PY9 (lanes 5 and 6), K1d(IVS1-3) (lanes 7 and 8), IVS 1-3 (lanes 9 and 10). Prior to the analysis, each of the five RNAs was purified by excision from a polyacrylamide gel. Each sample contained 20 000 c.p.m. of randomly labeled RNA. The experimental samples (even numbered lanes) were treated with HeLa debranching activity prior to fractionation; the control samples (odd numbered lanes) were not treated with the debranching activity. The upper panel shows the region of the gel that contained molecules ~700 nt in length; the lower panel shows the region of the gel that contained molecules ~90 nt in length. (C) RNAase protection mapping. Probe synthesis: the region of the pY9 plasmid that contains the last 140 bp of the K1 exon plus the first 28 bp of the intron was subcloned, in an antisense orientation, downstream of a T7 promoter. Radiolabeled antisense RNA, transcribed from that plasmid, was used in an RNAase protection experiment with: (IVS 5,6)K1, lane 2; (IVS 5,6)K1u, lane 3; K1(C), lane 4; K1, lane 5. In the control sample (lane 1) the antisense probe alone was treated with RNAase. (D) Primer extension mapping of the 5' end of K1d(IVS 1-3). 
A DNA oligonucleotide complementary to nucleotides 5-29 of the intron was annealed to gel purified K1d(IVS 1-3) and the primer was extended using reverse transcriptase. The extension product (lane 2) was fractionated alongside a DNA sequencing ladder (lane 1).
In order to identify intermediates and products that contain the original 5' end of the splicing precursor, PY8 (data not shown) and PY9 (Fig. 5 A, lane 6) were uniquely labeled at their 5' ends prior to splicing. As shown in Figure 5 A, lane 6, three products of the splicing reaction were detected after this procedure. Further characterization, described below, showed that these RNAs represented the products depicted in lines 2, 5 and 6 of Figure 3 B.
In order to identify intermediates and products that contain the original 3' end of the splicing construct, PY8 (data not shown) and PY9 (Fig. 5 A, lane 8) were uniquely labeled at their 3' ends prior to splicing. As shown in Figure 5 A, lane 8, three products of the splicing reaction were detectable after this procedure. Further characterization, described below, showed that these RNAs represent the products depicted in lines 2, 3, and 4 of Figure 3 B.
On the basis of the end-labeling experiments, we inferred that the RNA labeled 'IVS(Y)' is the branched intron (see line 2, Fig. 3B) because only that product contained both the 5' and the 3' ends of the splicing precursor. Several additional observations confirmed this designation. First, that particular RNA exhibits the same electrophoretic mobility as the previously characterized branched intron product of PY1 splicing [compare lanes 6 (PY1) and 9 (PY9) in Fig. 3A]. Second, the RNA was purified from an acrylamide gel and subjected to debranching (26). As a control, IVS(Y) from PY1 was subjected to the same treatment. Debranching of either the putative PY8 or PY9 IVS(Y) produced one product band that migrated with PY1 IVS 1-3 and one that migrated with PY1 IVS 5,6 [see lanes 2 (PY1), 4 (PY8) and 6 (PY9) of Fig. 5B]. These findings confirmed that the product referred to as IVS(Y) is, in fact, the branched intron.
On the basis of the end-labeling experiments, we also inferred that the RNA labeled 'IVS 1-3' (see line 4 of Fig. 3B) consists of intron domains 1-3. It contained only the 3' end of the original splicing precursor and was the expected size of the first three intron domains. This designation was confirmed by showing that the relevant band was not sensitive to debranching [compare lanes 9 (untreated IVS 1-3) and 10 (debranched IVS 1-3) of Fig. 5B], and migrated with IVS 1-3 that was produced by debranching IVS(Y) [see lanes 4 (debranched PY8 IVS(Y)) and 6 (debranched PY9 IVS(Y)) of Fig. 5B].
The end-labeling experiments further allowed us to make preliminary identifications of those bands containing exon-only products [i.e., K1(C) and K1; depicted in lines 7 and 8 of Fig. 3B] because such products contain neither the 5' nor the 3' end of the original splicing substrate and therefore were not detected in either of the end-labeling experiments. The preliminary designations were confirmed by RNAse protection and RT/PCR experiments. The antisense probe used for the RNAse protection experiments was complementary to the last 140 nt of the K1 exon and to the first 28 nt of intron domain 1. As expected, this probe protected a 140 nt region of the putative K1(C) and K1 RNAs (Fig. 5C, lanes 4 and 5, respectively). This finding confirmed that these bands each contained at least the last 140 nt of the K1 exon, and also revealed that neither product contained any intron domain 1 sequences. The product characterization was completed by performing RT/PCR with primers that were designed to amplify the splice junction in K1(C). Such primers only yield an amplification product from a circular template because, when they hybridize to a linear template (e.g., K1), they face away from one another and cannot amplify sequences that lie between them. RT/PCR analysis of the putative K1(C) band from either PY8 (Fig. 6A, lane 5) or PY9 (Fig. 6A, lane 4) produced a 241 bp amplified product, confirming that these bands were circularized exon. The amplified products were cloned into plasmid vectors and sequenced over the splice junction. Eight independently isolated PY8 clones and 10 independently isolated PY9 clones were sequenced; in every case, the expected splice point sequence (5'-CCT/TGG) was observed (Fig. 6B).
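The logic of the divergent-primer RT/PCR test can be sketched in a few lines. The template and primers below are hypothetical toy sequences, not the K1 exon or the K1.Cir.1 and K1.Cir.2 primers, and the function assumes that each primer site occurs once and that neither site spans the ligation junction.

```python
from typing import Optional

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def amplicon_length(template: str, fwd: str, rev: str,
                    circular: bool) -> Optional[int]:
    """Length of the PCR product on a sense-strand template, or None if no product.

    fwd is identical to the sense strand; rev is its reverse complement.
    Simplifying assumptions: unique primer sites, no site across the junction.
    """
    rev_site = reverse_complement(rev)
    i = template.find(fwd)        # start of the forward primer site
    j = template.find(rev_site)   # start of the reverse primer site
    if i < 0 or j < 0:
        return None
    end = j + len(rev_site)
    if i + len(fwd) <= j:         # primers face each other: ordinary PCR
        return end - i
    if circular:                  # primers face away: product spans the junction
        return (len(template) - i) + end
    return None                   # primers face away on a linear template


# Hypothetical 40 nt "exon"; the primers are placed so that they diverge.
TEMPLATE = "ATGGCCAAGTTCGACCTGAACGGTATCCGTAGCTTGACCA"
FWD = TEMPLATE[28:36]                      # sense primer near the 3' end
REV = reverse_complement(TEMPLATE[4:12])   # antisense primer near the 5' end

linear_product = amplicon_length(TEMPLATE, FWD, REV, circular=False)   # None
circular_product = amplicon_length(TEMPLATE, FWD, REV, circular=True)  # 24
```

In this toy case the circular template gives a 24 nt product that crosses the ligation point, while the linear template gives none, mirroring the reasoning used to distinguish K1(C) from linear K1.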
Figure 6. Analysis of the K1(C) ligation point. (A) Four RNAs were gel purified and analyzed by RT/PCR: (IVS 5,6)K1, lane 2; (IVS 5,6)K1u, lane 3; K1(C) from PY9, lane 4; and K1(C) from PY8, lane 5. The PCR products and a size standard (lane 1) were fractionated on a 1% agarose gel. The arrow designates the amplified product. (B) DNA sequence of the K1(C) ligation point.
The end-labeling experiments also allowed us to make a preliminary identification of the (IVS 5,6)K1 band. This identification was confirmed by noting that (i) the above-described RNAse A probe protected the same 140 nt fragment of this product as it did of the K1(C) and K1 RNAs (see Fig. 5 C, lane 2); and (ii) no product was produced when this RNA was subjected to RT/PCR with the primers described above (Fig. 6 A, lane 2). These findings, in combination with the observed size of the RNA, confirmed our designation of this band as intron domains 5 and 6 linked to the K1 exon (see line 5 of Fig. 3 B).
The end-labeling experiments did not determine the identity of the two unexpected products, now labeled K1d(IVS 1-3) and (IVS 5,6)K1u (depicted in lines 3 and 6 of Fig. 3B), although they did reveal that each product contained one end of the splicing precursor. We noted that the measured sizes of the two products [760 nt for K1d(IVS 1-3) and 300 nt for (IVS 5,6)K1u], when summed together, approximately equaled the length of the precursor RNA (1070 nt). That observation suggested that the two unexpected products resulted from a single cryptic cleavage of the precursor. To confirm this hypothesis, and to map the cleavage point, the RNAs were subjected to each of the characterizations described above. Note, for example, that the K1d(IVS 1-3) band was not sensitive to debranching [compare lanes 7 (untreated) and 8 (debranched) of Fig. 5B]. Furthermore, the above-described RNAse A probe protected a 96 nt region of the (IVS 5,6)K1u band (Fig. 5C, lane 3). This finding suggested that the cleavage event that produced K1d(IVS 1-3) and (IVS 5,6)K1u might represent an aberrant cleavage reaction utilizing a sequence within the K1 exon (5'-CGGGGA) that is fortuitously complementary to the EBS1 site of both the PY8 and PY9 introns. This hypothesis was confirmed by using primer extension to map the 5' end of K1d(IVS 1-3). The primer we employed was complementary to nucleotides 5-29 of the intron. We found that extension of this primer against the K1d(IVS 1-3) RNA produced a 77 nt extension product, as expected if K1d(IVS 1-3) is the downstream product produced by cleavage at the cryptic IBS1 site. We therefore concluded that K1d(IVS 1-3) and (IVS 5,6)K1u were the downstream and upstream products, respectively, of the proposed cleavage event. Consistent with this, no product was produced when (IVS 5,6)K1u was used as the template in an RT/PCR reaction involving the above-discussed primers.
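The size arguments in this paragraph reduce to two short calculations, sketched here under stated assumptions: the gel-derived sizes are treated as exact, and the primer extension product is assumed to include intron nucleotides 1-29 (the primer pairs with nucleotides 5-29 and is extended toward the 5' end of the RNA).

```latex
% Requires the amsmath package.
\begin{align*}
760\ \text{nt} + 300\ \text{nt} &= 1060\ \text{nt} \approx 1070\ \text{nt (precursor)}\\
77\ \text{nt} - 29\ \text{nt} &= 48\ \text{nt of K1 exon retained at the 5' end of K1d(IVS 1-3)}
\end{align*}
```

On these assumptions the cryptic cleavage would lie roughly 48 nt upstream of the exon-intron boundary; because the input sizes are estimates from gels, the figure is approximate.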
In this report we show that an engineered group II intron can be used to accurately generate a circular human exon. That circular exon has the precise desired ligation point and it has no added exogenous (i.e., non-human) sequences. The ability to accurately produce circular RNAs of any sequence should be of utility for studies of the biological role of circular RNAs. Furthermore, this system can be used to generate RNAs that, due to their circular topology, are nuclease resistant (9 ,27 ).
The present study is not the first description of the use of a self-splicing intron to generate a circular RNA molecule. For example, a group I intron has been utilized to produce a circular RNA that contains a TAR RNA decoy (9). The system described here has several advantages over this group I intron system, however. First, the described group I system did not produce a circular molecule containing only the TAR sequences. Instead, the TAR RNA was inserted within the Anabaena exon that is naturally associated with the intron, so that the product circular RNA contained both Anabaena and TAR sequences (9). In all experiments that have utilized the group I intron to produce a circular RNA, the product circular RNA has included heterologous (e.g., Anabaena) sequences (9,10,28-30).
Furthermore, splicing of group I introns is very inefficient, or is abolished, when the 5' exon ends with any residue other than uracil (31 -34 ). Thus, even if it were possible to engineer a group I intron system that would produce circular exons with no heterologous sequences (for example using site-directed mutagenesis to precisely insert the desired exon between flanking intron sequences, and by engineering the P1 helix to function in the context of a new exon sequence), it would probably only be possible to use group I introns to circularize exons ending with uracil. By contrast, for group II introns, both phylogenetic studies (35 ,36 ) and our ribozyme-engineering studies (13 ) suggest that any RNA sequence should be precisely circularizable, regardless of the base present at the end of the 5' exon.
In addition to these self-splicing systems, both the yeast and human nuclear pre-mRNA splicing systems are capable of directing exon circle formation by inverse splicing in vitro (37 ,38 ), and apparently also in vivo (6 ,8 ). However, these systems are not expected to be useful for the sort of engineered circle formation described in this paper as there is no evidence that the spliceosome can easily be modified to allow precise circularization of desired sequences. In fact, given that certain RNA sequences are known to block cis-splicing of pre-mRNA templates (39 ) it is likely that exons containing these sequences will not be circularizable through spliceosome-directed inverse splicing. Of course, this expectation reflects an assumption that cis-splicing and inverse splicing utilize at least some of the same spliceosomal machinery. There is currently no evidence to support or refute this assumption but, from a purely theoretical standpoint it seems unlikely that the two reactions are mechanistically unrelated.
One additional difficulty that might be encountered in attempts to use pre-mRNA splicing systems to produce circular RNA is that the spliceosome is known to sometimes utilize cryptic 5' or 3' splice sites during cis splicing. If the same machinery is utilized in inverse splicing, it is unlikely that any desired RNA will be circularizable with absolute fidelity.
In light of the above discussion, it is clear that the present report describes a uniquely useful system for preparing circular RNA molecules in vitro. We are currently investigating whether the group II intron can also be utilized to efficiently produce circular RNAs in vivo. As mentioned above, such RNAs might act as stable translatable templates, or alternatively, might participate either as ribozymes or as antisense RNAs, in regulating splicing or other cellular events (such as transcription or translation). The ability to produce particular circular RNAs in vivo might therefore provide a new tool for gene therapy and/or antisense pharmaceutical studies.
The present study also provides certain mechanistic information about the group II intron inverse splicing reaction, and points to avenues for future research. Overall, the present data demonstrate that the group II intron will precisely circularize a selected RNA sequence even though that sequence bears no relationship to the sequences with which the intron is naturally associated. However, although we found that circularization by inverse splicing proceeds as efficiently as cis-splicing when the exon being utilized is the wild-type exon normally associated with the intron (19; Table 1), the reaction was significantly slower when the exon was totally foreign to the intron. It is possible that at least part of the observed difference in splicing efficiency reflects different compositions of the exons rather than any feature directly related to the exon being foreign to the intron. The wild-type yeast exon used in PY1 is 594 nt long [note that, due to an error that occurred when the PY1 sequence was entered into a computer data file, the length of the PY1 exon was previously reported to be 591 nt (4)], whereas the human K1 exon used in both PY8 and PY9 is only 267 nt long. Also, the PY1 exon has low (14%) G+C content, and the PY8/PY9 exon has high (63%) G+C content. We are currently investigating the effects of exon length and G+C content on the efficiency of group II intron splicing (both forward and inverse).
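For readers who wish to make the same kind of composition comparison on their own sequences, G+C content can be computed as sketched below. The full exon sequences are not reproduced in this text, so the example inputs are two primer sequences from Materials and Methods (one K1-derived, one intron-derived), used purely as illustrations; they do not reproduce the 14% and 63% values quoted above.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C residues in a DNA or RNA sequence (0.0 to 1.0)."""
    s = seq.upper()
    if not s:
        raise ValueError("sequence must not be empty")
    return (s.count("G") + s.count("C")) / len(s)


# Illustrative inputs only: primer sequences, not the exons themselves.
K1_CIR_2 = "GGCCAGACGCCATCAGGCTG"          # K1-derived primer, 20 nt
I5_29 = "TATTATTTATGATAACTTTCAGACC"        # intron-derived primer, 25 nt

k1_fraction = gc_content(K1_CIR_2)      # 14/20 = 0.70
intron_fraction = gc_content(I5_29)     # 6/25 = 0.24
```

Even these short fragments show the contrast between the GC-rich human sequence and the AT-rich yeast mitochondrial sequence that the paragraph describes.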
We are also investigating whether intron domain 4 might contribute to the efficiency of inverse splicing under certain conditions. Previous work has shown that domain 4 is not essential for forward cis- or trans-splicing, though its deletion does slow the rate of the second step of splicing (3 ). None of the inverse splicing constructs utilized in the present study includes domain 4. When the exon in the inverse splicing construct is the wild-type yeast exon naturally associated with the intron (PY1), deletion of domain 4 has no apparent effect on the efficiency of inverse splicing as compared with cis-splicing of a template where the intron does have domain 4 (WT; see Table 1 ). Nonetheless, we feel that it is possible that the absence of domain 4 from our human-exon constructs (PY8 and PY9) contributes to their reduced splicing efficiency as a result of its role in the second step of splicing. The results described in this report show that, for PY8 and PY9, when the first step of inverse splicing occurs by hydrolysis, the second step of splicing is slow [i.e., the products of the first step, IVS 1-3 and (IVS 5,6)K1, accumulate to high levels]. We speculate that the absence of domain 4 contributes to the reduced efficiency of the second step of splicing in these constructs, so that it should be possible to increase the inverse splicing efficiency of these constructs by adding back the missing domain 4 sequences. We are currently testing that hypothesis.
The data presented in this report show that the EBS2-IBS2 interaction is not critical for inverse splicing, even when a heterologous exon is employed. In particular, the data show that PY9, which has a perfectly paired EBS2-IBS2 interaction, splices with an efficiency comparable to that of PY8, which has a mispaired EBS2-IBS2 interaction. This finding is consistent with previous observations that the EBS2-IBS2 interaction is not essential for forward (25) or inverse (our unpublished results) splicing with the wild-type intron and its natural yeast exons. We repeated the experiment in the context of the present study because our early experiments with PY8 indicated that it splices more slowly than does the wild-type construct (Table 1) and we considered the possibility that a more dominant role for the EBS2-IBS2 interaction might have emerged in our artificial construct. This study shows that the absence of a strong EBS2-IBS2 interaction is not the reason that the efficiency of PY8 splicing is reduced.
The rate data reported in this study must be considered in light of already-published data on group II intron splicing. Two recent reports have described kinetic analyses of the group II intron in vitro (18,19). Unfortunately, each of these studies used a different approach to analyzing the data, and the rates that they report for forward splicing of a wild-type group II intron differ significantly. Specifically, Boulanger and coworkers (18) used an equation (equation 1) that defined a specific percentage of the RNA as inactive, and then calculated a rate for the active fraction. They observed rates for the wild-type RNA of 0.3 min⁻¹ and 0.18 min⁻¹ in 1.5 M (NH4)2SO4 and 1.5 M KCl, respectively. In both cases, they found that just over half the RNAs were active. Daniels and coworkers (19) used an equation (equation 2) that calculates rates for fast and slow populations within the RNA, and also determines the fraction of RNA in each population. They found the wild-type RNA fast populations to splice at rates of 0.047 min⁻¹ in 0.5 M (NH4)2SO4 and 0.028 min⁻¹ in 0.5 M KCl. In (NH4)2SO4 they also report that a small percentage (13%) of the wild-type RNA reacted in a 'burst' with a rate of 0.22 min⁻¹; no burst was observed in KCl. They found ~40% of the RNA to be in the fast population in each case. Thus, Boulanger et al. observed rates ~5-10-fold faster than those of Daniels et al. The differences in salt conditions are unlikely to explain this disparity, since we find that raising the salt concentration from 0.5 to 1.5 M has only a modest effect on the rate of the splicing reaction (unpublished results).
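The size of the disparity between the two reports can be checked with simple arithmetic on the rates quoted above. The short Python sketch below does only that; the variable names are hypothetical, and the comparison deliberately ignores the difference in salt concentration (1.5 M versus 0.5 M) between the two studies.

```python
# Wild-type forward-splicing rates (min^-1) as quoted in the text above.
boulanger = {"(NH4)2SO4": 0.30, "KCl": 0.18}        # 1.5 M salt (ref. 18)
daniels_fast = {"(NH4)2SO4": 0.047, "KCl": 0.028}   # 0.5 M salt, fast population (ref. 19)

# Ratio of the Boulanger rate to the Daniels fast-population rate, per salt.
fold = {salt: boulanger[salt] / daniels_fast[salt] for salt in boulanger}

for salt, ratio in fold.items():
    print(f"{salt}: {ratio:.1f}-fold")
```

Both ratios come out at about 6.4, which falls within the ~5-10-fold range stated above.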
We analyzed our data using both the Boulanger approach and the Daniels approach. Using equation 1 we find rates [0.53 min⁻¹ in 1.5 M (NH4)2SO4 and 0.17 min⁻¹ in 1.5 M KCl] very similar to those calculated by Boulanger et al., but the equation does not fit our data well (we observe r² values of 0.81 and 0.61, whereas Boulanger et al. reported r² = 0.99). Equation 2 shows somewhat better fits with our data (r² = 0.80 and 0.89) and gives fast population rates of 0.88 min⁻¹ in 1.5 M (NH4)2SO4 and 0.32 min⁻¹ in 1.5 M KCl, with about half of the RNA being in the fast population. Although we cannot explain the differences between the Boulanger and Daniels reports, we find that our data fit better with the Daniels equation (equation 2) but give rate coefficients very close to those of Boulanger et al.
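The difference between the two analysis approaches can be made concrete with a short Python sketch. Equations 1 and 2 are not reproduced in this section, so the functional forms below are assumptions: they express the fraction of unreacted precursor over time in the general way the text describes (one active population plus an inert fraction, versus parallel fast and slow populations) and may differ in detail from the published equations. The function names, parameter values and time course are all hypothetical, and no curve fitting is performed.

```python
import math


def one_population(t, k, active):
    """Assumed form of equation 1: a fraction `active` reacts at rate k; the rest is inert."""
    return active * math.exp(-k * t) + (1.0 - active)


def two_populations(t, k_fast, k_slow, fast):
    """Assumed form of equation 2: fast and slow populations react in parallel."""
    return fast * math.exp(-k_fast * t) + (1.0 - fast) * math.exp(-k_slow * t)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic time course (minutes) generated from the two-population model.
# Parameter values are illustrative only, not fitted values from this study.
times = [0, 1, 2, 5, 10, 20, 40]
observed = [two_populations(t, 0.9, 0.02, 0.5) for t in times]

# Score each model against the synthetic data with fixed, hand-picked parameters.
r2_one = r_squared(observed, [one_population(t, 0.5, 0.5) for t in times])
r2_two = r_squared(observed, [two_populations(t, 0.9, 0.02, 0.5) for t in times])
print(f"one population: r2 = {r2_one:.3f}; two populations: r2 = {r2_two:.3f}")
```

In practice the parameters of each model would be estimated by nonlinear least-squares fitting before r² values are compared; the sketch only shows how the two model shapes and the goodness-of-fit statistic relate.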
As described in the Introduction, our goals in this study were two-fold: to identify reaction conditions that allow efficient production of large amounts of circular RNAs, and to identify reaction conditions that minimize inverse splicing as a competing reaction in gene assembly experiments. The data presented here show that trans-splicing products, as compared with inverse splicing products, are most abundant when splicing precursors are incubated in either (NH4)2SO4 or NH4Cl buffer. However, inverse splicing is by no means abolished under these conditions. We are therefore investigating alternative methods by which we can inhibit inverse, but not trans-, splicing. In experiments that will be described elsewhere, we have found that inverse splicing can be completely blocked by the addition of a specific antisense RNA. This finding ties in neatly with our investigation of the in vivo role for circular RNAs, as it is possible that inverse splicing reactions can be utilized to provide stable antisense RNAs in vivo that can then affect the splicing patterns of other transcripts for which competing reactions reduce production of a desired spliced product.
We thank Brenda Jarrell for editing of the manuscript. This work was supported by National Science Foundation Grant MCB 9604458 and National Institutes of Health Grant GM52409-01A1.
1 Peebles, C. L., Perlman, P. S., Mecklenburg, K. L., Petrillo, M. L., Tabor, J. H., Jarrell, K. A. and Cheng, H. L. (1986) Cell, 44, 213-223.
2 van der Veen, R., Arnberg, A. C., van der Horst, G., Bonen, L., Tabak, H. F. and Grivell, L. A. (1986) Cell, 44, 225-234.
3 Jarrell, K. A., Dietrich, R. C. and Perlman, P. S. (1988) Mol. Cell. Biol., 8, 2361-2366.
4 Jarrell, K. A. (1993) Proc. Natl. Acad. Sci. USA, 90, 8624-8627.
5 Jarrell, K. A., Peebles, C. L., Dietrich, R. C., Romiti, S. L. and Perlman, P. S. (1988) J. Biol. Chem., 263, 3432-3439.
6 Capel, B., Swain, A., Nicolis, S., Hacker, A., Walter, M., Koopman, P., Goodfellow, P. and Lovell-Badge, R. (1993) Cell, 73, 1019-1030.
7 Nigro, J. M., Cho, K. R., Fearon, E. R., Kern, S. E., Ruppert, J. M., Oliner, J. D., Kinzler, K. W. and Vogelstein, B. (1991) Cell, 64, 607-613.
8 Cocquerelle, C., Mascrez, B., Hetuin, D. and Bailleul, B. (1993) FASEB J., 7, 155-160.
9 Bohjanen, P. R., Colvin, R. A., Puttaraju, M., Been, M. D. and Garcia-Blanco, M. A. (1996) Nucleic Acids Res., 24, 3733-3738.
10 Puttaraju, M., Perrotta, A. T. and Been, M. D. (1993) Nucleic Acids Res., 21, 4253-4258.
16 Doolittle, R. F. (1995) Annu. Rev. Biochem., 64, 287-314.
17 Ny, T., Elgh, F. and Lund, B. (1984) Proc. Natl. Acad. Sci. USA, 81, 5355-5359.
18 Boulanger, S. C., Faix, P. H., Yang, H., Zhuo, J., Franzen, J. S., Peebles, C. L. and Perlman, P. S. (1996) Mol. Cell. Biol., 16, 5896-5904.
19 Daniels, D. L., Michels, W. J., Jr and Pyle, A. M. (1996) J. Mol. Biol., 256, 31-49.
20 Sambrook, J., Fritsch, E. F. and Maniatis, T. (eds) (1989) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
21 Peebles, C. L., Benatan, E. J., Jarrell, K. A. and Perlman, P. S. (1987) Cold Spring Harbor Symp. Quant. Biol., 52, 223-232.
22 Freier, S. M., Kierzek, R., Jaeger, J. A., Sugimoto, N., Caruthers, M. H., Neilson, T. and Turner, D. H. (1986) Proc. Natl. Acad. Sci. USA, 83, 9373-9377.
23 Jacquier, A. and Jacquesson-Breuleux, N. (1991) J. Mol. Biol., 219, 415-428.
24 Perlman, P. S. and Podar, M. (1996) Methods Enzymol., 264, 66-86.
26 Ruskin, B. and Green, M. R. (1985) Science, 229, 135-140.
27 Harland, R. and Misher, L. (1988) Development, 102, 837-852.
28 Ford, E. and Ares, M., Jr (1994) Proc. Natl. Acad. Sci. USA, 91, 3117-3121.
29 Puttaraju, M., Beebe, J. A., Niranjanakumari, S., Been, M. D. and Fierke, C. A. (1995) Nucleic Acids Symp. Ser., 33, 92-94.
30 Puttaraju, M. and Been, M. D. (1995) Nucleic Acids Symp. Ser., 33, 152-155.
31 Davies, R. W., Waring, R. B., Ray, J. A., Brown, T. A. and Scazzocchio, C. (1982) Nature, 300, 719-724.
32 Cech, T. R., Tanner, N. K., Tinoco, I., Jr, Weir, B. R., Zuker, M. and Perlman, P. S. (1983) Proc. Natl. Acad. Sci. USA, 80, 3903-3907.
33 Waring, R. B., Scazzocchio, C., Brown, T. A. and Davies, R. W. (1983) J. Mol. Biol., 167, 595-605.
34 Doudna, J. A., Cormack, B. P. and Szostak, J. W. (1989) Proc. Natl. Acad. Sci. USA, 86, 7402-7406.
35 Michel, F., Umesono, K. and Ozeki, H. (1989) Gene, 82, 5-30.