Nucleic Acids Research Advance Access originally published online on February 7, 2007
Nucleic Acids Research 2007 35(6):e38; doi:10.1093/nar/gkm017
Nucleic Acids Research, 2007, Vol. 35, No. 6 e38
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Methods Online
Motif programming: a microgene-based method for creating synthetic proteins containing multiple functional motifs
1Department of Protein Engineering, Cancer Institute, Japanese Foundation for Cancer Research, Koto-ku, Tokyo 135-8550, Japan and 2CREST, Japan Science and Technology Agency (JST), c/o Cancer Institute
*To whom correspondence should be addressed. Tel: +81 3 3570 0489; Fax: +81 3 3570 0461; Email: kshiba{at}jfcr.or.jp
Received October 18, 2006. Revised December 20, 2006. Accepted January 2, 2007.
| ABSTRACT |
|---|
The presence of peptide motifs within the proteins provides the synthetic biologist with the opportunity to fabricate novel proteins through the programming of these motifs. Here we describe a method that enables one to combine multiple peptide motifs to generate a combinatorial protein library. With this method, a set of sense and antisense oligonucleotide primers were prepared. These primers were mixed and polymerized, so that the resultant DNA consisted of combinatorial polymers of multiple microgenes created from the stochastic assembly of the sense and antisense primers. With this motif-mixing method, we prepared a protein library from the BH1-4 motifs shared among Bcl-2 family proteins. Among the 41 clones created, 70% of clones had a stable, presumably folded expression product in human cells, which was detectable by immunohistochemistry and western blot. The proteins obtained varied with respect to both the number and the order of the four motifs. The method enables homology-independent polymerization of DNA blocks that coded motif sequences, and the frequency of each motif within a library can be adjusted in a tailor-made manner. This motif programming has a potential for creating a library with a large proportion of folded/functional proteins.
| INTRODUCTION |
|---|
Investigators in the emerging field of synthetic biology seek an understanding of biological systems that would be difficult to achieve using more conventional approaches and to construct novel systems and biomacromolecules that exhibit unparalleled behaviors, potentially leading to the development of new technologies (1). The artifacts synthesized include biomolecules (proteins, DNA or RNA) (2,3), replicators (4), gene circuits (5,6) and genomes (7), all of which have complex and dynamic structures composed of rather simpler block units. Natural versions of these entities are known to have been organized to their existing forms over the course of billions of years of evolution. Our knowledge about the principles governing this organizing process is limited, however, which is why the rational design of biological systems is still at a very rudimentary stage, and which is why the act of synthesis can deepen our understanding of the self-organizing principles involved.
In the in vitro evolution experiments that were conducted in the early 1990s, synthetic molecules were created by adopting a combinatorial approach (8) in which blocks of nucleic acids or amino acids were randomly assembled to prepare pools of random sequences, and functional molecules were selected from those pools. Although numerous nucleic acids-based enzymes have been created using such random sequence approaches, the emergence of novel proteins has been limited to a few examples (9). In contrast, exon-shuffling type or constrained DNA libraries constructed using biased sequences have proven to be more effective than random sequences for generating artificial proteins. The numerous approaches that have made use of biased libraries (10–16) can be generally classified into two groups: those requiring short DNA sequences that are commonly shared amongst DNA blocks for recombination and those enabling homology-independent polymerization. Homology-independent recombination of molecular units has proven to be a powerful tool for evolving novel structures and functions (12–16). Ideally, with this approach, the reaction conditions should be simple, a desired number of non-homologous genes should be used to construct the libraries, and the libraries should be user programmable.
We previously established a simple protein evolution system in which artificial proteins were synthesized from the combinatorial polymerization of peptide motifs (17–20). In our original protocol, motifs to be mixed were first embedded within different reading frames encoded by a single short DNA sequence, a microgene (Figure 1). Next, from the designer microgene, a pair of MPR (microgene polymerization reaction) primers was synthesized such that (i) the pair contained complementary bases in their 3' region; (ii) the sequence of the primer dimer obtained from the elongation reaction re-created the microgene block; and (iii) the primers had mismatched bases at their 3'-OH ends (shown by red letters with dots in Figure 1). These mismatched nucleotides at the 3'-OH ends of MPR primers are critical for successful polymerization of the microgene (17). During MPR, the primer pairs are reacted under conditions similar to PCR—i.e. a thermal cycle reaction is repeated in the presence of a thermostable archaeal DNA polymerase and dNTP (but without any template DNA). Without the 3'-OH end mismatch, the primer dimer that corresponds to one unit of the microgene would be amplified; including the 3'-OH end mismatch in the MPR primers and using a DNA polymerase having 3'–5' exonuclease activity enables large DNAs consisting of tandem repeats of the microgene to be synthesized. Moreover, nucleotide insertions and deletions randomly occur at end-joining junctions between microgenes, resulting in synthesis of combinatorial libraries of the three reading frames (motifs) from a single microgene. Although the detailed mechanism of MPR is not yet known, the reaction is apparently related to illegitimate recombination in which double-strand breaks in the DNA are joined. 
It is noteworthy in that regard that DNA polymerases such as the Klenow fragment of DNA polymerase I and Taq polymerase can serve as alignment proteins that juxtapose the two DNA ends so that DNA synthesis can proceed on discontinuously aligned DNA (21).
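The pairing geometry required of an MPR primer pair can be checked with a short script. The sketch below is illustrative only: it uses the two 3' association sequences quoted later in the Results for the BH primers (5'-CGGCGGGGA-3' and 5'-GCCCCGCCA-3'), and the function names are hypothetical, not part of any published tool.

```python
# Sketch: inspect how the 3' regions of a sense and an antisense MPR primer pair.
# Function names are hypothetical; sequences are the 3' association sequences
# quoted in the Results for the BH primers.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def pairing_pattern(sense_3prime, antisense_3prime):
    """Mark each sense position as base-paired ('|') or mismatched ('x').

    Both regions are written 5'->3' and must be the same length. The last
    character of the result is the sense 3'-OH end; the first character faces
    the antisense 3'-OH end.
    """
    if len(sense_3prime) != len(antisense_3prime):
        raise ValueError("regions must have equal length")
    ideal_partner = reverse_complement(antisense_3prime)
    return "".join(
        "|" if s == p else "x" for s, p in zip(sense_3prime, ideal_partner)
    )


pattern = pairing_pattern("CGGCGGGGA", "GCCCCGCCA")
print(pattern)  # seven paired bases flanked by one mismatch at each 3'-OH end
```

Under these assumptions the two regions form a 7-bp duplex with a single unpaired base at each 3'-OH end, which matches design rules (i) and (iii).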
Using this MPR method, we have previously reported that translations of microgene polymers, in which α-helix or β-sheet forming peptides were encrypted, produced proteins having secondary structures (18). In addition, we have demonstrated that bifunctional proteins that penetrate through cell membranes and exert a pro-apoptotic effect can be generated by combinatorially polymerizing two short peptide motifs respectively related to induction of apoptosis and protein transduction. Because simple linkage of these motifs was not sufficient to create a bifunctional peptide, and the successful reconstitution was dependent on how these motifs were joined together, the combinatorial polymerization strategy was shown to be important for reconstitution of function from mixtures of short sequence motifs (20). The original MPR method had an inherent limitation, however: the motif number was restricted to the three reading frames, and the creation of any combinatorial library was dependent on a randomly occurring frameshift (Figure 1). Because combinatorics created in this way originate from frameshift mutations that randomly occur at junctions of microgene polymers, the motifs to be mixed are embedded in different reading frames of the microgene. Consequently, the number of motifs that can be mixed is limited to three, the number of reading frames. In addition, random switching between reading frames is indispensable to the creation of combinatorics in this protocol (17–20). Here we describe a new motif-mixing protocol (outlined in Figure 2) that overcomes these inherent limitations and enables polymerization of more than three motifs. With this method, segments of microgenes (microgenescore) are first designed so that they encode peptide motifs. Thereafter, multiple MPR sense and antisense primers are designed based on these microgenescore.
Because these primers have sequences that allow formation of base pairs in their 3' regions, they can stochastically re-create multiple microgenes, so that when MPR is carried out, combinatorial polymers of multiple microgenes are generated. We applied this method to construct a combinatorial protein library from mixtures of four different peptide motifs (BH1–4). These short peptide sequences are conserved among Bcl-2 family proteins, which constitute a critical checkpoint in the intracellular signaling network regulating the process of mitochondria-dependent apoptosis (22,23). The proteins obtained with different length (68–250 amino acids) varied with respect to both the number and order of the four motifs. The frequency and the ratio of each motif within a library could be controlled in a tailor-made manner. We also demonstrated that these proteins were effectively expressed in human cells, and localized in the mitochondria in some cases.
| MATERIALS AND METHODS |
|---|
Library construction
Microgenes were designed so that they fulfilled the following rules: (i) none contained translation termination codons in any of their three reading frames; (ii) codons were chosen so that the microgenes would code for peptides having a propensity to form an α-helix in the third reading frame; and (iii) all had similar GC contents (55–65%). For the experiments schematically depicted in Figure 3, 0.4 µM sense- and antisense MPR primers, four dNTP (0.35 mM each) and 2.6 units of 3'–5' exo+ Vent DNA polymerase (New England Biolabs) were mixed in reaction buffer (10 mM KCl, 10 mM (NH4)2SO4, 20 mM Tris-HCl, 2 mM MgSO4, 0.1% TritonX-100, pH 8.8). The thermal program was initiated with 10 min at 94°C; polymerization was accomplished with 40 cycles of 94°C for 10 s and 55°C for 1 min, and terminated with 7 min at 69°C. The resultant microgene polymers were analyzed by agarose (1%) gel electrophoresis. Similar conditions were used for the experiments in Figure 4 and Table 1 except that the cycling protocol included 1 min at 72°C instead of 55°C. To monitor the incorporation of motifs into the polymers, MPR products were digested with appropriate restriction enzymes (see the Results section) and analyzed by agarose (1%) gel electrophoresis.
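Design rules (i) and (iii) are simple enough to screen automatically. The following is a minimal sketch, not the CyberGene program: the function names and the test sequence are hypothetical, only the three forward reading frames of a single microgene are scanned, and rule (ii), helix propensity, is not assessed.

```python
# Sketch: screen a candidate microgene against rules (i) and (iii).
# Hypothetical helper names; this is not the CyberGene implementation.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_stop_in_any_frame(seq):
    """Return True if any of the three forward reading frames holds a stop codon."""
    seq = seq.upper()
    for frame in range(3):
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return True
    return False


def gc_fraction(seq):
    """Return the fraction of G and C bases in the sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_design_rules(seq, gc_min=0.55, gc_max=0.65):
    """Apply rule (i), no stop codons, and rule (iii), GC content within range."""
    return (not has_stop_in_any_frame(seq)) and gc_min <= gc_fraction(seq) <= gc_max


candidate = "GCAGAAGCAGAGGCTGCCAA"  # hypothetical 20-nt test sequence
print(passes_design_rules(candidate))
```

A fuller check would also scan codons that span the junctions between polymerized units, which this sketch ignores.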
Cloning and expression of microgene polymers
Microgene polymers were directly cloned into the pcDNA3.1 directional TOPO expression vector (Invitrogen), which enables the directional ligation of DNA fragments with CACC tetranucleotides at their 5' end. This vector was then used to initiate translation at an initiation codon (ATG) located at position 5–7 of the first microgene unit within each polymer. The ligated plasmids were then used to transform E. coli TOP-10 cells (Invitrogen), after which the cloned sequences were determined using a CEQ2000XL DNA analyzer (Beckman). We noticed that clones in pool-a unintentionally contained sequences derived from the antisense primer BH3Bcl-xL-AS (5'-GGGAAGCTTGAATTCGTCGCCGGCTTCGCGCAAGCCCCGCCA-3') (5 out of 13 clones). However, the embedded BH3 motif from Bcl-xL only appeared in clone a11; other clones translated the second and/or third reading frames. The sequenced microgene polymers were then cut from the original vector using BamHI and EcoRV, and sub-cloned into one of the three vectors (pcDNA 3.1/myc-His A, B or C; Invitrogen) to add a myc epitope and a poly-histidine tag at the C-terminal ends of the microgene products, and the resultant plasmids were transfected into MCF-7 cells using lipofectamine 2000, according to the manufacturer's instructions (Invitrogen).
Cell lines
MCF-7 (a human breast cancer line) cells were cultured in RPMI 1640 (GIBCO) supplemented with 10% fetal bovine serum (FBS, Morigate) and antibiotic/antimycotic solution (SIGMA, A5955) at 37°C in humidified air containing 5% CO2.
Immunohistochemical analysis
For the immunohistochemical analysis in Figure 5C, cells were fixed in methanol for 10 min at room temperature. The fixed cells were washed with PBS and then incubated in blocking solution (PBS with 10% goat serum) for 1 h at room temperature. The cells were then incubated for 1 h at 37°C with anti-penta-his antibody with Alexa Fluor 488 conjugate (1:200, QIAGEN). Stained samples were examined using a confocal laser scanning microscope.
| RESULTS |
|---|
Outline of motif programming
The procedure for mixing more than three motifs explored in this study consists of three processes (Figure 2). In the first two processes microgenescore that each encode a peptide motif (Motif A–D) in its first reading frames were designed, after which sense and antisense MPR primers were synthesized based on those microgenescore [in Figure 2, Motifs A and B were embedded in the sense primer (AS and BS), while motifs C and D were embedded in the antisense primers (CAS and DAS)]. These primers also contained additional sequences at their 3' ends that allowed formation of base pairs between sense and antisense primers, but contained mismatched bases at their 3'-OH ends (indicated by red letters with dots). In the third process, thermal cycling was carried out with the MPR primers, a thermostable DNA polymerase having 3'–5' exonuclease activity and dNTP, which was shown to efficiently polymerize a microgene unit created from the sense and antisense MPR primers. With this new protocol, multiple microgenes were created by stochastic base pairing of the MPR primers, enabling their combinatorial polymerization. In the example shown in Figure 2, two sense (AS and BS), and two antisense (CAS and DAS) MPR primers yield four distinct microgene units (containing motif-dimers A–C, A–D, B–C and B–D) that are combinatorially polymerized in the MPR process. Alternatively, three microgene units, A–B, A–C, and A–D could be formed by using one sense primer (AS) and three antisense primers (BAS, CAS and DAS).
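The set of microgene units that a given primer mixture can form is the Cartesian product of the sense and antisense primers. A minimal sketch follows, using the motif labels from Figure 2; the function name is hypothetical.

```python
# Sketch: enumerate the microgene units that can form when every sense primer
# can pair with every antisense primer. The function name is hypothetical.
from itertools import product


def microgene_units(sense_motifs, antisense_motifs):
    """Return all sense-antisense motif dimers, e.g. 'A-C'."""
    return [f"{s}-{a}" for s, a in product(sense_motifs, antisense_motifs)]


print(microgene_units(["A", "B"], ["C", "D"]))   # two sense, two antisense primers
print(microgene_units(["A"], ["B", "C", "D"]))   # one sense, three antisense primers
```

The number of distinct units is the number of sense primers multiplied by the number of antisense primers, so four units arise from the 2 x 2 design and three from the 1 x 3 design.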
We initially assessed the practicality of the new protocol using four short (16–17 nt) MPR primers (Figure 3A). Six bases (GGCGGG) in the 3' region of the sense primer-A (AS) were complementary to the 3' regions of antisense primers-B, C and D (BAS, CAS and DAS) and, therefore, pairwise combination yielding A–B, A–C and A–D should stochastically occur. We then mixed these primers in the combinations shown in Figure 3B and ran the MPR protocol to obtain high molecular weight DNAs (Figure 3B, lanes 1–6). A pair of MPR primers (one sense and one antisense) was needed to synthesize large DNAs (lanes 2–6). In this pilot experiment, we also introduced restriction endonuclease recognition sequences for EcoRI, HindIII, BglII and XbaI into primers-AS, -BAS, -CAS and -DAS, respectively, which enabled us to distinguish the DNA units incorporated into the high molecular weight DNAs by simple restriction enzyme digestion. If the DNAs prepared from primers-AS and -BAS were digested into small fragments by HindIII but not by BglII or XbaI, it would indicate that the polymers contained the primer-BAS unit (Figure 3C lanes 1–4). Similarly, polymers comprising combinations of primer-AS and -CAS or -DAS were digested only by BglII or XbaI, respectively (lanes 5–12), polymers made from a mixture of primers-AS, -BAS and -CAS were digested by both HindIII and BglII (lanes 13–16), and polymers made from all four primers were digested by HindIII, BglII and XbaI (lanes 17–20). We then cloned the high molecular weight DNAs into a vector and determined the DNA sequences of some clones, which confirmed that these DNAs were indeed polymers of the three microgenes created from combinations of the four primers used (data not shown). These results indicated that multiple microgenes can be combinatorially polymerized using multiple MPR primers.
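The logic of the restriction-digest assay can be mimicked in silico by asking which recognition sites occur in a polymer. In the sketch below the recognition sequences are the standard ones for the four enzymes, but the unit sequences are hypothetical stand-ins (the real primer sequences are given only in Figure 3A), and the function name is an assumption.

```python
# Sketch: infer which primer-derived units a polymer contains from the
# restriction sites present. Unit sequences are hypothetical placeholders.

RECOGNITION_SITES = {
    "EcoRI": "GAATTC",    # marker of primer-AS
    "HindIII": "AAGCTT",  # marker of primer-BAS
    "BglII": "AGATCT",    # marker of primer-CAS
    "XbaI": "TCTAGA",     # marker of primer-DAS
}


def enzymes_that_cut(dna):
    """Return the names of enzymes whose recognition site occurs in the DNA.

    All four sites are palindromic, so scanning one strand is sufficient.
    """
    dna = dna.upper()
    return {name for name, site in RECOGNITION_SITES.items() if site in dna}


ab_unit = "GAATTCGGCGGGAAGCTT"  # hypothetical A-B unit (EcoRI ... HindIII)
ac_unit = "GAATTCGGCGGGAGATCT"  # hypothetical A-C unit (EcoRI ... BglII)

print(sorted(enzymes_that_cut(ab_unit * 5)))                      # A-B polymer
print(sorted(enzymes_that_cut(ab_unit + ac_unit + ab_unit)))      # mixed polymer
```

A polymer built only from A–B units is predicted to be cut by HindIII but not by BglII or XbaI, which is the pattern the gel assay is designed to detect.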
BH library construction
We next used the newly developed protocol to construct artificial protein libraries containing mixtures of BH1–4 peptide motifs. Although BH1–4 are unambiguously conserved between Bcl-2 family proteins, there is diversity among the identities of the amino acids in each motif. For BH1, BH2 and BH4, we extracted motif sequences composed of 8 amino acids from human Bcl-xL (24), and for BH3 we extracted a 9-amino-acid sequence from human Noxa (25) (BH1core–BH4core, Figure 4A and B). We chose the BH3Noxa motif to construct a library because the simple conjugation of BH3Noxa and the protein transduction domain of Tat protein (PTDTat) has been shown to penetrate into cells but fail to induce apoptosis, whereas a combinatorial library constructed from these motifs contained bifunctional proteins (20). We initially used these natural amino acid sequences to design a set of microgenescore that independently encoded BH1core–BH4core in their first reading frames (Figure 4B). Degeneracy in the genetic code allowed us to choose a set of codons such that the other two frames of the genes did not contain any termination codons, and the peptides encoded by the third reading frame had a propensity to form α-helical structures, which we expected would help the structural formation of synthetic proteins (20). To design such microgenes, we have previously developed the microgene design program CyberGene (19). Using CyberGene, sequences meeting the above criteria were selected in silico from all possible sequences encoding the motifs. We then synthesized MPR primers based on these core sequences (Figure 4C). The core sequences that coded for the BH motifs were flanked by 3' association sequences (5'-CGGCGGGGA-3' and 5'-GCCCCGCCA-3' for the sense and antisense primers, respectively) and recognition sites for restriction endonucleases, and a CACCATG sequence at the 5'-terminus of the sense primers enabled directional cloning and provided a translation initiation codon, yielding primers BH3S, BH4S, BH1AS, BH2AS and BH3AS (Figure 4C). We also synthesized derivatives of antisense primers that had an extra CC at their 5' termini (BH1AS+, BH2AS+ and BH3AS+) so that the reconstituted microgenes would have lengths that were multiples of three. Polymerization of such microgenes would maintain the same reading frame unless frameshift mutations randomly occur at junctions of microgene polymers. Therefore, we expected that a library derived from BHAS+ primers would contain more peptide motifs compared with that derived from BHAS primers. Addition of these appendix sequences did not create termination codons in the microgenes created (Figure 4D). Thus, the BHcore-coding sequences were connected to 3' association sequences and restriction endonuclease sites, producing a set of sense (BH3S and BH4S) and antisense (BH1AS(+), BH2AS(+) and BH3AS(+)) primers (Figure 4C).
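The effect of the extra CC on the reading frame can be illustrated with modular arithmetic. In this sketch the unit lengths (58 and 60 nt) are hypothetical values chosen only because they differ by two bases; the actual microgene lengths are given in Figure 4 and Table 1. The function name is also an assumption, and junction insertions or deletions are ignored.

```python
# Sketch: which reading frame of each microgene unit is translated when
# identical units are joined head to tail. Unit lengths are hypothetical.


def frame_of_each_unit(unit_length, n_units):
    """Return, for each unit, the offset (0, 1 or 2) of its first in-frame codon.

    Offset 0 means the unit is read in its first reading frame, 1 in its
    second and 2 in its third, assuming translation starts in frame 0 of the
    first unit and no insertions or deletions occur at junctions.
    """
    return [(-k * unit_length) % 3 for k in range(n_units)]


print(frame_of_each_unit(58, 4))  # length not a multiple of three: frame changes at each junction
print(frame_of_each_unit(60, 4))  # two extra bases: frame is maintained
```

With a unit length that is not a multiple of three, only every third unit is read in the motif-encoding first frame, which is why a library made with the BHAS+ primers is expected to display more motifs per clone.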
Using these MPR primers, we tested four conditions for polymerization by changing the ratios and lengths of the primers (Table 1). In the first condition (pool-a), BH1AS, BH2AS, BH3AS and BH4S primers were mixed at a ratio of 2:1:100:100. With this combination of primers, each re-created microgene should contain the BH4core sequence because each of the three antisense primers (BH1-3 AS) must associate with the BH4S primer for the MPR polymerization to proceed. Confirming this configuration, digestion of the resultant high molecular weight DNAs using SalI, which cut the BH4 unit, yielded small DNA fragments whose sizes corresponded to single microgene units (Figure 4E, lane 1 versus 2). Because the association of the three antisense primers with BH4S was basically a stochastic event, we expected that the microgene containing BH3core would predominate over those containing BH1core or BH2core, as there was 50 and 100-fold more BH3AS present than BH1AS or BH2AS, respectively. As expected, digestion of microgene polymers by HindIII (a marker of BH3AS) yielded smaller DNA fragments than did EcoRI or BglII (markers of BH1AS and BH2AS, respectively) (Figure 4E, lanes 3–5), indicating a high content of BH3core among the polymers.
In the second condition (pool-b), we used two sense primers (BH3S and BH4S) and two antisense primers (BH1AS and BH2 AS), in which stochastic associations of the four primers would yield four types of microgenes: BH3–BH1, BH3–BH2, BH4–BH1 and BH4–BH2 (Figure 4D). Confirming this condition did indeed polymerize all four microgenes, we observed efficient digestion of microgene polymers by SalI (BH4S), XhoI (BH3S), EcoRI (BH1AS) and BglII (BH2AS) (Figure 4E, lanes 6–10). In the third (pool-c) and fourth conditions (pool-d), we used three antisense (BH1AS+, BH2AS+, BH3AS+) and one sense primers (BH4S). In these conditions, we were able to regulate the frequency of each motif within a library by changing the concentrations of the MPR primers. For instance, when we mixed BH1AS+, BH2AS+ and BH3AS+ at a ratio of 1:1:1 with BH4S (pool-c, Table 1), endonuclease digestion indicated that BH2core (BglII) was most abundant within the polymers, while little BH3core (HindIII) was present (Figure 4E, lanes 12–14). This was confirmed by analyzing the sequences of randomly chosen clones (Figures 6A and 5A, pool c). In contrast, when we mixed BH1AS+, BH2AS+ and BH3AS+ at a ratio of 1:0.75:5 (pool-d, Table 1) to increase the ratio of BH3core and decrease the ratio of BH2core in the pool, both restriction endonuclease digestion (Figure 4E, lanes 16–18) and analysis of the motif content (Figures 6A and 5A, pool d) confirmed BH3core to be the most abundant sequences within the polymers. Thus, our method enables homology-independent polymerization of DNA blocks, in which the frequency of each block within a library can be adjusted by changing the concentrations of MPR primers.
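As a point of reference for the primer ratios, one can compute what a purely concentration-proportional model would predict. This is a deliberately naive sketch and an assumption on our part, not a model proposed in the text: the equimolar pool-c result, in which BH2core predominated and little BH3core was found, shows that incorporation is not simply proportional to primer concentration. The function name is hypothetical.

```python
# Sketch: naive expectation that each antisense primer is incorporated in
# proportion to its share of the antisense primer mixture. This assumption is
# NOT supported by the pool-c data and serves only as a baseline.


def naive_unit_fractions(antisense_ratio):
    """Return each primer's share of the total antisense primer amount."""
    total = sum(antisense_ratio.values())
    return {name: amount / total for name, amount in antisense_ratio.items()}


pool_a = {"BH1AS": 2, "BH2AS": 1, "BH3AS": 100}
pool_c = {"BH1AS+": 1, "BH2AS+": 1, "BH3AS+": 1}
pool_d = {"BH1AS+": 1, "BH2AS+": 0.75, "BH3AS+": 5}

for label, pool in (("a", pool_a), ("c", pool_c), ("d", pool_d)):
    shares = naive_unit_fractions(pool)
    print(label, {name: round(share, 3) for name, share in shares.items()})
```

Comparing such a baseline with the observed motif content is one way to see why the ratios had to be adjusted empirically, as was done for pool-d.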
Properties of synthetic proteins
From the prepared libraries, we randomly selected 41 clones (13, 5, 10 and 13 clones from pools-a, -b, -c and -d, respectively) and determined their DNA sequences. The predicted polypeptides contained various combinations of the four BH motifs translated from 68 to 250 codons (Figure 6A). In the experiments in which pools-a and -b were prepared, the reading frame of the polymers was altered at every junction between the microgene units, unless the junction contained insertion/deletion mutations, because the lengths of the microgenes were not multiples of three. By contrast, the microgenes in pools-c and -d were designed to maintain the reading frame throughout the polymers (Figure 4C and Table 1). Reflecting this design, clones obtained from pools-a and -b contained fewer BH motifs (55/18 = 3.1 motifs per clone) than those from pools-c and -d (134/23 = 5.8 motifs per clone) (Figure 6A).
We next transfected each of the 41 clones into MCF-7 cells to investigate the expression profiles of the artificial proteins. After sub-cloning each polymer into an expression vector that added a myc epitope and a poly-histidine tag at its C-terminus, we transfected the polymers into MCF-7 cells and then detected the translated products using an anti-myc antibody. Of the 41 clones, 28 (~70%) yielded proteins that were immunohistochemically detectable (Figure 6, clones with black bold bars). We also analyzed by western blotting the expression of 10 randomly selected clones (a8, a10, a12, c17, d5, d11, d16, d19, d26, d29) that were immunohistochemically detectable, confirming that the predicted polypeptides were expressed in cells (Figure 6B and data not shown). It is noteworthy that clones obtained from pools-c and -d, which contained fewer peptides encoded by the second and third reading frames, were detected at higher rates than those obtained from pools-a and -b (87 versus 44%).
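The summary statistics quoted for the 41 clones can be cross-checked with a few lines of arithmetic. The per-pool counts of detectable clones are not stated in the text; in this sketch they are inferred from the reported percentages (44% for pools-a and -b, 87% for pools-c and -d) and should be read as an assumption.

```python
# Sketch: cross-check the reported clone statistics. Clone and motif totals
# come from the text; per-group detected counts are INFERRED from percentages.

clones_ab, clones_cd = 13 + 5, 10 + 13   # 18 and 23 clones
motifs_ab, motifs_cd = 55, 134           # BH motifs counted per group

motifs_per_clone_ab = motifs_ab / clones_ab
motifs_per_clone_cd = motifs_cd / clones_cd

# Inferred numbers of immunohistochemically detectable clones.
detected_ab = round(0.44 * clones_ab)
detected_cd = round(0.87 * clones_cd)
detected_total = detected_ab + detected_cd

print(round(motifs_per_clone_ab, 1), round(motifs_per_clone_cd, 1))
print(detected_ab, detected_cd, detected_total)
print(round(100 * detected_total / (clones_ab + clones_cd)))
```

The inferred counts add up to the 28 detectable clones reported overall, which is consistent with the higher detection rate belonging to pools-c and -d.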
An intriguing observation was the mitochondrial localization of the synthetic a8 protein (Figure 5B and C). At 24 h after transfection of the a8 plasmid into MCF-7 cells, cells stained with the anti-penta-his antibody (green) and MitoTracker™ (red) exhibited overlapping distributions of a8 and mitochondria, indicating that the a8 protein associated with the mitochondria (Figure 5C). The colocalization of a8 and mitochondria was also confirmed by immunofluorescence using the anti-c-myc antibody (data not shown). Because we did not incorporate a mitochondrial targeting sequence into the designed microgenes, it is likely that the a8 protein has either acquired a synthetic signal sequence or interacted with proteins that are targeted to mitochondria. We also observed that other pool-a proteins (a10, a12 and a14) localized in mitochondria, but some clones from the other pools (b–d) localized in different organelles (data not shown). Although further analyses are required to investigate the structural and functional properties of these proteins, it is noteworthy that synthetic proteins made by the motif programming method can be effectively expressed in human cells and, in some cases, localized to a specific organelle.
| DISCUSSION |
|---|
Our synthesis of artificial proteins entails a hierarchical approach to in vitro protein evolution in which peptide blocks (microgenes) are combinatorially polymerized to construct a protein library. This approach contrasts with the first generation of in vitro evolution systems in which a large pool of random sequences prepared by combinatorial polymerization of nucleotide or amino acid units were used as naïve libraries (8). Although the selection from naïve sequences strategy has enabled creation of peptide binders (26), catalytic peptides (27) and even a small protein (9), the larger, modular structures of existing natural proteins suggest that they did not directly arise from random sequences, but developed hierarchically from assemblages of smaller primordial genetic units; that is to say, primordial microgenes endowed with rudimentary activities initially emerged from naïve sequences, after which these microgenes served as building blocks for the larger, more exquisite genes that evolved from their combinatorial assemblage. The exon theory of genes, which proposes that polymerization of exons via their flanking introns (exon shuffling) gave rise to sets of genes, is fully compatible with this notion (28).
In our system, we used peptide blocks (motifs) that were preliminarily correlated with a particular function or structure and then combined to make biased libraries. These peptide blocks can include (i) motifs identified from natural proteins; (ii) motifs artificially created by the first generation evolution system; and (iii) motifs rationally designed through protein engineering. We named this operation motif programming. In some ways, motif programming resembles protein engineering in which a rational de novo design of a novel protein is sought. However, previous studies by others and ourselves have shown that when it comes to manifesting a desired function, peptide motifs are capricious, as they are strongly influenced by their context within the artificial proteins (20,29). Therefore, with our incomplete knowledge on the structure–function relationships of proteins, the combinatorial or irrational approach is still very important (10). But while motif programming emphasizes the selection from libraries, because it starts with biased libraries, not naïve ones, the size of library to be screened should be small compared to those in first generation evolution systems.
Several methods have been developed to shuffle DNA blocks to make protein libraries. For instance, Stemmer's DNA shuffling has been widely used to improve the properties of existing enzymes, cytokines etc. (11). This method is essentially in vitro homologous recombination among a family of genes, and with it DNA blocks are difficult to polymerize combinatorially; that is, it can create polymers of A–B'–C from recombination between A–B–C and A'–B'–C, but cannot produce B'–C'–A or A'–C'–C–A. In the years following Stemmer's innovation, several other methods were proposed to enable combinatorial polymerization of DNA blocks. These methods can be broadly classified into two groups: one that requires short DNA sequences commonly shared among DNA blocks for recombination, and one that enables homology-independent polymerization. The latter includes SHIPREC (13), ITCHY (14), Y-ligation (15), NRR (16) and our MPR method (17). The differences between MPR and the other methods are that (i) the reaction conditions of MPR are rather simple and (ii) the proteins created by MPR are repetitious, which seems to contribute to the emergence of structured proteins (18). Originally, MPR had an inherent restriction, however: the number of motifs that could be embedded was limited to three, the number of reading frames, and the creation of any combinatorial library was dependent on a randomly occurring frameshift (Figure 1). Although we have previously succeeded in synthesizing functional proteins using the MPR method (20), these characteristics may in some cases limit its potential for generating synthetic proteins with diverse functions. To overcome that limitation, we developed a new method for microgene polymerization in which a desired number of MPR primers are used to stochastically create two or more microgenes in a single reaction tube. This, in turn, enables polymerization of more than three motifs (Figure 2).
Since we have previously demonstrated that β-sheet or copper-binding peptide motifs could be incorporated into synthetic proteins by the original MPR method (18,19), the scope of the new method is not explicitly restricted to α-helical motifs and proteins. Therefore, in principle, the method can use any peptide sequences for generating a protein library. Moreover, the method can control the frequency of each motif within a library by simply changing the concentration of MPR primers. Such a tailor-made library had not been obtained by the methods reported previously. The new method can also increase the number of different sequences (complexity) in a library compared to the former MPR method. For example, polymers consisting of a 10-mer of the single microgene (Figure 1) could have $3^{10} \approx 0.6 \times 10^{5}$ molecular diversity. In contrast, polymers consisting of a 10-mer of four microgenes (Figure 2) could have $(4 \times 3)^{10} \approx 6.2 \times 10^{10}$ diversity. Although it seems that random frameshifts are a rather rare event in this study, it is possible to increase the complexity by using a mixture of antisense primers (BH1-3AS and BH1-3AS+). Because we can incorporate a desired number of motifs within a library, this method, when combined with high-throughput screening, may represent a novel system for synthesizing functional proteins.
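The two diversity estimates follow from counting the choices available at each position of the polymer: the number of microgenes multiplied by the three reading frames, raised to the number of units. A minimal sketch with a hypothetical function name:

```python
# Sketch: upper bound on sequence diversity of a microgene polymer, assuming
# every unit can independently be any microgene read in any of three frames.


def polymer_diversity(n_microgenes, n_units, n_frames=3):
    """Return (n_microgenes * n_frames) ** n_units."""
    return (n_microgenes * n_frames) ** n_units


single = polymer_diversity(n_microgenes=1, n_units=10)
four = polymer_diversity(n_microgenes=4, n_units=10)
print(f"{single:.1e}", f"{four:.1e}")
```

These are upper bounds under the independence assumption; as noted in the text, frameshifts were rare in this study, so the realized diversity of a given library may be lower.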
Combinatorics of motifs in the new protocol are attributed to (i) the combinations of MPR primers used to create microgenes and (ii) polymerization of microgenes that employs illegitimate recombination by DNA polymerase (Figure 2). With the restriction on motif number removed, we made use of microgene designs in which combinations of BH4–BH1, BH4–BH2 and BH4–BH3 (pools -a, -c and -d) and combinations of BH4–BH1, BH4–BH2, BH3–BH1 and BH3–BH2 (pool-b) were used to create protein libraries (Figure 4D). This means that the polymers obtained were not the result of random shuffling of four BH motifs but of three (pools -a, -c and -d) or four (pool -b) motif heterodimers. If combinatorics of four motifs were needed, one could design microgenes so that (i) each microgene would encode one motif, or (ii) a microgene could be formed from four sense and four antisense MPR primers encoding four sense and antisense motifs, respectively. We are currently working to generate such a library. In conclusion, we have developed a new synthetic method for protein creation, in which polymers of multiple peptide motifs are combinatorially assembled. The method enables homology-independent polymerization of DNA blocks, and the frequency of each block within a library can be adjusted in a tailor-made manner. Therefore, the method represents a potential system for creating de novo folded synthetic proteins with desired functionalities.
| ACKNOWLEDGEMENTS |
|---|
We thank D.B. Murray for his critical proofreading of the manuscript. This work was supported in part by a Grant-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology. H.S. acknowledges the JSPS Research Fellowships for Young Scientists. Funding to pay the Open Access publication charge was provided by the Japan Science and Technology Agency.
Conflict of interest statement. None declared.
| Footnotes |
|---|
Present address: Hirohide Saito, Laboratory of Gene Biodynamics, Graduate School of Biostudies, Kyoto University, Oiwake-cho, Kitashirakawa, Sakyo-ku, Kyoto 606-8502, Japan