Nucleic Acids Research, 2002, Vol. 30, No. 1 383-384
© 2002 Oxford University Press
InBase: the Intein Database
New England Biolabs Inc., 32 Tozer Road, Beverly, MA 01915, USA
Received August 31, 2001; Accepted September 18, 2001.
| ABSTRACT |
|---|
Inteins are self-catalytic protein splicing elements. InBase (http://www.neb.com/neb/inteins.html), the Intein Database and Registry, is a curated compilation of published and unpublished information about protein splicing. It presents general information as well as detailed data for each intein, including tabulated comparisons and a comprehensive bibliography. An intein-specific BLAST server is now available to assist in identifying new inteins.
| INTRODUCTION |
|---|
Inteins are in-frame intervening sequences that disrupt a host gene and its protein product; the host gene and its protein product are called exteins (1). Inteins are post-translationally excised from a protein precursor by a self-catalytic protein splicing mechanism (2,3). Consequently, two or more stable proteins [the intein(s) and the extein] are produced from a single gene. Extein ligation forming a native peptide bond and the presence of conserved intein motifs differentiate intein-mediated protein splicing from other post-translational processing events. Most inteins are bifunctional proteins with separate structural domains responsible for protein splicing and homing endonuclease activities. Over 75% of currently registered inteins contain either a DOD or H-N-H class homing endonuclease domain (4). Homing endonucleases initiate mobilization of the intein gene into the same site in a homologous host gene lacking the intervening sequence (5).
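As a toy illustration of the outcome described above (one precursor yielding an excised intein plus a ligated extein), the following Python sketch treats splicing as a simple string operation. The sequence, the coordinates and the function name `splice` are invented for illustration; real protein splicing is a multi-step chemical reaction, not a substring cut.

```python
from typing import NamedTuple


class SpliceProducts(NamedTuple):
    extein: str   # ligated N- and C-terminal extein segments
    intein: str   # excised intervening sequence


def splice(precursor: str, intein_start: int, intein_end: int) -> SpliceProducts:
    """Excise precursor[intein_start:intein_end] and join the flanking exteins.

    Coordinates are 0-based and end-exclusive (a convention chosen for this sketch).
    """
    if not 0 <= intein_start < intein_end <= len(precursor):
        raise ValueError("intein coordinates fall outside the precursor")
    n_extein = precursor[:intein_start]
    c_extein = precursor[intein_end:]
    return SpliceProducts(extein=n_extein + c_extein,
                          intein=precursor[intein_start:intein_end])


# Hypothetical precursor: lowercase = extein, uppercase = intein.
products = splice("mkvlgCLAEGTHNsrte", 5, 13)
print(products.extein)  # mkvlgsrte
print(products.intein)  # CLAEGTHN
```

The two return values correspond to the two stable proteins produced from a single gene.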
There are >115 inteins registered in InBase from Archaea, Bacteria and Eukarya. InBase (http://www.neb.com/neb/inteins.html) compiles information about inteins, often submitted by researchers prior to publication. Several subsets of data are tabulated for easy comparison including organism data, motif sequences, proximal insertion site sequences, selected properties and intein alleles (inteins present at the same insertion site in homologous genes from different species).
| NEW DEVELOPMENTS |
|---|
InBase has expanded since the 2000 NAR Database Issue (6). The InBase home page has been reorganized to move background information into a separate section called Intein Basics. This section is suitable for students and contains PDF files of protein splicing figures that users can download. An intein-specific BLAST server is now available, and the amino acid sequence of each intein is now included on its individual page. The Identifying Inteins section reflects the growing list of intein polymorphisms. A new splicing mechanism for inteins lacking an N-terminal nucleophile has been included in the Splicing Mechanism section (7).
The InBase BLAST server was added in response to requests from researchers and genome sequencing groups, since intein identification is not always trivial. Searches of general sequence databases often yield hits with very low scores and poor P-values due to (i) the small size of mini-inteins (as few as 134 amino acids), (ii) the low level of sequence similarity, even in conserved motifs, and (iii) the high degree of polymorphism in conserved splice junction residues. By limiting the search to intein sequences, significant scores and P-values can be obtained. Since inteins are predominantly found in extein active sites and in cofactor or substrate binding pockets, identifying an intein can potentially help locate these elements in uncharacterized proteins.
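The effect of restricting the search space can be illustrated with standard BLAST statistics, in which the expected number of chance hits is E = m·n·2^(−S′) for a query of length m, a database of n residues and a bit score S′, and P = 1 − e^(−E). The sketch below is only an illustration: the database sizes, query length and bit score are invented round numbers, not InBase or general-database figures, and edge-effect corrections are ignored.

```python
import math


def expect_value(bit_score: float, query_len: int, db_residues: int) -> float:
    """Karlin-Altschul expectation: E = m * n * 2**(-S')."""
    return query_len * db_residues * 2.0 ** (-bit_score)


def p_value(e_value: float) -> float:
    """Probability of at least one chance hit with this score: P = 1 - exp(-E)."""
    return -math.expm1(-e_value)


QUERY_LEN = 150            # roughly mini-intein sized (assumed)
BIT_SCORE = 35.0           # a weak hit (assumed)
GENERAL_DB = 500_000_000   # residues in a general database (hypothetical)
INTEIN_DB = 50_000         # residues in an intein-only database (hypothetical)

for name, size in [("general", GENERAL_DB), ("intein-only", INTEIN_DB)]:
    e = expect_value(BIT_SCORE, QUERY_LEN, size)
    print(f"{name:12s} E = {e:.3g}  P = {p_value(e):.3g}")
```

Because E scales linearly with database size, the same alignment score that is indistinguishable from chance in a large general database can become significant when only intein sequences are searched.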
Several important advances have been reported in the protein splicing field. Most notable was the discovery of a non-canonical protein splicing pathway for inteins beginning with Ala instead of Ser, Thr or Cys (7). The first crystal structure of an intein precursor shed light on the amino acids that assist catalysis and on the need for conformational changes to align nucleophiles during the sequential steps in the protein splicing pathway (8). Numerous protein engineering applications take advantage of the C-terminal α-thioester formed on target proteins purified from intein vectors (2,3,9–11), such as the commercially available IMPACT™ system (NEB). Papers using intein vectors for protein purification are not included in the bibliography unless they add to our understanding of intein technologies. A green (APP) label in the bibliography section highlights application papers.
| ORGANIZATION OF THE DATABASE |
|---|
Since few textbooks cover protein splicing, InBase provides background material suitable for classroom use. At the same time, detailed information is presented in a layered format with general discussions and tables pointing to more specific data. The InBase home page lists the accessible sections in the database:
1. Intein Basics
2. The Mechanism of Protein Splicing
3. The Intein Registry
4. Intein Motifs
5. Identifying Inteins
   A. Conserved Intein Features
   B. BLAST the InBase Sequence Database
6. Online Submission of Intein Data
7. The Intein Bibliography
8. Intein Links
The Intein Registry (section 3A) lists all known inteins sorted by Domain of Life, genus and species of the host organism, while section 3B sorts inteins by extein insertion site. Individual intein records contain detailed information about each intein, including insertion site sequence data, comments on unusual properties, submitter contact information and a reference list for each intein. Section 5 describes the criteria for intein identification, including a description of conserved motifs and polymorphisms. Intein data can be submitted confidentially or for immediate release using the online submission form or by email. References throughout the database are linked to the Bibliography section. The bibliography includes annotations for reviews, application papers, related papers and recent papers. PubMed hot links allow the reader to retrieve abstracts from the National Library of Medicine.
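The two registry views described above amount to ordering one set of records by different keys. A minimal sketch follows, assuming a hypothetical record layout: the `InteinRecord` class, its field names and the sample entries are all invented for illustration and do not reflect the actual InBase schema or real inteins.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InteinRecord:
    name: str            # intein name
    domain: str          # Archaea, Bacteria or Eukarya
    genus: str
    species: str
    insertion_site: str  # label for the extein insertion site


# Placeholder entries, invented for this example.
RECORDS = [
    InteinRecord("Ebb GeneY", "Bacteria", "Examplebacter", "beta", "GeneY-a"),
    InteinRecord("Eal PolX", "Archaea", "Examplococcus", "alpha", "PolX-a"),
    InteinRecord("Ega PolX", "Archaea", "Examplococcus", "gamma", "PolX-a"),
]


def by_organism(records):
    """Sort by Domain of Life, then genus, then species."""
    return sorted(records, key=lambda r: (r.domain, r.genus, r.species))


def by_insertion_site(records):
    """Group intein names by insertion site; inteins at the same site in
    homologous genes from different species are intein alleles."""
    groups = defaultdict(list)
    for r in records:
        groups[r.insertion_site].append(r.name)
    return dict(groups)


for rec in by_organism(RECORDS):
    print(rec.domain, rec.genus, rec.species, rec.name)
print(by_insertion_site(RECORDS))
```

Keeping a single record type and deriving both views from it avoids duplicating data between the organism-sorted and insertion-site-sorted listings.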
| DATABASE AVAILABILITY AND CITATION |
|---|
InBase can be found by clicking the Technical Resource button on the New England Biolabs Home Page (http://www.neb.com) or directly at http://www.neb.com/neb/inteins.html. Users of InBase are requested to cite this article when referencing the database.
| ACKNOWLEDGEMENTS |
|---|
I am grateful to my co-workers at NEB, Ellen M. Zaglakas and Ching Lin, for help in maintaining InBase, to Janos Posfai and Tamas Vincze for developing and maintaining the InBase BLAST server, and to all the intein workers who have submitted their published and unpublished data, especially Shmuel Pietrokovski.
| FOOTNOTES |
|---|
* Tel: +1 978 927 5054; Fax: +1 978 921 1350; Email: perler@neb.com
| REFERENCES |
|---|
1 Perler,F.B., Davis,E.O., Dean,G.E., Gimble,F.S., Jack,W.E., Neff,N., Noren,C.J., Thorner,J. and Belfort,M. (1994) Protein splicing elements: inteins and exteins – a definition of terms and recommended nomenclature. Nucleic Acids Res., 22, 1125–1127.
2 Noren,C.J., Wang,J. and Perler,F.B. (2000) Dissecting the chemistry of protein splicing and its applications. Angew. Chem. Int. Ed., 39, 450–466.
3 Paulus,H. (2001) Inteins as enzymes. Bioorg. Chem., 29, 119–129.
4 Belfort,M. and Roberts,R.J. (1997) Homing endonucleases: keeping the house in order. Nucleic Acids Res., 25, 3379–3388.
5 Gimble,F.S. and Thorner,J. (1992) Homing of a DNA endonuclease gene by meiotic gene conversion in Saccharomyces cerevisiae. Nature, 357, 301–306.
6 Perler,F.B. (2000) InBase, the Intein Database. Nucleic Acids Res., 28, 344–345.
7 Southworth,M.W., Benner,J. and Perler,F.B. (2000) An alternative protein splicing mechanism for inteins lacking an N-terminal nucleophile. EMBO J., 19, 5019–5026.
8 Poland,B.W., Xu,M.Q. and Quiocho,F.A. (2000) Structural insights into the protein splicing mechanism of PI-SceI. J. Biol. Chem., 275, 16408–16413.
9 Blaschke,U.K., Silberstein,J. and Muir,T.W. (2000) Protein engineering by expressed protein ligation. Methods Enzymol., 328, 478–496.
10 de Grey,A.D. (2000) Mitochondrial gene therapy: an arena for the biomedical use of inteins. Trends Biotechnol., 18, 394–399.
11 Perler,F.B. and Adam,E. (2000) Protein splicing and its applications. Curr. Opin. Biotechnol., 11, 377–383.