Nucleic Acids Research, 2000, Vol. 28, No. 1 185-190
© 2000 Oxford University Press
EID: the Exon-Intron Database, an exhaustive database of protein-coding intron-containing genes
Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
To aid studies of molecular evolution and to assist in gene prediction research, we have constructed an Exon-Intron Database (EID) in FASTA format. Currently, the database is derived from GenBank release 112, and it contains 51 289 protein-coding genes (287 209 exons) that harbor introns, along with extensive descriptions of each gene and its DNA and protein sequences, as well as splice motif information. There is 17% redundancy inherited from GenBank; a purge at the 99% identity level reduced the database to 42 460 genes (243 589 exons). We have created subdatabases of genes whose intron positions have been experimentally determined. One such database, constructed by comparing genomic and mRNA sequences, contains 11 242 genes (62 474 exons). A larger database of 22 196 genes (105 595 exons) was constructed by selecting on keywords to eliminate computer-predicted genes. By examining the two nucleotides adjacent to the intron boundary, we infer that there is a 2% rate of errors or other deviations from the standard GT...AG motif in nuclear genes. This criterion can be used to eliminate 4921 genes from the overall database. Various tools are provided to enable generation of user-specific subsets of the EID. The EID distribution can be obtained from http://mcb.harvard.edu/gilbert/EID
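The two checks described in the abstract, the GT...AG boundary test and the redundancy figure, can be illustrated with a minimal Python sketch. This is not the EID tooling itself: the function names (`has_standard_motif`, `nonstandard_fraction`, `redundancy_fraction`) are hypothetical, and the sketch assumes each intron is available as a plain nucleotide string, which is an assumption about the input rather than a description of the EID FASTA layout.

```python
from typing import Iterable


def has_standard_motif(intron: str) -> bool:
    """Return True if the intron starts with GT and ends with AG.

    Assumes `intron` is the intron sequence alone, 5' to 3'
    (hypothetical input format, not the EID record layout).
    """
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def nonstandard_fraction(introns: Iterable[str]) -> float:
    """Fraction of introns that deviate from the GT...AG motif."""
    seqs = list(introns)
    if not seqs:
        return 0.0
    deviant = sum(1 for s in seqs if not has_standard_motif(s))
    return deviant / len(seqs)


def redundancy_fraction(total_genes: int, purged_genes: int) -> float:
    """Share of entries removed by a redundancy purge."""
    return (total_genes - purged_genes) / total_genes


if __name__ == "__main__":
    # Toy introns for illustration only; these are not EID data.
    toy = ["GTAAGTCTTTCAG", "gtgagtttccag", "GCAAGTCTTTCAG", "ATATCCTTTTAC"]
    print(f"non-standard fraction: {nonstandard_fraction(toy):.2f}")

    # Gene counts quoted in the abstract: 51 289 before and 42 460 after the purge.
    print(f"redundancy: {redundancy_fraction(51289, 42460):.1%}")
```

With the gene counts quoted above, `redundancy_fraction(51289, 42460)` comes to roughly 0.17, consistent with the 17% redundancy stated in the abstract. A gene-level filter like the one the abstract describes would drop any gene that has at least one intron failing `has_standard_motif`.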
* To whom correspondence should be addressed. Tel: +1 617 495 0760; Fax: +1 617 496 4313; Email: gilbert@nucleus.harvard.edu