Nucleic Acids Research, 1990, Vol. 18, No. 16 4797-4801
© 1990


MOLECULAR BIOLOGY

Neural network detects errors in the assignment of mRNA splice sites

Søren Brunak, Jacob Engelbrecht (1) and Steen Knudsen (2,*)

Department of Structural Properties of Materials, The Technical University of Denmark, DK-2800 Lyngby
(1) Department of Dairy & Food Science, Royal Veterinary and Agricultural University, Bülowsvej 13, DK-1870 Frederiksberg C
(2) University Institute of Microbiology, Øster Farimagsgade 2A, DK-1353 Copenhagen C, Denmark

Received April 24, 1990. Revised July 25, 1990. Accepted July 25, 1990.

The use of databanks in genetic research assumes that the information they contain is reliable. Currently, error detection in the manually or electronically entered data held in the nucleotide sequence databanks at EMBL, Heidelberg, and GenBank at Los Alamos is limited. We have used a subset of sequences from these databanks to train neural networks to recognize pre-mRNA splicing signals in human genes. During training on 33 human genes from the EMBL databank, seven genes appeared to disturb the learning process. Subsequent investigation revealed discrepancies from the original published papers for three genes. In four genes, we found wrongly assigned splicing frames of introns. We believe this reflects the fact that splicing frames cannot always be unambiguously assigned on the basis of experimental data. Thus incorrect assignments arise both from mere typographical misprints and from erroneous interpretation of experiments. Training on 241 human sequences from GenBank revealed nine new errors. We propose that such errors could be detected by computer algorithms designed to check the consistency of data prior to their incorporation in databanks.
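The closing proposal, a consistency check run before data enter a databank, can be illustrated with a much simpler rule than the neural networks used in the paper. The sketch below is not the authors' method. It assumes the well-known GT...AG convention for intron boundaries and flags annotated introns that violate it, which is one way a shifted or mistyped splice site assignment might show up. The function name `find_suspect_introns` and the 0-based, end-exclusive coordinate layout are hypothetical choices made for this example.

```python
from typing import List, Tuple

# Hypothetical record layout (an assumption for this sketch):
# each intron is (start, end) in 0-based, end-exclusive coordinates.
Intron = Tuple[int, int]


def find_suspect_introns(sequence: str, introns: List[Intron]) -> List[Tuple[Intron, str]]:
    """Return annotated introns that fail simple consistency checks.

    This is a rule-based stand-in, not a neural network: it only tests
    whether each intron starts with GT and ends with AG.
    """
    seq = sequence.upper()
    suspects = []
    for start, end in introns:
        # Reject coordinates that fall outside the sequence or leave
        # no room for both boundary dinucleotides.
        if not (0 <= start < end <= len(seq)) or end - start < 4:
            suspects.append(((start, end), "coordinates out of range or intron too short"))
            continue
        donor = seq[start:start + 2]      # first two bases of the intron
        acceptor = seq[end - 2:end]       # last two bases of the intron
        if donor != "GT" or acceptor != "AG":
            suspects.append(((start, end), f"non-canonical boundaries {donor}...{acceptor}"))
    return suspects


if __name__ == "__main__":
    toy_sequence = "ATGGTAAACAGCC"   # made-up sequence, for illustration only
    annotations = [(3, 11), (4, 12)]  # the second entry is shifted by one base
    for intron, reason in find_suspect_introns(toy_sequence, annotations):
        print(intron, reason)
```

A flagged entry is only a candidate for manual review: a minority of genuine introns do not follow the GT...AG pattern, so a check of this kind can point to possible errors but cannot confirm them.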


*Present address: Molecular Biology Computer Research Resource, Dana-Farber Cancer Institute, Harvard School of Public Health, 44 Binney St, Boston, MA 02115, USA

