Nucleic Acids Research Advance Access originally published online on June 13, 2007
Nucleic Acids Research 2007 35(12):4195-4202; doi:10.1093/nar/gkm338
Nucleic Acids Research, 2007, Vol. 35, No. 12 4195-4202
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Computational Biology
A statistical learning approach to the modeling of chromatographic retention of oligonucleotides incorporating sequence and secondary structure data
1Simulation of Biological Systems, Eberhard Karls University Tübingen and 2Department of Chemistry, Instrumental Analysis and Bioanalysis, Saarland University, Saarbrücken, Germany
*To whom correspondence should be addressed. Tel: 0049 7071 29 70462; Fax: 0049 7071 29 5152; Email: sturm{at}informatik.uni-tuebingen.de
Received March 7, 2007. Revised April 17, 2007. Accepted April 19, 2007.
We propose a new model for predicting the retention time of oligonucleotides. The model is based on support vector regression using features derived from base sequence and predicted secondary structure of oligonucleotides. Because of the secondary structure information, the model is applicable even at relatively low temperatures where the secondary structure is not suppressed by thermal denaturing. This makes the prediction of oligonucleotide retention time for arbitrary temperatures possible, provided that the target temperature lies within the temperature range of the training data.
We describe several ways of calculating features from base sequence and secondary structure, present the results, and compare our model with existing models.
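To make the idea of sequence- and structure-derived features concrete, the following is a minimal sketch of one way such a feature vector might be assembled. It is not the feature set used in the paper, which this abstract does not specify. The function names, the choice of base and dinucleotide counts, the use of a dot-bracket string as the predicted secondary structure, and the inclusion of temperature as an extra feature are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"  # assumed DNA alphabet for this illustration


def sequence_features(seq):
    """Return 4 base counts followed by 16 dinucleotide counts."""
    seq = seq.upper()
    base_counts = Counter(seq)
    # Overlapping dinucleotides: positions (0,1), (1,2), ...
    pair_counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    features = [base_counts[b] for b in BASES]
    features += [pair_counts[a + b] for a, b in product(BASES, repeat=2)]
    return features


def structure_features(dot_bracket):
    """Return the fraction of paired positions in a dot-bracket string.

    The dot-bracket string is assumed to come from some external
    secondary structure prediction step (hypothetical input).
    """
    if not dot_bracket:
        return [0.0]
    paired = sum(ch in "()" for ch in dot_bracket)
    return [paired / len(dot_bracket)]


def feature_vector(seq, dot_bracket, temperature_c):
    """Concatenate sequence, structure, and temperature features."""
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have equal length")
    return (sequence_features(seq)
            + structure_features(dot_bracket)
            + [float(temperature_c)])


# Example: a small hairpin-like oligonucleotide at 30 degrees Celsius
vector = feature_vector("GCGCAAAAGCGC", "((((....))))", 30.0)
print(len(vector), vector)
```

Vectors of this kind, paired with measured retention times, could then serve as training input to any support vector regression implementation. Which features and kernel are appropriate is exactly the kind of question the paper investigates.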