Nucleic Acids Research Advance Access published online on June 13, 2007
Nucleic Acids Research, doi:10.1093/nar/gkm338
Computational Biology
A statistical learning approach to the modeling of chromatographic retention of oligonucleotides incorporating sequence and secondary structure data
1 Simulation of Biological Systems, Eberhard Karls University Tübingen and 2 Department of Chemistry, Instrumental Analysis and Bioanalysis, Saarland University, Saarbrücken, Germany
*To whom correspondence should be addressed. Tel: 0049 7071 29 70462; Fax: 0049 7071 29 5152; Email: sturm{at}informatik.uni-tuebingen.de
Received March 7, 2007. Revised April 17, 2007. Accepted April 19, 2007.
We propose a new model for predicting the retention time of oligonucleotides. The model is based on support vector regression using features derived from the base sequence and the predicted secondary structure of oligonucleotides. Because it incorporates secondary structure information, the model is applicable even at relatively low temperatures, where the secondary structure is not suppressed by thermal denaturation. This makes it possible to predict oligonucleotide retention times at arbitrary temperatures, provided that the target temperature lies within the temperature range of the training data.
We describe several ways of calculating features from base sequence and secondary structure, present the results, and compare our model with existing models.
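To make the idea of sequence- and structure-derived features concrete, the following is a minimal sketch of one possible encoding. It is not the feature set used in this work: the function names, the choice of base and dinucleotide counts, the paired-base fraction computed from a dot-bracket string, and the use of temperature as an extra input are all assumptions made for illustration. The resulting vector is the kind of input a support vector regression implementation could be trained on.

```python
from itertools import product

BASES = "ACGT"
# All 16 dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=2)]


def sequence_features(seq):
    """Base counts followed by overlapping dinucleotide counts (hypothetical encoding)."""
    seq = seq.upper()
    base_counts = [seq.count(b) for b in BASES]
    # Build the list of overlapping pairs so that e.g. "AAA" yields two "AA".
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    dinucleotide_counts = [pairs.count(d) for d in DINUCLEOTIDES]
    return base_counts + dinucleotide_counts


def structure_features(dot_bracket):
    """Fraction of bases that are paired in a dot-bracket structure string."""
    n = len(dot_bracket)
    paired = sum(ch in "()" for ch in dot_bracket)
    return [paired / n if n else 0.0]


def feature_vector(seq, dot_bracket, temperature_c):
    """Concatenate sequence features, structure features and temperature.

    The dot-bracket string is assumed to come from an external secondary
    structure predictor run at the same temperature.
    """
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have the same length")
    return (sequence_features(seq)
            + structure_features(dot_bracket)
            + [float(temperature_c)])


if __name__ == "__main__":
    # A made-up hairpin-forming 12-mer with a 4-base stem.
    print(feature_vector("GCGCAAAAGCGC", "((((....))))", 30.0))
```

Under these assumptions, two oligonucleotides with identical base composition but different predicted folding receive different feature vectors, which is what allows a regression model to distinguish them at temperatures where the structure is not denatured.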