Nucleic Acids Research Advance Access originally published online on March 2, 2009
Nucleic Acids Research 2009 37(8):2461-2470; doi:10.1093/nar/gkp093
© 2009 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
RNA
Meta-analysis of small RNA-sequencing errors reveals ubiquitous post-transcriptional RNA modifications
1 Department of Biochemistry, University of Alberta, Edmonton, AB, T6G 2H7, 2 School of Computing Science, Simon Fraser University, Surrey, BC, V3T 0A3 and 3 Department of Computing Science, University of Alberta, Edmonton, AB, T6G 2E8, Canada
*To whom correspondence should be addressed. Tel: +1 780 492 2410; Fax: +1 780 492 0886; Email: ebhardt{at}ualberta.ca
Received January 6, 2009. Revised February 5, 2009. Accepted February 5, 2009.
Recent advances in DNA-sequencing technology have made it possible to obtain large datasets of small RNA sequences. Here we demonstrate that imperfectly matched small RNA sequences are not all simple technical sequencing errors; many hold valuable biological information. Analysis of three small RNA datasets originating from Oryza sativa and Arabidopsis thaliana small RNA-sequencing projects demonstrates that many single nucleotide substitution errors overlap when homologous non-identical small RNA sequences are aligned. Investigating the sites and identities of these substitution errors reveals that many potentially originate from post-transcriptional modifications or RNA editing. Modifications include N1-methyl modified purine nucleotides in tRNA, potential deamination or base substitutions in micro RNAs, 3' micro RNA uridine extensions and 5' micro RNA deletions. Additionally, further analysis of large sequencing datasets reveals that the combined effects of 5' deletions and 3' uridine extensions can alter the specificity with which micro RNAs associate with different Argonaute proteins. Hence, we demonstrate that not all sequencing errors in small RNA datasets are technical artifacts, and that they often provide valuable biological insights into the sites of post-transcriptional RNA modifications.
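The central idea, that substitution "errors" recurring at the same site across independent reads are more likely to be biological than technical, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `min_count` threshold, and the 20-nt toy reference are all hypothetical, and the reads are assumed to be already aligned to the reference at full length with no insertions or deletions.

```python
from collections import Counter

def single_substitution(read, reference):
    """Return (position, ref_base, read_base) when the read differs from the
    reference by exactly one substitution; otherwise return None."""
    if len(read) != len(reference):
        return None  # length variants (deletions, extensions) are ignored here
    diffs = [(i, r, q) for i, (r, q) in enumerate(zip(reference, read)) if r != q]
    return diffs[0] if len(diffs) == 1 else None

def recurrent_substitutions(reads, reference, min_count=2):
    """Count single-substitution sites and keep those seen at least
    min_count times (a hypothetical, illustrative threshold)."""
    counts = Counter()
    for read in reads:
        site = single_substitution(read, reference)
        if site is not None:
            counts[site] += 1
    return {site: n for site, n in counts.items() if n >= min_count}

# Toy data, invented for illustration only (positions are 0-based).
reference = "GCAUUGACCGUAAGCUUAGC"
reads = [
    "GCAUUGACCGUAAGCUUAGC",  # perfect match
    "GCAUUGGCCGUAAGCUUAGC",  # A->G at position 6
    "GCAUUGGCCGUAAGCUUAGC",  # A->G at position 6
    "GCAUUGGCCGUAAGCUUAGC",  # A->G at position 6
    "GCAUUGACCGUACGCUUAGC",  # A->C at position 12, seen once
    "ACAUUGACCGUAAGCUUAGA",  # two mismatches, skipped
]

recurrent = recurrent_substitutions(reads, reference)
```

In this toy set only the A-to-G change at position 6 recurs, so it is the only site retained; the singleton at position 12 is treated as a candidate technical error. A real analysis would also need to account for sequencing depth and base-quality scores, which this sketch leaves out.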