Nucleic Acids Research Advance Access originally published online on May 25, 2007
Nucleic Acids Research 2007 35(Web Server issue):W305-W309; doi:10.1093/nar/gkm255
Nucleic Acids Research, 2007, Vol. 35, No. suppl_2 W305-W309
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
RNAbor: a web server for RNA structural neighbors
1Linnaeus Centre for Bioinformatics, Uppsala University, 75124 Uppsala, Sweden, 2School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK and 3Department of Biology, Boston College, Chestnut Hill, MA 02467, USA
*To whom correspondence should be addressed. Tel: + 1 617 552 1332; Fax: + 1 617 552 2011; Email: clote{at}bc.edu
Received January 24, 2007. Revised March 26, 2007. Accepted April 8, 2007.
| ABSTRACT |
|---|
RNAbor provides a new tool for researchers in the biological and related sciences to explore important aspects of RNA secondary structure and folding pathways. RNAbor computes statistics concerning δ-neighbors of a given input RNA sequence and structure (the structure can, for example, be the minimum free energy (MFE) structure). A δ-neighbor is a structure that differs from the input structure by exactly δ base pairs, that is, it can be obtained from the input structure by adding and/or removing exactly δ base pairs. For each distance δ, RNAbor computes the density of δ-neighbors, the number of δ-neighbors, and the MFE structure, or MFE_δ structure, among all δ-neighbors. RNAbor can be used to study possible folding pathways, to determine alternate low-energy structures, to predict potential nucleation sites and to explore structural neighbors of an intermediate, biologically active structure. The web server is available at http://bioinformatics.bc.edu/clotelab/RNAbor.
| INTRODUCTION |
|---|
RNA plays a surprising and previously unsuspected role in many biological processes, such as post-transcriptional regulation, conformational switches, expansion of the genetic code (such as selenocysteine insertion), ribosomal frameshift, metabolite-binding and chemical modification of specific nucleotides in the ribosome. Apart from its catalytic role as a ribonucleic enzyme (ribozyme) (1), RNA can regulate genes in several ways. For example, by hybridizing to a portion of messenger RNA, small RNA molecules of approximately 22 nt perform post-transcriptional gene regulation by RNA interference (RNAi), a process so important that the 2006 Nobel Prize in Physiology or Medicine was awarded to A. Z. Fire and C. C. Mello for its discovery. In addition, by very different means, RNA can perform transcriptional and translational gene regulation by allostery, where a portion of the 5' untranslated region (5' UTR) of mRNA, known as a riboswitch (2,3), undergoes a conformational change upon binding a specific ligand such as adenine, guanine or lysine. As the field of RNomics matures, many sophisticated computational tools, e.g. for RNA structure prediction, alignment and gene finding, have been developed; see (4,5) for recent overviews. Recently developed programs that are of most relevance here include the program Sfold (6,7), which computes a low energy ensemble of structures by sampling from the partition function (8), and an earlier program RNAsubopt (9), which computes all suboptimal structures within a user-specified number of kcal/mol of the minimum free energy (MFE). In addition, the program RNAshapes (10–12) provides a useful description of RNA branching structure by computing the Boltzmann probability of various shapes and also the MFE structure for various shapes. Here, an RNA shape is an equivalence class of secondary structures, describing the overall branching; for instance, the shape of a typical cloverleaf tRNA would be [ [ ] [ ] [ ] ].
In this article, we describe the web server RNAbor, which computes the Boltzmann probability of, and the MFE structure among, the structures that differ by exactly δ base pairs from a given initial structure. Unlike most of the tools just described, which focus on the MFE structure or a low energy ensemble, RNAbor yields information concerning the secondary structure folding landscape. Potential applications of RNAbor include the design of RNA aptamers (see (13) for a suggestion of how RNA might be designed to inhibit the function of viral enzymes such as HIV-1 reverse transcriptase and hepatitis C NS3 protease), detection of conformational switches, understanding the role played by biologically active structural intermediates and improvement in secondary structure prediction.
| MATERIALS AND METHODS |
|---|
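Let s = s_1, ..., s_n be an RNA sequence and S a secondary structure of s. For 0 ≤ δ ≤ n, a secondary structure S' of s is a δ-neighbor of S if the base pair distance between S and S', that is, the number of base pairs that belong to exactly one of the two structures, equals δ base pairs (14). In (Freyhult, E., Moulton, V. and Clote, P. Boltzmann probability of RNA structural neighbors and riboswitch detection, submitted for publication), we describe new algorithms, which compute the number N_δ of δ-neighbors, the partition function Z_δ for δ-neighbors and the MFE_δ, and the corresponding MFE_δ structure over all δ-neighbors of a fixed structure S.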
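The base pair distance behind the notion of a δ-neighbor can be made concrete with a short sketch. It assumes that structures are written in dot-bracket notation, a convention not spelled out in the text, and the function names are hypothetical helpers chosen for illustration only.

```python
def parse_dot_bracket(db):
    """Return the set of base pairs (i, j), 0-based with i < j."""
    stack, pairs = [], set()
    for j, ch in enumerate(db):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.add((stack.pop(), j))
    return pairs


def base_pair_distance(db1, db2):
    """Number of base pairs present in exactly one of the two structures."""
    # The symmetric difference counts pairs to remove plus pairs to add.
    return len(parse_dot_bracket(db1) ^ parse_dot_bracket(db2))


# Removing one pair from a two-pair stem gives a 1-neighbor.
print(base_pair_distance("((...))", "(.....)"))
```

In this picture, a structure is a δ-neighbor of the input structure exactly when `base_pair_distance` returns δ.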
Computing structural neighbors
To give the reader a feeling for how the algorithms work, we present the recurrence relations to compute the number N_δ of δ-neighbors of S. If, for 1 ≤ i ≤ j ≤ n, N_δ^{i,j} denotes the number of δ-neighbors of the substructure S[i,j], the restriction of S to the interval [i,j] of s, then the number of δ-neighbors of S, N_δ = N_δ^{1,n}, can be computed by the following recursion:
| (1) |
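A recursion of this kind can be sketched in code. The sketch below is an illustration under stated assumptions, not the authors' implementation: it counts structures on the interval [i, j] by distance from S[i, j], splitting on whether position j is unpaired or paired with some k. It assumes canonical pairs (AU, GC, GU), a minimum hairpin loop of three unpaired bases, 0-based indices and dot-bracket input; all names are hypothetical, and no attempt is made to reach the time and space bounds of the real algorithms.

```python
from functools import lru_cache

CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"),
            ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases in a hairpin


def parse_dot_bracket(db):
    """Return the set of base pairs (i, j), 0-based with i < j."""
    stack, pairs = [], set()
    for j, ch in enumerate(db):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.add((stack.pop(), j))
    return pairs


def count_neighbors(seq, db, max_delta=None):
    """Return a list c where c[d] is the number of d-neighbors of db."""
    n = len(seq)
    S = parse_dot_bracket(db)
    if max_delta is None:
        max_delta = n

    def restrict(i, j):
        # Base pairs of S lying entirely inside the interval [i, j].
        return {(a, b) for (a, b) in S if i <= a and b <= j}

    @lru_cache(maxsize=None)
    def N(i, j):
        counts = [0] * (max_delta + 1)
        if j - i <= MIN_LOOP:
            # Only the empty structure fits; its distance is |S[i, j]|.
            d = len(restrict(i, j))
            if d <= max_delta:
                counts[d] += 1
            return tuple(counts)
        S_ij = restrict(i, j)
        # Case 1: j is unpaired; pay 1 if S[i, j] pairs position j.
        b0 = sum(1 for (_, b) in S_ij if b == j)
        left = N(i, j - 1)
        for d in range(b0, max_delta + 1):
            counts[d] += left[d - b0]
        # Case 2: j pairs with k, splitting [i, j] into two sub-intervals.
        for k in range(i, j - MIN_LOOP):
            if (seq[k], seq[j]) not in CAN_PAIR:
                continue
            kept = restrict(i, k - 1) | restrict(k + 1, j - 1) | {(k, j)}
            bk = len(S_ij ^ kept)  # cost not covered by the sub-intervals
            A, B = N(i, k - 1), N(k + 1, j - 1)
            for w in range(max_delta - bk + 1):
                if A[w] == 0:
                    continue
                for w2 in range(max_delta - bk - w + 1):
                    counts[w + w2 + bk] += A[w] * B[w2]
        return tuple(counts)

    return list(N(0, n - 1))


# Toy example: six structures in total, counted by distance from "((...))".
print(count_neighbors("GGAAACC", "((...))"))
```

The memoized recursion trades efficiency for readability and is only meant for short toy sequences; a table-filling implementation would be the natural choice for real inputs.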
This approach for computing N_δ can be extended to compute the partition function contribution, Z_δ, of the set of δ-neighbors and also to compute the MFE_δ and the MFE_δ structure. Computations are made with respect to the Turner energy model (15,16); treatment of dangles is similar to that in the Vienna RNA Package (option -d2). The algorithms employ dynamic programming, and run in O(δ_max · n^3) time and O(δ_max · n^2) space, where n is the sequence length and δ_max is the maximum value of δ. Since δ_max can be at most n, the run time cannot be worse than O(n^4) and space no worse than O(n^3), even if the user does not specify a value of δ_max. Full details of the algorithms are given in (Freyhult, E., Moulton, V. and Clote, P. Boltzmann probability of RNA structural neighbors and riboswitch detection, submitted for publication).
Web server
The web server available at http://bioinformatics.bc.edu/clotelab/RNAbor runs on a Linux cluster with 20 computational nodes, each with double processors of between 1300 and 3000 MHz and 2 GB RAM (6 Dell PowerEdge 1650, 2 x 1300 MHz Pentium III, 2 GB RAM; 11 Dell PowerEdge 1850, 2 x 2800 + MHz Xeon EM64T, 2 GB RAM; 5 Dell PowerEdge 1850, 2 x 3000 MHz Xeon EM64T, 2 GB RAM).
| RESULTS |
|---|
Due to the time and space constraints of the algorithm, RNA sequences are limited to a length of at most 300 nucleotides. Sequences of length up to 60 are processed interactively and output is displayed in the user's browser window. For sequences of length 61–300, the computation is done off-line and the results are returned to the user by email; for this, an email address is required. The user can either paste an input sequence (with optional secondary structure), or upload a file of the same. The full input consists of up to four lines.
The temperature is set to a default value of 37 °C; however, the user can enter any integer temperature between 0 and 100 °C.
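The input rules stated in this section can be sketched as a small validator. This is only an illustration: the line order (FASTA comment, sequence, structure, bound), the dot-bracket structure format, the acceptance of lowercase letters and every name below are assumptions, and the toy input is hypothetical rather than taken from the server.

```python
import re

MAX_LENGTH = 300        # longest sequence accepted, as stated in the text
INTERACTIVE_LIMIT = 60  # longer sequences are handled off-line by email


def is_balanced(db):
    """Check that the brackets of a dot-bracket string match up."""
    depth = 0
    for ch in db:
        depth += (ch == "(") - (ch == ")")
        if depth < 0:
            return False
    return depth == 0


def parse_input(text, temperature=37):
    """Parse up to four lines: [>comment], sequence, [structure], [bound]."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    comment = None
    if lines and lines[0].startswith(">"):
        comment = lines.pop(0)[1:].strip()
    if not lines:
        raise ValueError("an RNA sequence is required")
    seq = lines.pop(0).upper()
    if not re.fullmatch(r"[ACGU]+", seq):
        raise ValueError("sequence must consist of A, C, G and U")
    n = len(seq)
    if n > MAX_LENGTH:
        raise ValueError("sequence longer than 300 nucleotides")
    structure = None
    if lines and re.fullmatch(r"[().]+", lines[0]):
        structure = lines.pop(0)
        if len(structure) != n or not is_balanced(structure):
            raise ValueError("structure does not fit the sequence")
    delta_max = n  # default when no bound is given
    if lines:
        delta_max = min(int(lines.pop(0)), n)
        if delta_max < 0:
            raise ValueError("bound must be non-negative")
    if not (isinstance(temperature, int) and 0 <= temperature <= 100):
        raise ValueError("temperature must be an integer in 0..100")
    mode = "interactive" if n <= INTERACTIVE_LIMIT else "email"
    return {"comment": comment, "sequence": seq, "structure": structure,
            "delta_max": delta_max, "temperature": temperature, "mode": mode}


# Hypothetical four-line input.
print(parse_input("> toy example\nGGGAAACCC\n(((...)))\n4\n"))
```

A missing structure is returned as `None` here; the server would then fall back to the MFE structure, which this sketch does not compute.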
The only required input is an RNA sequence s of length at most 300 nucleotides; the FASTA comment, initial secondary structure S and upper bound δ_max are optional inputs. If no secondary structure is given, then the initial structure S is taken to be the MFE structure, as computed by RNAfold -d2. If the optional input δ_max is missing, then δ_max is defined to equal the length n of the input sequence s; otherwise δ_max is the minimum of the input value and n. For each 0 ≤ δ ≤ δ_max, RNAbor computes the Boltzmann probability p_δ = Z_δ/Z, where the partition function is defined by

$$Z_\delta = \sum_{S'} \exp\bigl(-E(S')/RT\bigr),$$

the sum being taken over all δ-neighbors S' of S; here E(S') is the free energy of S', R the gas constant and T the absolute temperature. The full partition function Z is computed by McCaskill's algorithm (8) if δ_max < n.
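The relation p_δ = Z_δ/Z can be illustrated with a brute-force sketch that sums Boltzmann factors over an explicitly listed set of structures. This is not how RNAbor works, since the server uses dynamic programming and the Turner energy model; the energies below are hypothetical toy values, and the function name is made up for this example.

```python
import math
from collections import defaultdict

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def neighbor_probabilities(structures, temperature_c=37):
    """Map each distance delta to p_delta = Z_delta / Z.

    `structures` is an iterable of (delta, energy) pairs, one per
    structure, with energies in kcal/mol. Z is the sum over the listed
    structures only, so the list is assumed to be the whole ensemble.
    """
    rt = R_KCAL * (temperature_c + 273.15)
    z = defaultdict(float)
    for delta, energy in structures:
        z[delta] += math.exp(-energy / rt)  # Boltzmann factor
    total = sum(z.values())
    return {delta: z[delta] / total for delta in sorted(z)}


# Hypothetical ensemble: (distance from the input structure, energy).
toy = [(0, -3.0), (1, -2.1), (1, -1.8), (2, 0.0), (3, -2.9)]
for delta, p in neighbor_probabilities(toy).items():
    print(delta, round(p, 3))
```

Low-energy structures far from the input structure, such as the toy entry at distance 3, show up as a second peak in p_δ, which is the kind of signal discussed in the examples section.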
In addition to computing the probability p_δ, RNAbor computes the number N_δ of δ-neighbors of S, the MFE_δ over all δ-neighbors of S and the MFE_δ secondary structure. Tables of the values N_δ and p_δ, as well as their graphs, are made available as downloadable files. The five-column text file output, consisting of δ, p_δ, N_δ, MFE_δ and the MFE_δ structure, is depicted in Figure 1.
| EXAMPLES |
|---|
RNAbor can be used to generate alternative low energy structures, which differ markedly from the MFE structure, or from any initially given structure. Figure 1 shows the RNAbor output for a short 3'-UTR sequence of an mRNA with NCBI accession number MUSGBPS. The input structure in this example is the MFE structure (as predicted by RNAfold -d2). The RNAbor output indicates two ranges of δ that show higher probabilities than the rest, 0–9 and 20–24. The MFE_δ structures at distance δ between 0 and 9 from the MFE structure all have very similar folds, and the probability of finding the RNA in a structure at δ between 0 and 9 is 0.63. The probability of finding a structure at δ between 20 and 24 is also relatively high, 0.35, and the MFE_δ structures in this range are similar to each other but completely different from the MFE structure. Thus the two highly probable δ ranges represent two possible alternative folds of the RNA.
Analyzing the same sequence with Sfold gives similar results. Sfold finds three types of structures (three clusters), with probabilities 0.65, 0.22 and 0.13, respectively. One cluster contains the MFE structure corresponding to the folds at δ values from 0 to 9, another cluster has a centroid structure resembling the structures at δ between 20 and 24, and the third cluster has a centroid structure similar to the MFE_19 structure. RNAshapes, on the other hand, is less successful for this example, since the alternative folds as predicted by RNAbor have the same shape [ ], even though the folds are very different.
Figure 2 displays the MFE structure and the MFE_30 structure of the 101 nt SAM riboswitch with EMBL accession number AP004597.1/118941-119041, with sequence taken from Rfam (17). The MFE structure over all 30-neighbors, the MFE_30 structure, is clearly much closer to the real structure than the global MFE structure. Figure 3 displays the Boltzmann probability density, showing a peak for the value δ = 30.
| DISCUSSION |
|---|
In this article, we have introduced the web server RNAbor, which computes the Boltzmann probability and MFE structure over all δ-neighbors for a given RNA sequence s and initial structure S, using time O(δ_max · n^3) and space O(δ_max · n^2) resources. Figures 2 and 3 illustrate the use of RNAbor in better understanding structural aspects of a SAM riboswitch, and indicate that RNAbor should provide a useful complementary tool to programs such as Sfold and RNAshapes for analyzing the ensemble of possible secondary structures on a given RNA sequence.
| ACKNOWLEDGEMENTS |
|---|
Research of P.C. was partially supported by National Science Foundation DBI-0543506, which additionally supported some travel of E.F. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. All three authors would like to thank Elena Rivas, Eric Westhof and funding agencies for organizing the meeting RNA-2006 in Benasque, Spain, in July 2006, where some of this work was carried out. Finally, thanks to Jason Persampieri for some technical assistance. Funding to pay the Open Access publication charges for this article was provided by the National Science Foundation.
Conflict of interest statement. None declared.
| REFERENCES |
|---|
1. Doudna JA, Cech TR. The chemical repertoire of natural ribozymes. Nature (2002) 418:222–228.
2. Winkler WC, Cohen-Chalamish S, Breaker RR. An mRNA structure that controls gene expression by binding FMN. Proc. Natl Acad. Sci. USA (2002) 99:15908–15913.
3. Penchovsky R, Breaker RR. Computational design and experimental validation of oligonucleotide-sensing allosteric ribozymes. Nat. Biotechnol (2005) 23:1424–1431.
4. Mathews DH, Turner DH. Prediction of RNA secondary structure by free energy minimization. Curr. Opin. Struct. Biol (2006) 16:270–278.
5. Eddy SR. Non-coding RNA genes and the modern RNA world. Nat. Rev. Genet (2001) 2:919–929.
6. Ding Y, Lawrence CE. A statistical sampling algorithm for RNA secondary structure prediction. Nucleic Acids Res (2003) 31:7280–7301.
7. Ding Y, Chan CY, Lawrence CE. RNA secondary structure prediction by centroids in a Boltzmann weighted ensemble. RNA (2005) 11:1157–1166.
8. McCaskill JS. The equilibrium partition function and base pair binding probabilities for RNA secondary structures. Biopolymers (1990) 29:1105–1119.
9. Wuchty S, Fontana W, Hofacker IL, Schuster P. Complete suboptimal folding of RNA and the stability of secondary structures. Biopolymers (1999) 49:145–164.
10. Giegerich R, Voss B, Rehmsmeier M. Abstract shapes of RNA. Nucleic Acids Res (2004) 32:4843–4851.
11. Steffen P, Voss B, Rehmsmeier M, Reeder J, Giegerich R. RNAshapes: an integrated RNA analysis package based on abstract shapes. Bioinformatics (2006) 22:500–503.
12. Voss B, Giegerich R, Rehmsmeier M. Complete probabilistic analysis of RNA shapes. BMC Biol (2006) 4:5.
13. James W. Nucleic acid and polypeptide aptamers: a powerful approach to ligand discovery. Curr. Opin. Pharmacol (2001) 1:540–548.
14. Moulton V, Zuker M, Steel M, Pointon R, Penny D. Metrics on RNA secondary structures. J. Comput. Biol (2000) 7:277–292.
15. Mathews DH, Sabina J, Zuker M, Turner DH. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol (1999) 288:911–940.
16. Xia T, SantaLucia J Jr, Burkard ME, Kierzek R, Schroeder SJ, Jiao X, Cox C, Turner DH. Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry (1998) 37:14719–14735.
17. Griffiths-Jones S, Bateman A, Marshall M, Khanna A, Eddy SR. Rfam: an RNA family database. Nucleic Acids Res (2003) 31:439–441.