Nucleic Acids Research
Modelling the secondary structures of slippage-prone hypervariable RNA regions: the example of the tiger beetle 18S rRNA variable region V4
Introduction
Materials And Methods
DNA sequences and sequence alignment
Secondary structure modelling
Analysis of mutational patterns
Statistical analysis of changes
Results
Analysis of full compensatory mutations
Statistical analysis of patterns of nucleotide change
Discussion
Secondary structure model
Statistical analysis
Acknowledgements
References
ABSTRACT
INTRODUCTION
One of the great successes of modern theoretical biology has been the use of base pair covariation in the modelling of RNA secondary structure (1). The paradigm of covariation analysis is that there must be a strict relationship between bases that pair with one another within an RNA molecule, so that base pairing between the equivalent sites in a multiple alignment of sequences is always conserved. If a base at one of the pairing positions undergoes a transversion, for example from G to C, a corresponding transversion from C to G must take place to preserve the interaction. If a large enough sample of sequences is available, it will be possible to distinguish chance concerted base changes from true covariation, and the secondary structure of a molecule will emerge. Under this model, only positive replacement of a base pair, not conservation of sequence without change, is treated as informative. Because of this, the method requires a set of sequences that are sufficiently evolutionarily distant from one another to include numerous base changes at all positions within a structure. This approach has produced models of large- and small-subunit ribosomal RNAs consistent with experimental probing of the native conformation (1) and is generally regarded as being a more successful approach to modelling RNA secondary structure than the alternative method of predicting RNA secondary structure on the basis of energy minimization.
A large set of unambiguously aligned sequences is essential for successful modelling of RNA structure by covariation. However, some kinds of study may generate datasets for which either too few sequences are available for covariation to be established unambiguously or for which alignment is ambiguous, for example if sequences have undergone length variation during evolution as a result of the action of replication slippage. It may nevertheless be important to define a secondary structure model of the region under study, for example as an aid to alignment for phylogenetic analysis (2), investigating probabilities of nucleotide substitutions (3) or if the aim is to investigate the secondary structure itself either with respect to its functional role in experimental systems or from an evolutionary viewpoint. This could apply to any rapidly evolving nucleic acid sequence that adopts a secondary structure [for example the product of the mammalian X-inactivation gene Xist (4), mitochondrial control regions (5) and viral genomes (6)], but it is likely to be a particular problem in eukaryotic ribosomal RNAs, which in many cases contain long regions (known as expansion segments or variable regions) that appear to have evolved by slippage, are not unambiguously homologous between species (7-9), but which have been used in phylogenetic reconstruction because their rapid evolution provides high resolution (9).
Here, we evaluate an approach to modelling secondary structures of length-variable regions in a phylogenetically limited dataset which does not rely on a pre-existing unambiguous alignment. We do this by making use of the secondary structure prediction program MFOLD (10-13) to identify elements of secondary structure that occupy homologous regions in a set of variable region sequences. Potential structures able to form in a majority of sequences are analysed for base covariation in the normal way, i.e. by identifying compensatory changes within the sequence, and by using a statistical approach to analyse semi-compensatory mutations [defined as mutations that transform a full Watson-Crick base pair (A-U or G-C) into a wobble pair (G.U), or vice versa, by a single point mutation]. This allows for occasional non-pairing combinations of bases without setting pre-determined limits for their frequency. We examine the procedure's utility on a set of sequences of variable region V4 from the small subunit ribosomal RNAs (SSU rRNAs) of tiger beetles (Coleoptera: Cicindelidae) and relatives (9). This variable region has the advantage that it contains regions of high sequence conservation, for which a detailed secondary structure model has been proposed (14,15), interspersed with highly variable regions. This allows us to test the ability of our method to recover well-tested elements of secondary structure while at the same time investigating the more variable parts of V4 for the presence of potentially conserved structures, such as a long variable stem that we have suggested previously might form in the variable central region of V4 (9).
MATERIALS AND METHODS
DNA sequences and sequence alignment
The DNA sequences used for structure modelling are those obtained previously for taxonomic purposes (9) plus sequences for Drosophila melanogaster (16) and Tenebrio molitor (17). The alignment used is essentially identical to figure 1 of ref. 9 except in the region of helix IX, where it has been modified to be consistent with conservation of this stem in other species. The alignment contains 32 sequences and is 526 characters long.
Secondary structure modelling
Preliminary modelling by energy minimization was carried out using the MFOLD program of Zuker and Jaeger (10-13) running under version 8.1 of the GCG package (18) on the MRC Human Genome Mapping Project computer, Cambridge, UK.
Complete variable region sequences were passed through the program and the n structures falling within a window of stability determined automatically by the program were plotted out using the GCG program PLOTFOLD. Parameters used for PLOTFOLD analysis were: maximum size and lopsidedness of internal loop, 30; energy increment, 2.0; window size, 3. Optimal and sub-optimal structures were generated for 29 of the 32 species. Three sequences (from Cicindela repanda, Megacephala klugi and Neocollyris sp.) could not be modelled as their V4 sequences were incomplete, although they could be incorporated into analyses of compensatory mutation (see below).
The individual structural elements found in these global structures were identified and compiled in separate alignments for each structure to confirm their homology and structural similarity. The number of species in which each structure element appeared was determined and the most common elements (i.e. those found in most species) subjected to covariation analysis.
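The tallying step described above can be illustrated with a minimal sketch. It assumes that each species' predictions have already been reduced to a set of element labels; the function name, species names and labels below are hypothetical and are not taken from the original analysis.

```python
from collections import Counter


def common_elements(predictions, min_species):
    """Return elements found in at least `min_species` species, with counts.

    `predictions` maps a species name to the set of structural elements
    seen in any of its optimal or sub-optimal folds (hypothetical format).
    """
    counts = Counter()
    for elements in predictions.values():
        counts.update(set(elements))  # count each element once per species
    return {name: n for name, n in counts.items() if n >= min_species}


# Hypothetical toy input: three species with made-up element labels.
toy = {
    "species_A": {"stem_a", "stem_b"},
    "species_B": {"stem_a", "stem_c"},
    "species_C": {"stem_a", "stem_b"},
}
majority = len(toy) // 2 + 1  # smallest count that is more than half
print(common_elements(toy, majority))
```

With 29 modelled sequences, the same majority rule gives a threshold of 15 species.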
Figure 1. Numbers of stem structures appearing in MFOLD analyses in different numbers of species. Vertical axis, frequency; horizontal axis, number of species.
Analysis of mutational patterns
Because we have a phylogenetic tree for these species (19; Fig. 2), we were able to map base changes seen within putative secondary structures onto the tree and, in most cases, to ascribe directionality to them. We defined nine kinds of change between pairs of bases opposed within secondary structure models, corresponding to changes among three classes of opposed bases: Watson-Crick paired bases, wobble (G.U) base pairs and unpaired pairs of bases. Changes of state were identified using MacClade (version 3.04; 20). Base pairings were coded as unordered multistate character states: codes 1-4 corresponded to Watson-Crick pairs (A-U, U-A, C-G and G-C, respectively), and 5 and 6 to wobble pairs (G.U and U.G, respectively). The remaining four states for character coding available in MacClade were used for non-pairing combinations on an ad hoc basis for each stem. For the subsequent statistical analysis, changes between pairs whose direction was unambiguous in the context of the phylogenetic tree (19; Fig. 2) were then counted. To minimise effects of double mutational hits, D.melanogaster and T.molitor sequences were not included in the statistical analysis because of their great evolutionary distances from the remainder of the taxa considered.
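The character coding and the nine kinds of change can be sketched as follows. This is an illustration only, not the MacClade procedure itself: the function names are hypothetical, and it is assumed that the first base of each pair is the one on the 5′ strand. Non-pairing combinations, which were coded ad hoc for each stem, are returned here as `None`.

```python
from itertools import product

# State codes as given in the text: 1-4 Watson-Crick, 5-6 wobble.
WATSON_CRICK = {("A", "U"): 1, ("U", "A"): 2, ("C", "G"): 3, ("G", "C"): 4}
WOBBLE = {("G", "U"): 5, ("U", "G"): 6}


def pair_code(pair):
    """Return the state code 1-6, or None for a non-pairing combination."""
    return WATSON_CRICK.get(pair) or WOBBLE.get(pair)


def pair_class(pair):
    """Classify two opposed bases as watson-crick, wobble or unpaired."""
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "unpaired"


def change_class(before, after):
    """Describe a change of state as (class before, class after)."""
    return (pair_class(before), pair_class(after))


# Enumerating every change between distinct pairs recovers the nine kinds.
ALL_PAIRS = list(product("ACGU", repeat=2))
kinds = {change_class(a, b) for a in ALL_PAIRS for b in ALL_PAIRS if a != b}
print(len(kinds))  # 9
print(change_class(("G", "C"), ("G", "U")))  # a semi-compensatory change
```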
Statistical analysis of changes
Frequencies of different classes of change were compared with expectations using the χ² test (see legend to Table 3 for details). Expected values were calculated based on the same total number of changes and using three models for the frequencies of change. Under the first model used (M1), all changes at covarying sites were assumed to be the result of a single base change in one of the covarying partners. Under this model, a Watson-Crick base pair can undergo 12 possible changes by single mutations, of which 10 result in non-compensatory changes and two give rise to G.U pairs (P = 0.833 and 0.167, respectively). Similarly, G.U pairs can give rise to six products, of which two are Watson-Crick pairs and four are unpaired (P = 0.333 and 0.667, respectively). The second model (M2) was a modification of the first model which assumed that transitions were twice as likely to occur as transversions. Under this model a Watson-Crick pair has a 4/16 (P = 0.25) probability of changing to a wobble pair and a 0.75 probability of giving rise to an unpaired combination. Wobble pairs give rise to Watson-Crick and unpaired combinations at probabilities of 0.357 and 0.643, respectively.
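The M1 proportions quoted above can be checked by direct enumeration. The sketch below is an illustration under the stated single-substitution assumption and covers M1 only; the function names are hypothetical. The 12 changes for Watson-Crick pairs are obtained by pooling the six single-base products of A-U and of G-C.

```python
from fractions import Fraction

BASES = "ACGU"
WC = {"AU", "UA", "CG", "GC"}
WOBBLE = {"GU", "UG"}


def classify(pair):
    """Classify a two-letter pair as watson-crick, wobble or unpaired."""
    if pair in WC:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "unpaired"


def single_step_products(pair):
    """All pairs reachable from `pair` by substituting exactly one base."""
    first, second = pair
    products = [b + second for b in BASES if b != first]
    products += [first + b for b in BASES if b != second]
    return products


def m1_proportions(start_pairs):
    """Proportion of single-substitution products falling in each class."""
    tally = {"watson-crick": 0, "wobble": 0, "unpaired": 0}
    for pair in start_pairs:
        for new_pair in single_step_products(pair):
            tally[classify(new_pair)] += 1
    total = sum(tally.values())
    return {k: Fraction(v, total) for k, v in tally.items()}


from_wc = m1_proportions(["AU", "GC"])  # 12 changes in total
from_wobble = m1_proportions(["GU"])    # 6 changes in total
print(from_wc["wobble"], from_wc["unpaired"])              # 1/6 5/6
print(from_wobble["watson-crick"], from_wobble["unpaired"])  # 1/3 2/3
```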
Figure 2. Phylogenetic tree of species considered in this analysis and distribution of the commonly found potential secondary structures. The phylogenetic tree is re-drawn from Vogler and Pearson (19). Nodes identifying traditionally recognized groupings are indicated and individual branches are numbered to allow their identification in Table 1. The grid on the right hand side represents whether or not a given structural element was found in MFOLD predictions for the sequence from a particular species. Each column in the grid represents a particular structure (labelled from I to XI). A dark shaded box indicates that the structure was found in MFOLD analysis, a light shaded box that it was found in modified form and a white box that it was not found. Horizontal lines are drawn through rows corresponding to species for which MFOLD analysis was not carried out because sequences were incomplete.
As well as changes that appeared to have taken place by single base mutations, the data also included a number of examples of changes that had to involve more than one base change. As we could not exclude the possibility that all the changes seen here, including those appearing to result from single base changes, in reality resulted from multiple changes, we also invoked a third model (M3) that made no assumption about the process of change. In this model, each pair can give rise to all 15 other possible pairs with equal probability, so that any Watson-Crick pair can give rise to the three other Watson-Crick pairs (P = 0.2), two wobble pairs (P = 0.133) and 10 unpaired combinations (P = 0.667). For wobble pairs, the predicted proportions are 0.267 for Watson-Crick pairs, 0.067 for the other wobble pair and 0.667 for unpaired bases.
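The M3 proportions follow from counting the 15 alternatives to any given pair among the 16 possible ordered base combinations. The following sketch reproduces that count; the function names are hypothetical and the code is an illustration rather than part of the original analysis.

```python
from fractions import Fraction
from itertools import product

WC = {"AU", "UA", "CG", "GC"}
WOBBLE = {"GU", "UG"}
ALL_PAIRS = ["".join(p) for p in product("ACGU", repeat=2)]  # 16 combinations


def classify(pair):
    """Classify a two-letter pair as watson-crick, wobble or unpaired."""
    if pair in WC:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "unpaired"


def m3_proportions(start):
    """Class proportions among the 15 pairs other than `start`."""
    tally = {"watson-crick": 0, "wobble": 0, "unpaired": 0}
    for pair in ALL_PAIRS:
        if pair != start:
            tally[classify(pair)] += 1
    return {k: Fraction(v, len(ALL_PAIRS) - 1) for k, v in tally.items()}


for start in ("GC", "GU"):
    props = m3_proportions(start)
    print(start, {k: round(float(v), 3) for k, v in props.items()})
```

For a Watson-Crick start the three fractions are 3/15, 2/15 and 10/15; for a wobble start they are 4/15, 1/15 and 10/15.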
RESULTS
One hundred and ninety seven distinct potential structural elements were identified in the search of minimum energy structures. Figure 1 shows a histogram of the frequency of structures found in different numbers of species. Of the 197 structures, 11 were found in 15 or more species (i.e. the majority) and were subjected to subsequent analysis. The species for which these 11 structures were predicted are identified in Figure 2 in the context of the likely phylogeny (19). In addition, Neefs and De Wachter (21) suggested that a pseudoknot forms at the 3′ end of the region. As pseudoknots are not detected by MFOLD, we added this structure to the analysis. Putative secondary structures for 10 of the 11 frequently found secondary structures plus the pseudoknot are shown in Figure 3. The bulk of stem II contained too many insertions/deletions (presumably resulting from slippage in this region; 9) to be aligned with confidence. A sequence alignment summarising the relative positions of these 12 structures in the V4 sequences is shown in Figure 4. The alignment used is essentially identical to figure 1 of ref. 9 except in the region of helix IX, where it has been modified to be consistent with conservation of this stem in other species.
Figure 3. Summary of the secondary structure model arising from these analyses. Stems are labelled according to the numbering system used in the Discussion and according to the model of De Rijk et al. (15) if also present in that model. Sequences used are majority rule consensi of the 32 sequences except for stem I, which is a consensus of 29 sequences. Paired bases are presented in upper case; unpaired bases in lower case. The 5′ to 3′ direction proceeds from the left to right hand side of the figure. Positions within secondary structural elements are numbered. Stems IV and IX are long-range interactions which overlap but were not unambiguously resolved. Stem IX, which is present in De Rijk et al.'s model (15), is shown in its correct position; the arrows represent the possible interaction between the component parts of stem IV which overlap with stem IX, and all component parts of this stem are contained within dotted boxes. Structures in the main part of the figure represent the final model arising from these analyses; the structures within the box were seen in more than half of MFOLD analyses but were not supported otherwise. Paired positions marked with an asterisk showed only semi-compensatory changes and those marked with a black dot showed at least one fully compensatory change. A structure for stem II is not included because it varied greatly between species. Examples of stem II structures for Odontocheila confusa, E.cupreus and P.fallaciosa are presented in figure 2 of ref. 9.
Table 1.
Figure 4. Multiple alignment of cicindelid SSU rRNA V4 sequences plus D.melanogaster and T.molitor sequences (9). The positions of the structures identified in Figure 3 are indicated below the alignment. Because these potential structures overlap, individual structures have been coded using bold text, italics and single or double underlining. Loop regions of structures included in the final model (Fig. 3) are shown in outline text. The type of coding used for each potential structure is indicated on the stem notation below the alignment. Hyphens (-) indicate gaps introduced to optimize the alignment, except in the lines indicating structures, where they indicate the extent of the structure concerned.
The numbers of species in which these putative structures were found, and the result of the analysis of compensatory and non-compensatory changes for each of them, can be summarised as follows (stem numbering as in Figs 3 and 4; details of all base changes within putative structures are summarised in Table 1):
Because of the extreme divergence and multiple indels in this region, it was not possible to identify compensatory mutations in most of it. However, in the stem formed by the basal motifs, three of the four positions showed compensatory mutations. Only in P.fallaciosa (which showed additional potential secondary structures involving these sequences) was an unambiguous non-compensatory change seen, at position 1.
Co-variation analysis of these stems showed that for stem IV, positions 1-4 and 7 were invariant and positions 5, 8 and 9 showed semi-compensatory changes. Positions 5 and 6 showed single non-compensatory changes, while positions 8 and 9 could not form in D.melanogaster. Position 10 showed two changes from non-paired to potentially paired states (U.U->U-A) in the C.punctatoauratus lineage and after the divergence of the P.californicus lineage from the rest of the tree. All positions corresponding to stem V were invariant. In stem VI positions 2 and 4-7 were invariant. Positions 1 and 3 both showed single non-compensatory mutations: from U-A to C.A at position 1 in the lineage leading to E.cupreus and C.punctatoauratus, and from C-G to C.A at position 3 in Amblycheila baroni.
Table 2.
Stem VII. This was the most frequently found structure 3′ to stem VI during energy modelling. It was found in 20 species, and was missing mostly in taxa basal on the tree (Fig. 2). The structure presented in Figure 3 is a structure with a long stem found in 10 of these 20 species. Other species showed versions truncated by generally one or two base pairs. Positions 1, 2, 14 and 17 of the long form of this stem (Fig. 3) showed non-compensatory changes. Positions 4-7, 11-13 and 16 were invariant, while positions 3 and 8-10 showed semi-compensatory changes.
Analysis of full compensatory mutations
Table 1 shows that 10 base pairs in three of the predicted structures (stems I, II and III) showed fully compensatory changes within the Cicindelidae, providing prima facie evidence for their existence. In addition, stem IX showed a fully compensatory change at position 2 on branch 61 of the tree, which separates Tenebrio from the Cicindelidae. The majority of fully compensatory changes were seen as complete transformations between base pairs (e.g. a G-C to A-U conversion at position 3 of stem II in P.fulvia compared with its sister taxon Physodeutera alluaudi). However, we also observed three compensatory changes that, according to the reconstruction of character changes, proceeded via two semi-compensatory steps (i.e. via G.U intermediates), at positions 1 and 10 of stem I, and position 3 of stem II.
Statistical analysis of patterns of nucleotide change
A number of the commonly found potential stems did not show fully compensatory changes, and for stems III and IX, the fully compensatory changes that were observed occurred at the root of the tree only, raising the possibility that these structures were not conserved within the Cicindelidae. We therefore carried out a statistical analysis of the pattern of base changes within the Cicindelidae to investigate whether these patterns provided any additional evidence for or against the existence of any of the other structures suggested by MFOLD analysis.
The aim of the statistical analysis was to test whether the patterns of nucleotide changes observed on the phylogenetic tree deviated significantly from random expectation. We defined our random expectations according to three null models, all of which assume that the probabilities of changes occurring between different base combinations are independent between sites (see Materials and Methods).
Table 3.
To test the data against these models, we counted only base changes that were unambiguous on the phylogenetic tree (Fig. 2). In addition we excluded T.molitor and D.melanogaster from the analyses, both to test for structures that might be specific to the Cicindelidae and because their great evolutionary distance from the Cicindelidae meant that we could not exclude multiple changes giving rise to apparent single-base changes. Table 2 summarises the numbers and types of changes observed for each of the variable stem positions. Table 3 gives the results of χ² analysis of these frequencies against the three null models. The analyses under models 1 and 3 provided similar, but not identical, patterns of support. Stem I showed the strongest level of support (P <0.001), while stems II, III, IV, IX and XII reached the P <0.01 level under at least one model. Stems VII and X reached only low levels of significance (P <0.05). Model 2, which represents a modification of model 1 in which the transition:transversion ratio is set to 2, supported only stem I strongly, and stems IX and XII weakly.
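The form of the comparison against a null model can be sketched as a goodness-of-fit statistic. The example below is an illustration only: the counts are invented, the function name is hypothetical, and the grouping of categories and the degrees of freedom used in Table 3 are defined in its legend, which is not reproduced here. Expected counts are obtained by applying the M1 proportions for changes from Watson-Crick pairs to the same total number of changes.

```python
def chi_square(observed, expected_props):
    """Pearson chi-square statistic for observed counts against proportions.

    `observed` maps a class of change to its count; `expected_props` maps
    the same classes to null-model proportions that sum to 1.
    """
    total = sum(observed.values())
    stat = 0.0
    for kind, count in observed.items():
        expected = total * expected_props[kind]  # same total as observed
        stat += (count - expected) ** 2 / expected
    return stat


# Hypothetical counts of changes away from Watson-Crick pairs in one stem.
observed = {"wobble": 6, "unpaired": 6}
# M1 expectations for Watson-Crick pairs (2/12 wobble, 10/12 unpaired).
m1_from_wc = {"wobble": 2 / 12, "unpaired": 10 / 12}

print(round(chi_square(observed, m1_from_wc), 3))  # 9.6
```

In this invented case the excess of semi-compensatory (wobble) products over the two expected under M1 is what drives the statistic.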
DISCUSSION
Our aim was to examine the utility of a method for modelling the secondary structures in highly variable regions where alignment and the determination of positional homology are unreliable. We tested the method against a well-established structural model and investigated levels of support for the resulting model using comparative phylogenetic data. We first discuss the secondary structure model that our analyses produced, then the method's usefulness for modelling variable region structure.
Secondary structure model
Of the eight elements of secondary structure contained in the current structural model of V4 (14), six were among the structures found most commonly in MFOLD analyses of cicindelid sequences (Fig. 3). E23-1, -2, -5, -6, -7 and -8/9 correspond to our structures I, IX, III, V, VI and XII, respectively. Of these, stem I was strongly supported by analysis of full compensatory mutations, as was stem II. Stems III and IX were more weakly supported by compensatory mutations.
Stems I, II, III, IV, IX and XII were also supported to at least the P <0.01 level by statistical analysis under at least one of the models. Stem V (E23-6) was not testable in this set of species as this region of the sequence showed complete conservation. Stem VI (E23-7) was supported poorly by statistical analysis as it showed two non-compensatory and no other changes. In the species in which these changes had taken place (A.baroni, C.punctatoauratus, E.cupreus) the potential structure reduced to a conserved core corresponding to base pairs 4-7 in Figure 3, indicating that the base of this structure is either a spurious result from the minimum energy calculations or of limited taxonomic distribution.
Our study provided support for two structures that are not included in the existing model. The first of these was stem IV, which could form from sequences flanking stems V and VI to form a Y-shaped structure. This overlapped the 3′ strand of stem IX (E23-2), which was supported by a compensatory change at the root of the tree. Both stems IV and IX were supported by statistical analysis, which was therefore unable to distinguish unambiguously between the two. A fully compensatory change appears to provide stronger evidence for stem IX than stem IV. However, as this compensatory change affects a change at the outgroup node only, whereas the statistical analysis of semi-compensatory changes supports the presence of stem IV within the ingroup, we cannot exclude the possibility that stem IV rather than stem IX occurs in the ingroup. We have therefore included both structures as alternatives in Figure 3.
The second difference between our analysis and the model of Van de Peer et al. (14,15) lay in the central, highly variable region of V4 between stems I and III. We consistently found a single, long stem to form in this region (stem II) which was supported by analyses of both fully and semi-compensatory changes. Stem II corresponds to a structure we proposed previously for this region based on a less rigorous analysis (9). Van de Peer et al.'s model of this region (14) contains two stems: E23-3 and E23-4. Both of these structures appear in only a small number of insect species (E23-3 is described for four species: Acyrthosiphon pisum (pea aphid), D.melanogaster, Meloe proscarabaeus and T.molitor; E23-4 only for D.melanogaster). These two stems are made up of the two complementary halves of our stem II in D.melanogaster. Stem II was observed in 26 of the species analysed here, contained numerous fully compensatory changes and reached the P <0.01 level of support based on only four base pairs. As none of the 197 minimum or near-minimum energy structures we considered during modelling gave rise to a pair of stems corresponding to the two halves of stem II, we saw no evidence to support the double structure in the cicindelid rRNA. Its recognition in D.melanogaster may reflect a restricted taxonomic distribution; statistical analysis of covariation at lower hierarchical levels, for example within closely related drosophilids, may provide the evidence to decide between competing stem structures. However, as our stem II has homologues in D.melanogaster and T.molitor, and given the very restricted taxonomic distribution of stems E23-3 and E23-4, we believe there is strong prima facie evidence for a long, variable stem in the central region of insect V4.
Although stem XII was clearly homologous to the pseudoknot structure described previously (14,21), position 9 in stem XIIa and positions 5 and 7 in stem XIIb did not show strongly conserved patterns of pairing (Table 1). Position 9 of stem XIIa shows two non-compensatory changes: one in C.repanda and one in the lineage leading to E.cupreus and C.punctatoauratus. In D.melanogaster this site also neighbours a single base insertion that would have to be looped out of the structure. This position within stem XIIa is therefore not necessarily paired and can accommodate mispairing and looping out of at least one base. Position 7 of stem XIIb shows much less tendency to be paired than other positions within stem XII, being transformed to a U.U mispairing in numerous cases. This again suggests that this position is not necessarily paired. The fact that a base insertion is observed in D.melanogaster immediately neighbouring position 7 suggests that extrahelical bases can be tolerated in this region of the structure. Position 5, which flanks a bulged motif that also varies in length, also showed a single non-compensatory change, from G-C to G.A. However, it remains possible that these positions adopt atypical helix conformations, as C.A, U.U and G.A pairs can form under some circumstances (23).
Taking the above considerations into account, our model for the V4 region of the Cicindelids is as presented in Figure 3. We regard stems I-III and XII as well supported, but the structure of the region made up of stems V-VI is presented only for illustration, as these structures were either untestable or poorly supported by our analyses. We have included both stems IV and IX although the longer range compensatory changes support the existence of stem IX in the broader sample of V4 structures. It is noteworthy that the structures included in the model all occurred in ≥26 of the 29 MFOLD predictions (~90%) whereas stems VII, VIII and XI, which were excluded, occurred in fewer predictions (21-24). Thus the frequency of occurrence of any structure in MFOLD predictions may be a direct indicator of the likelihood of its occurrence. Stem X was intermediate, occurring in 25/29 predictions (86%). This structure is excluded by stem IV, with which it overlaps, but not by stem IX, and may occur if stem IX occurs.
Three other models of this region have been proposed: Hancock et al.'s original model for D.melanogaster (24), an alternative comparative model (25) and a model for the highly expanded V4 in Acyrthosiphon pisum (26). The model developed in this paper shares stems II, III and V with Hancock et al.'s model (24), which also identified stems VI, VII and VIII as possible structures. Nickrent and Sargent's model (25) clearly shares stems III and V and a stem similar to VI with the current model; other stems in their model resemble our stems I, VIII and X although homology is difficult to ascertain as the structures published were for species highly divergent from insects. Similarities to the Kwon et al. model (26) are not obvious.
Statistical analysis
Support for secondary structures has conventionally been based on the accumulation of fully compensatory changes. A problem with this approach, which is intuitively attractive, is the difficulty in handling the occasional non-compensatory change (21) and the zero-weighting accorded to semi-compensatory changes. The statistical approach we have used here allows account to be taken of these other kinds of change and, by making predictions about the expected ratio of the different kinds of change, allows testing of the nature of sequence change in putatively base-pairing regions. Our method makes use of phylogenetic trees to infer the character changes and their directions. This permits the determination of the number and types of mutational steps, which in turn can be used for inferences about underlying mutational processes such as the action of slippage-like processes. Comparison with a model of random change then allows assessment of deviations from null expectations and a calculation of statistical support for the various secondary structures suggested from the minimum energy analysis.
We investigated three different models of change here. Models 1 and 2 assumed that semi-compensatory changes resulted from single base substitutions only, with different assumptions about the relative frequencies of transitions and transversions. Model 2, which assumes a 2:1 transition to transversion ratio, produced much lower levels of support for individual structures. This is unsurprising as the base changes that convert between Watson-Crick and wobble pairs are transitions (C↔U; G↔A). This underlines the necessity of using an appropriate model of random mutation in such studies. Model 3, which makes no assumptions about the number of base changes giving rise to a change in any given base pair, provided much stronger support for stem II than models 1 and 2. This is consistent with model 3 being a more suitable model for sets of sequences that are so rapidly evolving, or so distant, that multiple base changes are likely to have occurred during their evolution. It also provides a means of testing whether mutations within a given structure are approaching saturation, which might be of value for phylogenetic analysis.
As rRNA sequences are widely used for phylogenetic reconstruction, data sets are increasingly amenable to this analysis. Consideration of the distribution of the structures on the phylogenetic tree, and of non-compensatory changes within the structure, may provide more information about the reality of a given weakly supported structure. We applied statistical tests to whole structures observed in MFOLD modelling. When data (i.e. variation) are limited, this approach is necessary to best measure the support accruing to a particular structure. However, it has the disadvantage that it cannot characterise the support for a particular paired position within a structure. Such a position-by-position analysis might be possible for larger, and/or more variable, datasets.
Four of the six stems supported strongly by our statistical approach (i.e. at the P <0.01 level under at least one model) also showed at least one fully compensatory change, providing evidence that the two approaches measure the same properties of sequences. However the pseudoknot stem XII showed no compensatory changes in our dataset. Despite this, it was supported by statistical analysis. Given that this structure is strongly supported in a wide range of other species, this suggests that the statistical approach is more sensitive than simple measurements of compensatory changes. This is not surprising as fully compensatory changes involve at least two coordinated base changes while changes involving G.U base pairs can involve only single base changes.
Like any other method based on sequence comparison, this method relies on sufficiently high rates of nucleotide change. Cases in which only a few changes are observed can give rise to Type II error (failure to detect a true deviation from random expectation). As well as being sensitive to low rates, the method is also sensitive to high rates of change. This is seen for the basal bases of stem II, which showed many more changes than the other structures analysed here, including many more completely compensatory mutations. This indicates high rates of sequence change in this part of the expansion segment and high rates of double mutation, and coincides with the disparity observed between the χ² values obtained under models 1 and 3 for the four base pairs forming the base of stem II.
In conclusion, the use of minimum energy calculations combined with analyses of compensatory and semi-compensatory changes has allowed us to use a relatively small dataset to recover many (but not all) features of a secondary structure model that is based on a wider ranging set of data. It has also revealed at least one possible difference between the broader model and the cicindelid sequences which may reflect differences in structure of variable region V4 in different evolutionary lineages, while providing evidence for a structure that forms from a highly variable part of the sequences that may not have been evident from more distant sequence comparisons (stem II). This approach may be useful in other studies in which data are limited or sequences are highly variable and difficult to align unambiguously.
ACKNOWLEDGEMENTS
We thank the UK Medical Research Council and Natural Environment Research Council (grant number 3/11362) for financial support. We also thank Donald Quicke for discussions.
REFERENCES
Last modification: 24 Mar 1998
Copyright© Oxford University Press, 1998.