Nucleic Acids Research Advance Access originally published online on September 8, 2006
Nucleic Acids Research 2006 34(16):4630-4641; doi:10.1093/nar/gkl535
Nucleic Acids Research, 2006, Vol. 34, No. 16 4630-4641
© 2006 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Genomics
Aberrant 3' splice sites in human disease genes: mutation pattern, nucleotide structure and comparison of computational tools that predict their utilization
Igor Vořechovský
University of Southampton School of Medicine, Division of Human Genetics Mailpoint 808, Southampton SO16 6YD, UK
Tel: +44 2380 796425; Fax +44 2380 794264; Email: i.vorechovsky{at}soton.ac.uk
Received June 1, 2006. Revised July 10, 2006. Accepted July 11, 2006.
ABSTRACT
The frequency distribution of mutation-induced aberrant 3' splice sites (3'ss) in exons and introns is more complex than for 5' splice sites, largely owing to sequence constraints upstream of intron/exon boundaries. As a result, prediction of their localization remains a challenging task. Here, nucleotide sequences of 218 previously reported aberrant 3'ss activated by disease-causing mutations in 131 human genes were compared with their authentic counterparts using currently available splice site prediction tools. Each tested algorithm distinguished authentic 3'ss from cryptic sites more effectively than from de novo sites. The best discrimination between aberrant and authentic 3'ss was achieved by the maximum entropy model. Almost one half of aberrant 3'ss were activated by AG-creating mutations and ~95% of the newly created AGs were selected in vivo. The overall nucleotide structure upstream of aberrant 3'ss was characterized by higher purine content than for authentic sites, particularly in position −3, which may be compensated by more stringent requirements for positive and negative nucleotide signatures centred around position −11. A newly developed online database of aberrant 3'ss will facilitate identification of splicing mutations in a gene or phenotype of interest and future optimization of splice site prediction tools.
INTRODUCTION
Mutations that affect pre-mRNA splicing have been shown to account for up to a half of disease-causing gene alterations (1,2), potentially representing the most frequent cause of hereditary disorders (3). The most common consequence of splicing mutations is skipping of one or more exons, followed by the activation of aberrant 5' (donor) splice sites (5'ss), 3' (acceptor) splice sites (3'ss) and retention of full introns in mRNA (4,5). Each of these four events may have a dramatic impact on the structure or outcome of mature transcripts, function of their translation products and phenotypic manifestations. However, gene mutations or variants can also have more subtle effects at the level of splicing by altering the expression of pre-existing alternatively spliced mRNA isoforms, which can considerably modify not only phenotypic severity of both Mendelian and complex traits, but also their population prevalence (6–9).
Mutation-induced aberrant splice sites have been classified into two categories (10): (i) cryptic splice sites, which are only used when a mutation disrupts use of the authentic site, and (ii) de novo splice sites, which are induced by mutations elsewhere in introns or exons and increase the match to a splice site consensus. However, distinction between the two categories may be ambiguous in some cases since disruption of the authentic site may create a new splice site consensus, and is less obvious for 3'ss than 5'ss because accurate recognition of acceptor sites requires additional signal sequences in introns (11). The splicing signals of acceptor sites, namely the branch point sequence (BPS), polypyrimidine tract (PPT), and 3'AG, are recognized by RNA–protein interactions involving splicing factor 1 (SF1) and 65 and 35 kDa subunits of the U2 small nuclear RNP auxiliary factor (U2AF65 and U2AF35), respectively (12–17). The overall strength of 3'ss is defined by optimal sequences for interaction with each cognate factor as well as their distances from each other (18,19).
Cryptic 5'ss have a similar frequency distribution in exons and introns and their number decreases with an increasing distance from authentic 5'ss (10). In contrast, the localization of cryptic 3'ss is biased towards exons, whereas de novo 3'ss usually reside in introns, particularly within the PPT of authentic 3'ss (11). The distribution bias and a lower prevalence of aberrant 3'ss than 5'ss in vivo is most likely due to sequence constraints near intron/exon boundaries, including depletion of AG dinucleotides and the presence of PPT and BPS upstream of 3'ss (11). In addition, the multifaceted distribution of aberrant 3'ss would be predicted to reflect variable distances between the 3'ss signal sequences from intron to intron (18–21), including the presence of putative distant BPS that are not located within an optimal distance of 18–40 nt 5' of 3'ss, but may reside up to several hundred nucleotides further upstream (22). Despite a growing number of reported splicing mutations and associated phenotypes, the localization of the resulting aberrant 3'ss and their effect on gene expression remain difficult to predict.
Currently available computational tools that estimate the splice site strength have been based on a variety of methods, including nucleotide frequency matrices (23,24), machine learning approaches (25), neural networks (26), information theory (27) and interdependence between adjacent (the first-order Markov model) or more distant (the maximum entropy model) positions of the splicing consensus sequences (28). Gene prediction algorithms that take into account protein coding information have been shown to perform better than those that rely only on signals present in the splice sites (29). However, the strength of mutation-induced aberrant splice acceptor sites has not been systematically analyzed, and it is unknown at present which models best predict the localization of cryptic or de novo 3'ss activated in vivo.
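To make the idea of dependencies between adjacent positions concrete, the following is a minimal sketch of a position-specific first-order Markov score. It is not the published MM implementation (28): the training windows in `TOY_SITES`, the pseudocount and the uniform background are all hypothetical choices made for illustration only.

```python
import math

ALPHABET = "ACGT"

def train_markov(seqs, pseudo=1.0):
    """Position-specific first-order model: P(x0) and P(x_i | x_{i-1}) per position."""
    n = len(seqs[0])
    first = {b: pseudo for b in ALPHABET}
    trans = [{a: {b: pseudo for b in ALPHABET} for a in ALPHABET} for _ in range(n - 1)]
    for s in seqs:
        first[s[0]] += 1
        for i in range(1, n):
            trans[i - 1][s[i - 1]][s[i]] += 1
    total = sum(first.values())
    first = {b: c / total for b, c in first.items()}
    for table in trans:
        for a in ALPHABET:
            row_total = sum(table[a].values())
            table[a] = {b: c / row_total for b, c in table[a].items()}
    return first, trans

def markov_score(seq, model):
    """Log2-odds of seq under the model versus a uniform background (assumed)."""
    first, trans = model
    log_p = math.log2(first[seq[0]])
    for i in range(1, len(seq)):
        log_p += math.log2(trans[i - 1][seq[i - 1]][seq[i]])
    return log_p - len(seq) * math.log2(0.25)

# Hypothetical acceptor-like windows ending in AG plus one exon base (not real 3'ss data).
TOY_SITES = ["TTTCTTTCAGG", "CTTTTCTTAGG", "TTCTTTTCAGA", "TCTTCTTTAGG"]
model = train_markov(TOY_SITES)
site_like = markov_score("TTTCTTTCAGG", model)
unrelated = markov_score("GGAGGAAGGCA", model)
```

With these toy data the pyrimidine-rich window scores higher than the purine-rich one; a real model would be trained on thousands of annotated sites.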
Here, nucleotide sequences of aberrant 3'ss that were reported previously in human disease genes have been compiled and made available to the public through an online retrieval tool. Comparison of the splice site strength using current prediction algorithms showed that the maximum entropy model allowed the best discrimination between authentic and mutation-induced aberrant 3'ss, validating this model as the most sensitive instrument. In addition, this study provides a detailed characterization of the underlying mutation pattern and comparison of nucleotide composition upstream of aberrant and corresponding authentic 3'ss.
MATERIALS AND METHODS
Compilation of mutation-induced aberrant 3'ss in human disease genes
Published reports of cryptic and de novo 3'ss were identified by searching PubMed (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi) and home pages of peer-reviewed journals. A subset of case reports were identified by searching locus-specific mutation databases (http://archive.uwcm.ac.uk/uwcm/mg/docs/oth_mut.html). The search was restricted to human genes with sequence-verified aberrant RNA products published before May 2006 that resulted from disease-associated mutations or variants. Nine cases in which no patient RNA was available but aberrant RNA products of wild-type and mutated alleles were characterized in minigene splicing reporter assays were also included. Aberrant 3'ss were manually validated by mapping the information in the literature to sequences in the Human Genome Project databases. Nucleotide sequences of authentic, mutated and aberrant 3'ss are available at http://www.dbass3.soton.ac.uk/ in the first online database of aberrant 3'ss termed DBASS3.
Comparison of computational methods to predict aberrant 3'ss
Validated sequences of aberrant and corresponding authentic 3'ss were used as input files for several splice site prediction algorithms. The Shapiro and Senapathy (S&S) matrix is based on nucleotide frequencies at each position of the 3'ss consensus sequence (23,24). The S&S matrix scores were computed using an online tool available at http://ast.bioinfo.tau.ac.il/. The information theory-based server (27) available at https://splice.cmh.edu/ was used to obtain the information content (Ri) of 3'ss in bits. To accommodate dependencies between adjacent and non-adjacent positions, the compiled sequences were analyzed using the first-order Markov (MM) and the maximum entropy (ME) models (28). The former method considers dependencies between adjacent positions, whereas the latter approximates short sequence motif distributions with the ME distribution and may include dependencies between non-adjacent as well as adjacent positions. The MM and ME scores (28) were derived for each 3'ss using online tools available at http://genes.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq_acc.html. The Wilcoxon–Mann–Whitney rank test (Stat-200, v. 2.01, Biosoft, UK) was employed to test the significance of score differences between authentic, mutated and aberrant 3'ss in each category.
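As an illustration of how a nucleotide frequency matrix can be turned into a site score, a minimal sketch follows. It is a simplified stand-in, not the S&S formula used by the online tool: the aligned windows in `TOY_SITES` are hypothetical, and the 0 to 100 rescaling between the worst and best possible sequence is an assumption chosen for readability.

```python
def build_frequency_matrix(sites, alphabet="ACGT"):
    """Per-position base frequencies (in percent) from equal-length aligned sites."""
    n = len(sites)
    return [
        {b: 100.0 * sum(s[i] == b for s in sites) / n for b in alphabet}
        for i in range(len(sites[0]))
    ]

def matrix_score(seq, matrix):
    """Sum of observed-base frequencies, rescaled to 0-100 between worst and best."""
    total = sum(col[base] for base, col in zip(seq, matrix))
    best = sum(max(col.values()) for col in matrix)
    worst = sum(min(col.values()) for col in matrix)
    return 100.0 * (total - worst) / (best - worst)

# Hypothetical aligned acceptor windows (not taken from DBASS3).
TOY_SITES = ["TTTCTTTCAGG", "CTTTTCTTAGG", "TTCTTTTCAGA", "TCTTCTTTAGG"]
matrix = build_frequency_matrix(TOY_SITES)
consensus_score = matrix_score("TTTTTTTCAGG", matrix)
poor_score = matrix_score("GGGGGGGGCCC", matrix)
```

A sequence that carries the most frequent base at every position receives the maximum score, and one that carries only bases never seen in the training windows receives the minimum.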
DBASS3 construction
DBASS3 is an online retrieval and submission tool for mutation-induced aberrant 3'ss available at http://www.dbass3.soton.ac.uk/. The web application was created using the ASP server technology (Microsoft), and SQL database software (http://www.sql.org). In addition to aberrant 3'ss induced by germ-line and somatic mutations, DBASS3 contains naturally occurring variants common in the population if they have been convincingly shown to modify both alternative pre-mRNA splicing and disease phenotypes, such as FECH IVS3-48T/C in protoporphyria (8). Genetic polymorphisms that may influence utilization of tandemly arranged NAGNAG 3'ss (30) and exert putative functional effects have been reported elsewhere (31) and were not included in DBASS3, nor were the mutations leading to exon skipping or complete intron retention.
RESULTS
Mutations that activate aberrant 3'ss
An exhaustive search for previously published aberrant 3'ss identified 218 unique aberrant acceptors in 131 genes (Table 1). They were generated by a total of 16 deletions/insertions (32–46) and 211 point mutations (Table 2). Single-nucleotide substitutions of purine residues were much more frequent than those of pyrimidines (165 versus 46, P < 10⁻¹⁶). This overrepresentation was not attributable solely to substitutions at 3'YAG (102 versus 8), but was also observed for de novo 3'ss (63 versus 38, P = 0.004). The most frequently introduced base in each of the four categories of aberrant 3'ss was guanine (G), accounting for ~42% (89/211) of all point mutations (Table 2).
As expected, point mutations were most common in highly conserved positions −1 (53/211; 25%) and −2 (48/211; 23%) relative to natural intron/exon junctions (Table 3). Position −3 was mutated in nine cases (~4%). As noted in the initial analysis of all splice site mutations for position −2 (47), G-to-Y (in position −1; Y is pyrimidine) and A-to-Y (position −2) transversions were under-represented as compared with G-to-A and A-to-G transitions, respectively (P < 0.01 and P < 0.00001, assuming that substitutions to the remaining nucleotides were equally probable; Table 3). Since transitions are in significant excess in humans compared with the expected frequency of 33% (47), the expected numbers were calculated for each substitution using previously published single-nucleotide mutability rates in disease genes (Table 3). However, the observed number of G−1-to-T−1 mutations was too low to be explained by chance, suggesting that primary transcripts carrying the A−2T−1 acceptors generate on average more canonical mRNAs as compared with 3'AG mutated to other dinucleotides, leading to a detection bias against less severe phenotypes. This notion is supported by similar frequencies of G>T/C>A and G>C/C>T alterations among disease-causing point mutations (48) and by the presence of residual amounts of natural transcripts in some 5'G+1T+2–3'A−2T−1 introns both in Saccharomyces cerevisiae (49) and humans (50). However, comparison of the observed and expected distributions derived from di-nucleotide mutability rates that allow for the influence of neighbouring nucleotides (48) failed to confirm any bias for either intron position (Table 3). Thus, although small effects of leaky dinucleotides on the observed distribution cannot be excluded, these data are consistent with dramatic consequences for splicing of any point mutation in the highly conserved 3'AG and with indistinguishable defects of the second splicing step previously observed in vitro both for intron position −1 (51) and −2 (52).
Interestingly, as many as 14/53 (26%) point mutations in position −1 (IVS-1G>A if the first exon nucleotide was G) (34,53–64), 3/48 (6%) substitutions in position −2 (IVS-2A>G) (65,66) and 2/9 (22%) point mutations in position −3 [IVS-3T>G (67) and IVS-3A>G (68)] created new 3'AG sites that were used in vivo (Figure 1). The proportion of AG-creating mutations in position −1 was higher than in position −2 (P = 0.01, Fisher's exact test), which may have contributed to the higher number of substitutions observed in position −1 than −2 (Table 3). In contrast to mutations in the 3'YAG consensus, the majority of substitutions in the PPT were AG-creating mutations. For example, in positions −5 to −26 relative to natural intron/exon junctions as many as 61/73 (84%) point mutations created new AGs (Figure 1). The overall proportion of AG-creating mutations that resulted in aberrant 3'ss was 43%, and ~95% of the newly introduced 3'AGs were used in vivo (Table 2).
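The comparison of AG-creating mutations at the two most conserved intron positions (14/53 versus 3/48) can be re-examined with a small, self-contained Fisher's exact test. This is a sketch under stated assumptions: the two-sided P-value is defined here as the sum of all table probabilities no larger than that of the observed table, which may differ from the convention of the statistics package used in the study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k counts in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as unlikely as the observed one (small float tolerance).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))

# Rows: the two intron positions; columns: AG-creating versus other point mutations.
p_value = fisher_exact_two_sided(14, 53 - 14, 3, 48 - 3)
```

Under this definition the P-value is of the same order as the reported P = 0.01.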
Purine transitions, which accounted for ~54% (113/211) of all aberrant 3'ss and dominated the mutation pattern of cryptic 3'ss, were also the most frequent point mutations leading to de novo 3'ss (54/101; 53%). De novo sites in introns resulted from purine transitions more often than de novo sites in exons (45/72 versus 9/29, χ² = 7.0, P = 0.008). Intronic de novo 3'ss were most frequently induced by substitutions of A (29/72, 40%), whereas exonic de novo 3'ss were most commonly activated by point mutations of G (13/29, 45%; Table 2).
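The χ² comparison of purine transitions in intronic versus exonic de novo sites (45/72 versus 9/29) can be sketched as a 2x2 test with one degree of freedom. The text does not say whether a continuity correction was applied; the Yates correction used below is an assumption, adopted because it gives a value close to the one reported.

```python
from math import erfc, sqrt

def chi2_yates(a, b, c, d):
    """Chi-square test with Yates correction for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * max(0.0, abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = erfc(sqrt(chi2 / 2))
    return chi2, p

# Rows: intronic and exonic de novo sites; columns: purine transitions versus other.
chi2, p = chi2_yates(45, 72 - 45, 9, 29 - 9)
```

With these counts the sketch gives χ² of about 7.0 and P of about 0.008.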
Comparison of computational tools to predict mutation-induced aberrant 3'ss in vivo
The predicted strength of aberrant, mutated and corresponding authentic 3'ss was analyzed using publicly available computational tools shown in Table 4. Each of the tested models distinguished authentic, mutated and aberrant 3'ss, with authentic sites giving, on average, the highest scores or information bits, followed by aberrant and then by mutated 3'ss (Table 5). However, this was not the case for each category of aberrant acceptors.
First, each computational tool was more effective in discriminating authentic and aberrant 3'ss that resulted from mutations in the 3'YAG consensus than from mutations elsewhere (Table 5). This was owing to significantly higher scores for authentic 3'ss that corresponded to cryptic 3'ss than for authentic counterparts of de novo sites. For example, the S&S scores for authentic counterparts of de novo and cryptic 3'ss were 80.5 ± 8.4 (±SD) and 84.6 ± 6.4 (P < 10⁻⁷, Wilcoxon–Mann–Whitney rank test), respectively. Similarly, the ME scores were 7.2 ± 3.2 and 8.6 ± 3.3, respectively (P < 10⁻⁷). In contrast, the score differences between cryptic and de novo 3'ss were not statistically significant (means of the S&S matrix scores were 76.5 versus 77.7, P = 0.3; means of the ME scores were 4.7 versus 5.3, P = 0.4, respectively). Scores or information bits for each category of aberrant acceptors are shown in Table 4. These results indicate that authentic counterparts of de novo 3'ss are intrinsically weak and can be outcompeted by newly created splicing consensus elements. They also suggest that mutations or genetic variants flanking weak splice sites are more likely to play a role in regulated splicing than those near well-defined sites, consistent with weakening of splicing signals in evolution from virtually invariable sequences in yeasts to highly degenerate in humans and a need for more sophisticated regulation in complex organisms at the level of alternative splicing.
Second, each algorithm could distinguish cryptic and authentic 3'ss in exons, whereas matrix-based scores struggled to differentiate between authentic and cryptic 3'ss in introns where the ME and MM were the only models that showed P-values of 0.01 or lower (Table 5).
Third, de novo 3'ss could not be discriminated from authentic sites by any algorithm if located in exons. Although this could be partly attributed to a smaller sample size of exonic than intronic de novo sites (Table 1), a similar sample of intronic cryptic 3'ss did show statistically significant differences for a subset of algorithms (Table 5). Finally, the difference between intronic de novo sites and their authentic counterparts was statistically significant with the ME and MM models but not with the remaining algorithms, except for the S&S matrix scores.
Taken together, these results indicated that the value of computational tools to predict aberrant 3'ss depended on their localization in introns and exons as well as on the underlying mutation, and that the ME was the best model discriminating mutation-induced aberrant 3'ss in vivo from corresponding authentic 3'ss. They also suggested that the failure to distinguish exonic de novo 3'ss from authentic counterparts may be due to our as yet incomplete understanding of the role of exonic splicing silencers or enhancer elements in 3'ss selection.
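The score comparisons in this section rely on a rank test. A minimal sketch of a two-sided Wilcoxon–Mann–Whitney test is given below; it uses the normal approximation without tie or continuity correction, which is an assumption and not necessarily what the Stat-200 package does. The two score lists are hypothetical values, not data from Table 5.

```python
from math import erfc, sqrt

def mann_whitney_u(x, y):
    """Return (U for x, two-sided P) using the normal approximation."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        for k in range(i, j):
            ranks[k] = (i + j + 1) / 2  # mean of the 1-based ranks i+1..j for ties
        i = j
    n1, n2 = len(x), len(y)
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    z = (u1 - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u1, erfc(abs(z) / sqrt(2))

# Hypothetical ME-like scores for authentic and aberrant sites.
authentic = [8.6, 9.1, 7.9, 10.2, 8.8, 9.5]
aberrant = [4.7, 5.3, 6.1, 3.9, 5.8, 7.0]
u_stat, p_two_sided = mann_whitney_u(authentic, aberrant)
```

For small samples such as this toy example an exact test would be preferable; the approximation is shown only because it fits in a few lines.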
Single-nucleotide composition upstream of aberrant 3'ss
Comparison of the nucleotide structure upstream of aberrant and authentic 3'ss revealed a significantly higher proportion of purines in aberrant 3'ss. For example, in intronic positions −3 through −26 aberrant 3'ss had 1760 purines as opposed to 1526 purines in authentic 3'ss (χ² = 23.7, P < 0.00001; Supplementary Figure 1). Overall, this was attributable to a higher number of As (χ² = 13.5, P < 0.001) rather than Gs (χ² = 6.4, P = 0.01; Supplementary Figure 1A). The increase of purine residues was almost exclusively at the expense of uridines for aberrant 3'ss in exons (Supplementary Figure 1B and C). In contrast, aberrant 3'ss in introns showed only a borderline increase of purine residues (χ² = 3.2, P = 0.07), largely owing to cytosine depletion (Supplementary Figure 1D and E). De novo 3'ss in exons had a smaller number of Gs as compared with authentic 3'ss, but the difference was not statistically significant (Supplementary Figure 1C).
The increase of purines in aberrant 3'ss was the highest in position −3 where As were 6× more frequent than in authentic 3'ss (Supplementary Figure 2A, χ² = 26.5, P < 0.000001). The number of aberrant 3'ss with G in position −3 was also higher (7 versus 2) in aberrant (65,66,69–73) than in corresponding authentic (70,74) 3'ss. Positive associations between −3C and upstream Cs in the PPT and between −3T and upstream Ts, which were described previously for authentic 3'ss (75), were observed also for aberrant 3'ss (Supplementary Table 2). Although the influence of −3C on the relative usage of C versus T in the PPT may be attributed to autocorrelation due to compositional similarities of local genomic regions (75), sequence constraints resulting from cooperative interactions at the 3'ss could not be excluded. Indeed, non-random distributions at −3 observed for positions −11, −12, −17 and −19 of aberrant 3'ss (Supplementary Table 2) may be explained by inefficient binding of U2AF to RNAs carrying 3'TAG as compared to 3'CAG (76) and a need for functional compensation of the former by stronger interaction of U2AF65 (or other PPT-binding proteins) with uridines at positions −11 and −12 rather than cytosines. Associations further upstream may involve similar compensation by more optimal BPS interactions with the RS domain of U2AF (77–79) and/or, possibly, other BPS-interacting factors, including K- and Quaking-homology 2 domains of SF1 (12,80,81) or U2 small nuclear RNA (82,83). Similar associations at −3 with upstream intron positions were seen also for authentic counterparts of aberrant 3'ss (data not shown), confirming previous findings with a larger dataset (75).
Although most of the analyzed positions upstream of aberrant 3'ss showed uridine depletion as compared to authentic sites (e.g. 565 versus 659 Ts in positions −5 to −10; χ² = 12.8, P < 0.001; Supplementary Figure 2A), their numbers were similar further upstream between positions −11 and −13 (311 versus 319, P = 0.7). Cs were slightly under-represented between positions −11 and −13 in aberrant 3'ss (168 versus 202, χ² = 4.0, P = 0.04). The T-to-C ratio in aberrant 3'ss was the highest in position −11 (2.53 versus 1.70 in authentic), while the average (±SD) ratios between positions −4 and −26 in aberrant and authentic 3'ss were similar (1.52 ± 0.34 and 1.55 ± 0.28, respectively). Aberrant 3'ss with purine at −3 had higher T-to-C ratios between −11 and −13 than aberrant 3'ss with pyrimidine at −3 (2.54 versus 1.69). The number of Gs in this region was significantly higher in aberrant than authentic 3'ss (153 versus 92 in positions −9 to −12, χ² = 17.0, P < 0.0001), particularly in cryptic sites, whereas the number of As in these positions was not different (125 versus 115, χ² = 0.4, P = 0.5).
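The T-to-C ratios quoted above are simple counts over aligned intron positions. A minimal sketch of that calculation follows, assuming each input string is an intron 3' end whose last base is the G of the 3'AG, so that the base k nt upstream of the junction is `seq[-k]`. The two example sequences are hypothetical.

```python
def t_to_c_ratio(seqs, nearest, farthest):
    """T/C count ratio over intron positions -nearest to -farthest (inclusive)."""
    t = c = 0
    for s in seqs:
        for k in range(nearest, farthest + 1):
            base = s[-k]  # base k nt upstream of the intron/exon junction
            t += base == "T"
            c += base == "C"
    return t / c if c else float("inf")

# Hypothetical intron 3' ends, each 15 nt long and ending in the 3'AG.
toy_introns = ["CCTTTCCTTTCTCAG", "TTCTCTTTCCTTTAG"]
ratio_11_13 = t_to_c_ratio(toy_introns, 11, 13)
```

Here positions −11 to −13 contain four Ts and two Cs, giving a ratio of 2.0.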
Di-nucleotide composition upstream of aberrant 3'ss
The number of AG dinucleotides, which are depleted in AG exclusion zones upstream of authentic 3'ss (24,75,84), was significantly higher in aberrant than corresponding authentic 3'ss (Supplementary Figure 2B). In a 17 nt sequence upstream of 3'ss where the AG depletion in natural 3'ss is the most pronounced (75), the numbers of authentic and aberrant 3'ss with a non-3'ss (intervening) AG were 15 and 36, respectively (χ² = 8.8, P = 0.003), while the number of AGs in the two groups was 15 and 40 (binomial test, P = 0.0003). The observed frequency of authentic 3'ss with non-3'ss AGs in this region (~16%) was similar to those previously reported for constitutively (14%) and alternatively (17%) spliced introns that contained intervening AGs downstream of predicted BPS (21). Between positions −3 and −26, there were 53 versus 80 AG-containing 3'ss (χ² = 7.2, P = 0.007) and 64 versus 95 intervening AGs (binomial test, P = 0.003), respectively. No AG dinucleotides were found in positions −10 and −11 of aberrant 3'ss. Although the number of intervening AGs was low, putative differences of these and other purine dinucleotides between aberrant and authentic sites in intron positions −25, −24, −22, −20 or −19 upstream of 3'ss are consistent with a distinct average distance of the BPS from aberrant versus authentic 3'ss. Peak frequencies of the GA and AA dinucleotides that may signify the presence of branchpoint in the mammalian BPS consensus YNYURAY were shifted several nucleotides upstream in aberrant 3'ss (Supplementary Figure 2B).
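Counting intervening AGs, as in the comparison above, can be sketched as follows. The exact window definition is not spelled out in the text; this sketch assumes the window is the stretch immediately preceding the terminal 3'AG, and the example sequences are hypothetical.

```python
def intervening_ags(intron_3prime, window=17):
    """Count AG dinucleotides within `window` nt upstream of the terminal 3'AG."""
    if not intron_3prime.endswith("AG"):
        raise ValueError("sequence must end with the 3'AG")
    region = intron_3prime[:-2][-window:]  # drop the splice-site AG itself
    return sum(region[i:i + 2] == "AG" for i in range(len(region) - 1))

def summarize(sites, window=17):
    """Return (number of sites with at least one intervening AG, total AG count)."""
    counts = [intervening_ags(s, window) for s in sites]
    return sum(c > 0 for c in counts), sum(counts)

# Hypothetical intron 3' ends (20 nt each).
toy_sites = ["TTTCTTTCTTTTCTTTCCAG", "TTTAGTTCTTAGCTTTTCAG"]
sites_with_ag, total_ags = summarize(toy_sites)
```

The two returned numbers correspond to the two kinds of counts reported in the text: sites containing an intervening AG and intervening AGs overall.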
The remaining purine dinucleotides were also more common in aberrant than in authentic sites. The increase of AA dinucleotides (253 versus 185 in positions −26 to −3, P = 0.001), which were found in excess upstream of authentic 3'ss as compared to pseudo-sites (75), was largely attributable to position −3 due to the excess of −3As in aberrant 3'ss (Supplementary Figure 2A, B). The GG dinucleotides (186 in aberrant versus 118 in authentic sites in the same region, P < 0.0001) also clustered in some positions, such as −17 to −21 (56 versus 19, χ² = 17.9, P < 0.0001) and −8 to −12 (49 versus 26, χ² = 6.7, P < 0.01, respectively).
A region upstream of 3'ss in vertebrates (75) and Arabidopsis thaliana (85) contains a higher number of TG dinucleotides as compared to pseudo-splice sites, suggesting that they are important for correct 3'ss recognition. Although the total number of TGs in positions −3 to −26 was similar in aberrant and authentic 3'ss (430 versus 428), there were 94 and 60 TGs in positions −10 to −13 in aberrant and corresponding authentic sites, respectively (χ² = 7.7, P = 0.005). The number of GTs in the same region was also higher in aberrant sites (56 versus 34; χ² = 5.2, P = 0.02). In contrast, the number of TTs in the same region was similar (235 versus 200, P > 0.05) both in cryptic and de novo sites, whereas aberrant 3'ss showed TT depletion for most of the remaining positions. The number of CC dinucleotides between positions −10 and −13 was lower in aberrant 3'ss (71 versus 99, χ² = 4.7, P = 0.03), but this difference was limited to de novo sites (χ² = 10.7, P = 0.001). The TT-to-CC ratio in aberrant 3'ss was the highest in position −12 (8.14 versus 2.76 in authentic), whereas the average (±SD) between positions −5 to −26 was 2.26 (±1.43), with 2.21 ± 0.56 in authentic counterparts.
Position −11 shows peak uridine frequencies in vertebrate PPTs (86), most probably due to highly conserved interactions with the second RNA recognition motif (RRM2) of U2AF65, a central organizing force for 3'ss recognition in higher eukaryotes, or with competing pyrimidine-binding proteins (14,87,88). The same position was efficiently crosslinked to RRM2 of U2AF65 in several PPTs (87) and substitutions of T−11 generated lower levels of spliced products and prespliceosomal complexes than identical mutations of T−8 or T−14 (89), suggesting that the observed single- and di-nucleotide imbalances between aberrant and authentic 3'ss centred around this position have functional significance. Higher T-to-C and TT-to-CC ratios in aberrant 3'ss in this area are proposed to improve these interactions and functionally compensate their less favourable sequence context (Supplementary Figure 2A and Tables 4 and 5). The difference in the number of C−12C−11 between aberrant and authentic 3'ss (7 versus 21, χ² = 6.4, P = 0.01) suggests that this di-nucleotide does not sufficiently promote U2AF binding and that at least one uridine is required in either position for the productive interaction since the numbers of T−12C−11 or C−12T−11 were not significantly different in aberrant and authentic 3'ss (Supplementary Figure 2B). This notion is in agreement with ~80- to 100-fold inhibition of U2AF65 binding following chemical modification of the uridine N3 and O4 atoms, the only positions that differ between the two nucleosides (90). However, the CC dinucleotides in positions −11 to −13 were over-represented in authentic counterparts of de novo sites (53 versus 21, χ² = 13.7, P < 0.001) but not cryptic sites (19 versus 26, P > 0.05), suggesting that they signify natural 3'ss that compete poorly with and may be susceptible to mutation-induced 3'ss.
In contrast to cytosines, both de novo and cryptic 3'ss showed an increase of TGs/GTs between positions −10 and −13 (64 versus 40, χ² = 5.4, P < 0.05 and 86 versus 54, χ² = 7.4, P < 0.01, respectively) as compared to authentic counterparts. A relative lack of G−12T−11/T−11G−10 in authentic sites suggests that such 3'ss may compete relatively well with newly introduced 3'ss, consistent with an earlier observation that GU tracts can substitute for pyrimidine tracts (91), probably as a result of flexible side chain rearrangements of U2AF65 and/or relocation of bound water molecules (92).
Depletion of aberrant 3'ss upstream and downstream of authentic 3'ss
Distribution of the distances between aberrant and authentic 3'ss with the updated sample confirmed a previously reported (11) bias of cryptic 3'ss towards exons and de novo sites towards introns (Supplementary Figure 3A and B). Major frequency peaks for cryptic and de novo 3'ss were 8 and 10 nt from authentic 3'ss, respectively (median distances in each category of aberrant 3'ss are in Table 1). In addition, a relative depletion of both cryptic and de novo 3'ss emerged further upstream and downstream. A lack of cryptic 3'ss upstream is apparently due to AG depletion (11), although cryptic 3'ss activation may also be prevented by spliceosomal complexes assembled around the branch site. The latter explanation is likely to account for the observed depletion of de novo 3'ss, which is more upstream as compared to cryptic sites (~50 nt, Supplementary Figure 3B).
Smaller areas of depletion for cryptic 3'ss 30–40 nt downstream of authentic 3'ss and ~20 nt downstream for de novo sites were followed by a second peak at 50–60 nt. The exonic depletion may be explained by a lack of suitable alternative BP adenosines within an optimal distance from de novo 3'ss, cross-exon interactions, selection against codons carrying AGs or a combination of these factors. In contrast to asymmetric distribution of cryptic and de novo 3'ss, the frequency plot of all aberrant 3'ss was virtually symmetric, with a median distance of just 1 nt from authentic 3'ss (Table 1 and data not shown). Finally, the observed frequency distribution suggests that aberrant 3'ss retaining the BPS and PPT of their authentic counterparts may be more frequent than those that use a new BPS-PPT-3'AG unit.
DBASS3: a database of aberrant 3'ss
Nucleotide sequences of all aberrant 3'ss were compiled in a new online resource available at http://www.dbass3.soton.ac.uk/. The DBASS3 web interface provides access to the database through the search option. The user can search DBASS3 by phenotype, gene designation, mutation, location of aberrant 3'ss and their distance from authentic 3'ss. Aberrant 3'ss generated in terminal exons can also be easily retrieved. In cases in which a search identifies more than one database entry, the results page displays the gene, phenotype and location of aberrant 3'ss for all corresponding hits. The user can then choose details pages that show nucleotide sequences flanking the authentic and cryptic 3'ss, literature references with PubMed links and the estimated strength of each splice site for the tested algorithms. In addition, the details page shows how aberrant 3'ss change the reading frame of each transcript (0, +1 and +2 nt). DBASS3 visitors can also submit published data to the corresponding author and receive regular updates by email. Potential applications of DBASS3 include the optimization of computational tools for prediction of aberrant splice sites, detection of introns or exons that are frequently involved in aberrant splicing, identification of splicing mutations and aberrant 3'ss in a gene or phenotype of interest, and investigating basic mechanisms of 3'ss selection.
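The reading-frame annotation shown on the details pages (0, +1 and +2 nt) can be derived from the signed distance between the aberrant and authentic 3'ss. The sketch below assumes a sign convention that is not stated in the text: negative distances denote aberrant sites upstream in the intron (intronic nucleotides are added to the mRNA) and positive distances denote sites downstream in the exon (exonic nucleotides are lost). The function name is hypothetical.

```python
def frame_change(distance):
    """Net change in mRNA length modulo 3 caused by using an aberrant 3'ss.

    distance < 0: site lies |distance| nt upstream, so that many nt are inserted.
    distance > 0: site lies that many nt downstream, so that many nt are deleted.
    """
    inserted = -distance
    return inserted % 3  # Python's % always returns 0, 1 or 2 for a positive modulus

examples = {d: frame_change(d) for d in (-10, -3, 1, 8)}
```

A value of 0 means that the reading frame is preserved, whereas 1 or 2 indicates a frameshift.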
DISCUSSION
A high proportion of AG-creating mutations activating aberrant 3'ss
This study is the first to provide a detailed survey of mutations leading to aberrant 3'ss. It showed that the distribution of single-nucleotide substitutions roughly reflected the degree of conservation of consensus sequences that define 3'ss (Figure 1) and revealed a high proportion of mutations creating the 3'AG consensus (Table 1). The observed frequency of AG-creating mutations (42%) was considerably higher than the estimated ~13% in the initial analysis of splicing mutations (47). Only ~5% (n = 11, Table 2) of these mutations failed to activate de novo 3'ss in situ and instead induced one or more aberrant 3'ss upstream (36,70,93–96) or downstream (62,97,98) of the newly introduced AGs. These mutations were in position −3 (36,93), −9 (62,98), −10 (96), −14 (70), −15 (97), −17 (95) and −24 (94) relative to authentic 3'ss (Supplementary Table 1). Mutations in positions −3 and −24 directly inactivated 3'YAG and BPS, respectively, but the remaining AG-creating mutations were all in AG-exclusion zones downstream of the BPS. The distance between the predicted BP adenosine and the new 3'AG was 9–20 nt (Supplementary Table 1). Aberrant 3'ss with BP-new AG distances between 9 and 16 nt were either in exons or upstream of the BPS, and new AGs were never selected as 3'ss, consistent with protein complexes bound to the ~19 nt region downstream of the BP (99). In the FALDH gene (70), this distance was 20 nt and a normally silent AG located 9 nt downstream of the BPS was activated by the newly created AG a further 11 nt downstream. However, this putative exception can be explained by inefficient recognition of the new 3'AG, which was preceded by G, unlike the remaining aberrant 3'ss (Supplementary Table 1). Alternatively, selection of aberrant 3'ss in this FALDH intron can be explained by almost identical BPS sequences arranged in tandem, with the upstream BP at the optimal distance (18 nt) from the aberrant 3'ss. In contrast, wild-type AGs 6 and 7 nt downstream of the predicted BP were not selected (36,98). Although the location of AG exclusion zones is likely to be substrate-dependent, these data suggest that the average zone is between ~7 and ~19 nt downstream of the BP adenosine, consistent with previous studies of intervening AGs (11,19,21,99).
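The notion of an AG exclusion zone can be made concrete with a short Python sketch that lists every AG dinucleotide downstream of a given BP adenosine and flags those falling within the ~7 to ~19 nt window suggested above. This is an illustration under stated assumptions, not the procedure used in this study: the function name is hypothetical, the BP position must be supplied by the user, and the distance is counted from the BP adenosine to the A of each AG, which is only one of several possible conventions.

```python
def ags_downstream_of_bp(seq, bp_index, zone=(7, 19)):
    """List AG dinucleotides downstream of a branch point adenosine.

    seq      : pre-mRNA or DNA sequence (U is treated as T)
    bp_index : 0-based index of the predicted BP adenosine (user-supplied)
    zone     : assumed AG exclusion zone in nt downstream of the BP
    Returns tuples (index_of_A, distance_from_BP, inside_zone).
    """
    seq = seq.upper().replace("U", "T")
    hits = []
    pos = seq.find("AG", bp_index + 1)
    while pos != -1:
        distance = pos - bp_index  # nt from the BP adenosine to the A of AG
        hits.append((pos, distance, zone[0] <= distance <= zone[1]))
        pos = seq.find("AG", pos + 1)
    return hits


if __name__ == "__main__":
    # Toy intron fragment: BPS consensus TACTAAC with the BP adenosine at index 5.
    toy = "TACTAAC" + "T" * 8 + "AG" + "T" * 12 + "AG"
    for hit in ags_downstream_of_bp(toy, bp_index=5):
        print(hit)
```

In the toy sequence the first AG lies 10 nt from the BP adenosine and is flagged as inside the zone, whereas the second lies 24 nt away and is not.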
Selection of cryptic 3'ss upstream of BPS
If 3'ss are selected by unidirectional scanning for 3'YAG downstream of the BPS (91), why are so many cryptic 3'ss upstream of the predicted BPS used in vivo? Inspection of downstream exonic sequences in 29 cases of intronic cryptic 3'ss (Table 2) showed that eight were in terminal introns (67,100–106) (Table 1), which was significantly more frequent (χ² = 5.6, P = 0.018) than for the remaining categories of aberrant 3'ss (Table 1), one was activated in a downstream intron (107) and two were associated with cryptic 3'ss in the following exon (108). Of the remaining sites, 13 cases either completely lacked exonic 3'YAG consensus in the context of four or more upstream pyrimidines or contained this consensus only in the last 20 nt of the exon (2,65,66,72,93,109–116). These 3'YAGs are unlikely to be used as 3'ss given inefficient inclusion of very small exons in mRNA (117) and a typical recognition site of RRM of ~4–7 nt [(87) and references therein]. This strongly suggests that the choice of upstream 3'ss is influenced by the availability of 3'YAGs in the downstream exon and their distance from the exon end, and is consistent with unidirectional scanning that is inefficient in terminal exons. It is therefore possible that a new, competing BPS-PPT-3'AG unit is selected after the initial scanning of the downstream exon for AGs is completed. However, there has been no obvious reason for using upstream 3'ss in at least some of the remaining introns (36,118,119). These rare cases and similar examples identified in the future might provide interesting insights into cellular mechanisms that discriminate between authentic 3'ss and pseudo-acceptors.
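The inspection of downstream exons described above can be approximated with a small Python sketch. It searches an exon for 3'YAG preceded by four pyrimidines and reports how far each candidate junction lies from the exon end. Several details are assumptions made for illustration only: the function name is hypothetical, the four upstream pyrimidines are required to be contiguous and immediately adjacent to YAG, and the 20 nt threshold is applied to the exon segment remaining downstream of the candidate junction.

```python
import re


def exonic_yag_candidates(exon, min_remaining=20):
    """Find 3'YAG motifs preceded by four contiguous pyrimidines in an exon.

    Returns tuples (junction_index, remaining_exon_nt, plausible), where
    junction_index is the 0-based index of the first nt after the AG and
    plausible is True if more than min_remaining nt of exon would be left.
    """
    exon = exon.upper().replace("U", "T")
    candidates = []
    # Zero-width lookahead so that overlapping motifs are all reported.
    for match in re.finditer(r"(?=[CT]{4}[CT]AG)", exon):
        junction = match.start() + 7  # 4 pyrimidines + Y + A + G
        remaining = len(exon) - junction
        candidates.append((junction, remaining, remaining > min_remaining))
    return candidates


if __name__ == "__main__":
    early = "GGA" + "TTTTCAG" + "G" * 30  # YAG far from the exon end
    late = "G" * 30 + "CCTTTAG" + "GGGGG"  # YAG within the last 20 nt
    print(exonic_yag_candidates(early))
    print(exonic_yag_candidates(late))
```

Under these assumptions, an exon that returns no candidate flagged as plausible would fall into the category of exons that lack a usable downstream 3'YAG.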
Random distribution of the reading frames in transcripts that use aberrant 3'ss
Aberrant splicing often results in transcripts containing premature termination codons (PTCs). Such transcripts are downregulated by nonsense-mediated RNA decay (NMD), which degrades PTC-containing mRNAs whose translation may be deleterious for the cell (120). Whereas EST databases over-represent alternative splicing events that maintain the reading frame (121), neither cryptic 5'ss (10) nor aberrant 3'ss (Table 1, χ² = 8.2, 6 d.f., P = 0.2) (11) showed any bias against splice sites involving a frameshift with respect to the authentic sites, even though many mRNAs frameshifted by +1 and +2 nt would be expected to trigger NMD. These results can be explained by a great reduction of RNA downregulation in response to a PTC in transcripts containing PPT Y-to-R mutations that reduced splicing (122). In addition, NMD usually does not completely eliminate RNAs with PTCs and the activated cryptic sites that result in frameshifts can still be detected with RT-PCR, a method used by the authors of most DBASS3 records.
The maximum entropy model as a method of choice for predicting aberrant 3'ss
This study demonstrated that the ability of current computational tools to predict utilization of aberrant 3'ss is influenced by their localization and the underlying mutation. The best overall model discriminating authentic and aberrant 3'ss was the ME model, validating previous predictions based on comparisons of genuine 3'ss and pseudo-acceptors (28). The ME model outperformed the remaining algorithms for each category of aberrant 3'ss and, together with the MM model, was the only method that could separate authentic from de novo 3'ss in introns at a significance level <0.01. Since none of the tested tools discriminated between de novo 3'ss in exons and their authentic counterparts (Table 5), these aberrant 3'ss were tested with additional algorithms, including NetGene2 (25,123), available at http://genome.cbs.dtu.dk/services/NetGene2/, and the ASSP method (alternative splice site predictor; http://es.embnet.org/~mwang/assp.html) (124). NetGene2 considers more distant features that include global coding information and distances between potential splice sites, whereas ASSP is based on two neural networks pre-processed by position-specific matrix scores. However, neither method revealed a difference for this category of aberrant 3'ss.
Although this study is the first to focus on 3'ss utilized in vivo as opposed to previous comparisons with pseudo-sites, this approach has limitations. First, even though each aberrant 3'ss was confirmed by sequencing, aberrant splicing was reliably and accurately quantified only in a subset of case reports and was highly variable from mutation to mutation, ranging from a few per cent to one hundred per cent utilization. This could be improved in future case reports and, as DBASS3 submissions permit inclusion of this information in future database records, taken into account in subsequent analyses. Second, despite the cell-specific nature of alternative splicing, measurements of aberrant and authentic RNA products have been obtained largely for blood leukocytes and only rarely for other cell types. Even with these limitations, future updates of DBASS3 may provide valuable insights into nucleotide dependencies between individual positions and distribution of trinucleotides that were significantly favoured or avoided upstream of authentic 3'ss as compared to pseudo-sites (75), as well as other motifs.
| CONCLUSIONS |
|---|
|
|
|---|
This work showed that (i) almost one half of aberrant 3'ss resulted from AG-creating mutations and from the introduction of guanosine, a virtually invariant nucleotide in both terminal positions of U2-dependent introns; (ii) the higher frequency of transitions over transversions observed for both positions of 3'AG can be attributed to relative di-nucleotide mutability rates rather than a detection bias resulting from a differential splicing efficiency of mutated 3'AGs; (iii) purine transitions leading to de novo sites in introns were more frequent than for de novo sites in exons; (iv) the maximum entropy model was the best model discriminating authentic and mutation-induced aberrant 3'ss used in vivo; (v) authentic counterparts of de novo 3'ss were intrinsically weak; (vi) the nucleotide sequence upstream of aberrant 3'ss had a higher purine content than corresponding authentic sites, particularly in position −3; (vii) as with authentic sites, aberrant 3'ss showed positive associations at −3 with upstream positions that may result from functional compensation of weaker interactions of U2AF with 3'TAG by stronger interactions with PPT uridines around position −11 and with more optimal BPS; (viii) the extreme rarity of AGs between positions −6 and −15 in authentic 3'ss (75,84) was violated in aberrant 3'ss, particularly 5–9 nt upstream of new intron/exon junctions; (ix) although uridines were generally under-represented upstream of aberrant 3'ss, they maintained their high numbers at position −11 and flanking nt for predicted interaction with U2AF65 or other PPT-binding proteins; (x) in this region, aberrant 3'ss had higher T-to-C and TT-to-CC ratios, required a complete lack of AGs, but tolerated more guanosines and UG/GU dinucleotides than authentic sites.
Finally, the development and maintenance of DBASS3 will facilitate prediction of cryptic or de novo 3'ss in mutated disease genes, identification of introns or exons that are frequently involved in aberrant splicing, structural dissection of interactions leading to selection of 3'ss in vivo, and refinement of computational methods that estimate the splice site strength.
| SUPPLEMENTARY DATA |
|---|
|
|
|---|
Supplementary Data are available at NAR Online.
| ACKNOWLEDGEMENTS |
|---|
The author thanks Dr Christopher Smith, University of Cambridge, and Dr Ellen Copson, University of Southampton, for critical reading of the manuscript and helpful comments. The author also thanks Martin Chivers and Raj Sood for excellent technical help. This study was supported by the Juvenile Diabetes Research Foundation International (1-2006-263). The Open Access publication charges for this article were waived by Oxford University Press.
Conflict of interest statement. None declared.
| REFERENCES |
|---|
|
|
|---|
- Teraoka, S.N., Telatar, M., Becker-Catania, S., Liang, T., Onengut, S., Tolun, A., Chessa, L., Sanal, Ö., Bernatowska, E., Gatti, R.A., et al. (1999) Splicing defects in the ataxia-telangiectasia gene, ATM: underlying mutations and consequences. Am. J. Hum. Genet., 64, 1617–1631.
- Ars, E., Serra, E., Garcia, J., Kruyer, H., Gaona, A., Lazaro, C., Estivill, X. (2000) Mutations affecting mRNA splicing are the most common molecular defects in patients with neurofibromatosis type 1. Hum. Mol. Genet., 9, 237–247.
- Lopez-Bigas, N., Audit, B., Ouzounis, C., Parra, G., Guigo, R. (2005) Are splicing mutations the most frequent cause of hereditary disease? FEBS Lett., 579, 1900–1903.
- Krawczak, M., Reiss, J., Cooper, D.N. (1992) The mutational spectrum of single base-pair substitutions in mRNA splice junctions of human genes: causes and consequences. Hum. Genet., 90, 41–54.
- Nakai, K. and Sakamoto, H. (1994) Construction of a novel database containing aberrant splicing mutations of mammalian genes. Gene, 141, 171–177.
- Cooper, T.A. and Mattox, W. (1997) The regulation of splice-site selection, and its role in human disease. Am. J. Hum. Genet., 61, 259–266.
- Nissim-Rafinia, M. and Kerem, B. (2002) Splicing regulation as a potential genetic modifier. Trends Genet., 18, 123–127.
- Gouya, L., Puy, H., Robreau, A.M., Bourgeois, M., Lamoril, J., Da Silva, V., Grandchamp, B., Deybach, J.C. (2002) The penetrance of dominant erythropoietic protoporphyria is modulated by expression of wildtype FECH. Nature Genet., 30, 27–28.
- Královičová, J., Gaunt, T.R., Rodriguez, S., Wood, P.J., Day, I.N.M., Vořechovský, I. (2006) Variants in the human insulin gene that affect pre-mRNA splicing: is -23HphI a functional single nucleotide polymorphism at IDDM2? Diabetes, 55, 260–264.
- Roca, X., Sachidanandam, R., Krainer, A.R. (2003) Intrinsic differences between authentic and cryptic 5' splice sites. Nucleic Acids Res., 31, 6321–6333.
- Královičová, J., Christensen, M.B., Vořechovský, I. (2005) Biased exon/intron distribution of cryptic and de novo 3' splice sites. Nucleic Acids Res., 33, 4882–4898.
- Berglund, J.A., Chua, K., Abovich, N., Reed, R., Rosbash, M. (1997) The splicing factor BBP interacts specifically with the pre-mRNA branchpoint sequence UACUAAC. Cell, 89, 781–787.
- Ruskin, B., Zamore, P.D., Green, M.R. (1988) A factor, U2AF, is required for U2 snRNP binding and splicing complex assembly. Cell, 52, 207–219.
- Singh, R., Valcárcel, J., Green, M.R. (1995) Distinct binding specificities and functions of higher eukaryotic polypyrimidine tract-binding proteins. Science, 268, 1173–1176.
- Merendino, L., Guth, S., Bilbao, D., Martinez, C., Valcárcel, J. (1999) Inhibition of msl-2 splicing by Sex-lethal reveals interaction between U2AF35 and the 3' splice site AG. Nature, 402, 838–841.
- Wu, S., Romfo, C.M., Nilsen, T.W., Green, M.R. (1999) Functional recognition of the 3' splice site AG by the splicing factor U2AF35. Nature, 402, 832–835.
- Zorio, D.A. and Blumenthal, T. (1999) Both subunits of U2AF recognize the 3' splice site in Caenorhabditis elegans. Nature, 402, 835–838.
- Reed, R. (1989) The organization of 3' splice-site sequences in mammalian introns. Genes Dev., 3, 2113–2123.
- Smith, C.W., Chu, T.T., Nadal-Ginard, B. (1993) Scanning and competition between AGs are involved in 3' splice site selection in mammalian introns. Mol. Cell. Biol., 13, 4939–4952.
- Reed, R. and Maniatis, T. (1988) The role of the mammalian branchpoint sequence in pre-mRNA splicing. Genes Dev., 2, 1268–1276.
- Kol, G., Lev-Maor, G., Ast, G. (2005) Human-mouse comparative analysis reveals that branch-site plasticity contributes to splicing regulation. Hum. Mol. Genet., 14, 1559–1568.
- Gooding, C., Clark, F., Wollerton, M., Grellscheid, S.-N., Groom, H., Smith, C.W. (2006) A class of human exons with predicted distant branch points revealed by analysis of AG dinucleotide exclusion zones. Genome Biol., 7, R1.
- Shapiro, M.B. and Senapathy, P. (1987) RNA splice junctions of different classes of eukaryotes: sequence statistics and functional implications in gene expression. Nucleic Acids Res., 15, 7155–7174.
- Senapathy, P., Shapiro, M.B., Harris, N.L. (1990) Splice junctions, branch point sites, and exons: sequence statistics, identification, and applications to genome project. Methods Enzymol., 183, 252–278.
- Brunak, S., Engelbrecht, J., Knudsen, S. (1991) Prediction of human mRNA donor and acceptor sites from the DNA sequence. J. Mol. Biol., 220, 49–65.
- Reese, M.G., Eeckman, F.H., Kulp, D., Haussler, D. (1997) Improved splice site detection in Genie. J. Comput. Biol., 4, 311–323.
- Rogan, P.K., Faux, B.M., Schneider, T.D. (1998) Information analysis of human splice site mutations. Hum. Mutat., 12, 153–171.
- Yeo, G. and Burge, C.B. (2004) Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J. Comput. Biol., 11, 377–394.
- Thanaraj, T.A. (2000) Positional characterisation of false positives from computational prediction of human splice sites. Nucleic Acids Res., 28, 744–754.
- Hiller, M., Huse, K., Szafranski, K., Jahn, N., Hampe, J., Schreiber, S., Backofen, R., Platzer, M. (2004) Widespread occurrence of alternative splicing at NAGNAG acceptors contributes to proteome plasticity. Nature Genet., 36, 1255–1257.
- Hiller, M., Huse, K., Szafranski, K., Jahn, N., Hampe, J., Schreiber, S., Backofen, R., Platzer, M. (2006) Single-nucleotide polymorphisms in NAGNAG acceptors are highly predictive for variations of alternative splicing. Am. J. Hum. Genet., 78, 291–302.
- Bendig, I., Mohr, N., Krämer, F., Weber, B.H. (2004) Identification of novel TP53 mutations in familial and sporadic cancer cases of German and Swiss origin. Cancer Genet. Cytogenet., 154, 22–26.
- Newman, P.J., Seligsohn, U., Lyman, S., Coller, B.S. (1991) The molecular genetic basis of Glanzmann thrombasthenia in the Iraqi-Jewish and Arab populations in Israel. Proc. Natl Acad. Sci. USA, 88, 3160–3164.
- Eng, L., Coutinho, G., Nahas, S., Yeo, G., Tanouye, R., Babaei, M., Dork, T., Burge, C., Gatti, R.A. (2004) Nonclassical splicing mutations in the coding and noncoding regions of the ATM gene: maximum entropy estimates of splice junction strengths. Hum. Mutat., 23, 67–76.
- Chen, L.L., Sabripour, M., Wu, E.F., Prieto, V.G., Fuller, G.N., Frazier, M.L. (2005) A mutation-created novel intra-exonic pre-mRNA splice site causes constitutive activation of KIT in human gastrointestinal stromal tumors. Oncogene, 24, 4271–4280.
- Hovnanian, A., Rochat, A., Bodemer, C., Petit, E., Rivers, C.A., Prost, C., Fraitag, S., Christiano, A.M., Uitto, J., Lathrop, M., et al. (1997) Characterization of 18 new mutations in COL7A1 in recessive dystrophic epidermolysis bullosa provides evidence for distinct molecular mechanisms underlying defective anchoring fibril formation. Am. J. Hum. Genet., 61, 599–610.
- Abramowicz, M.J., Targovnik, H.M., Varela, V., Cochaux, P., Krawiec, L., Pisarev, M.A., Propato, F.V., Juvenal, G., Chester, H.A., Vassart, G. (1992) Identification of a mutation in the coding sequence of the human thyroid peroxidase gene causing congenital goiter. J. Clin. Invest., 90, 1200–1204.
- Ejima, Y., Yang, L., Sasaki, M.S. (2000) Aberrant splicing of the ATM gene associated with shortening of the intronic mononucleotide tract in human colon tumor cell lines: a novel mutation target of microsatellite instability. Int. J. Cancer, 86, 262–268.
- Boot, R.G., Renkema, G.H., Verhoek, M., Strijland, A., Bliek, J., de Meulemeester, T.M., Mannens, M.M., Aerts, J.M. (1998) The human chitotriosidase gene. Nature of inherited enzyme deficiency. J. Biol. Chem., 273, 25680–25685.
- Webb, J.C., Patel, D.D., Shoulders, C.C., Knight, B.L., Soutar, A.K. (1996) Genetic variation at a splicing branch point in intron 9 of the low density lipoprotein (LDL)-receptor gene: a rare mutation that disrupts mRNA splicing in a patient with familial hypercholesterolaemia and a common polymorphism. Hum. Mol. Genet., 5, 1325–1331.
- Ohno, K., Tsujino, A., Shen, X.M., Milone, M., Engel, A.G. (2005) Spectrum of splicing errors caused by CHRNE mutations affecting introns and intron/exon boundaries. J. Med. Genet., 42, e53.
- Fisher, C.W., Lau, K.S., Fisher, C.R., Wynn, R.M., Cox, R.P., Chuang, D.T. (1991) A 17-bp insertion and a Phe215Cys missense mutation in the dihydrolipoyl transacylase (E2) mRNA from a thiamine-responsive maple syrup urine disease patient WG-34. Biochem. Biophys. Res. Commun., 174, 804–809.
- Li, S.S., Tseng, H.M., Yang, T.P., Liu, C.H., Teng, S.J., Huang, H.W., Chen, L.M., Kao, H.W., Chen, J.H., Tseng, J.N., et al. (1999) Molecular characterization of germline mutations in the BRCA1 and BRCA2 genes from breast cancer families in Taiwan. Hum. Genet., 104, 201–204.
- Stasia, M.J., Bordigoni, P., Martel, C., Morel, F. (2002) A novel and unusual case of chronic granulomatous disease in a child with a homozygous 36-bp deletion in the CYBA gene (A220) leading to the activation of a cryptic splice site in intron 4. Hum. Genet., 110, 444–450.
- Podkrajšek, K.T., Bratanič, N., Kržišnik, C., Battelino, T. (2005) Autoimmune regulator-1 messenger ribonucleic acid analysis in a novel intronic mutation and two additional novel AIRE gene mutations in a cohort of autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy patients. J. Clin. Endocrinol. Metab., 90, 4930–4935.
- Smyth, I., Wicking, C., Wainwright, B., Chenevix-Trench, G. (1998) The effects of splice site mutations in patients with naevoid basal cell carcinoma syndrome. Hum. Genet., 102, 598–601.
- Cooper, D.N. and Krawczak, M. (1993) Human Gene Mutation. BIOS Scientific Publishers, Oxford.
- Krawczak, M. and Cooper, D.N. (1996) Single base-pair substitutions in pathology and evolution: two sides to the same coin. Hum. Mutat., 8, 23–31.
- Parker, R. and Siliciano, P.G. (1993) Evidence for an essential non-Watson–Crick interaction between the first and last nucleotides of a nuclear pre-mRNA intron. Nature, 361, 660–662.
- Dietrich, R.C., Fuller, J.D., Padgett, R.A. (2005) A mutational analysis of U12-dependent splice site dinucleotides. RNA, 11, 1430–1440.
- Deirdre, A., Scadden, J., Smith, C.W. (1995) Interactions between the terminal bases of mammalian introns are retained in inosine-containing pre-mRNAs. EMBO J., 14, 3236–3246.
- Gaur, R.K., Beigelman, L., Haeberli, P., Maniatis, T. (2000) Role of adenine functional groups in the recognition of the 3'-splice-site AG during the second step of pre-mRNA splicing. Proc. Natl Acad. Sci. USA, 97, 115–120.
- Weaving, L.S., Christodoulou, J., Williamson, S.L., Friend, K.L., McKenzie, O.L., Archer, H., Evans, J., Clarke, A., Pelka, G.J., Tam, P.P., et al. (2004) Mutations of CDKL5 cause a severe neurodevelopmental disorder with infantile spasms and mental retardation. Am. J. Hum. Genet., 75, 1079–1093.
- Bonnevie-Nielsen, V., Leigh Field, L., Lu, S., Zheng, D.J., Li, M., Martensen, P.M., Nielsen, T.B., Beck-Nielsen, H., Lau, Y.L., Pociot, F. (2005) Variation in antiviral 2',5'-oligoadenylate synthetase (2'5'AS) enzyme activity is controlled by a single-nucleotide polymorphism at a splice-acceptor site in the OAS1 Gene. Am. J. Hum. Genet., 76, 623–633.
- Chavanas, S., Bodemer, C., Rochat, A., Hamel-Teillac, D., Ali, M., Irvine, A.D., Bonafe, J.L., Wilkinson, J., Taieb, A., Barrandon, Y., et al. (2000) Mutations in SPINK5, encoding a serine protease inhibitor, cause Netherton syndrome. Nature Genet., 25
