Nucleic Acids Research 2006 34(Database Issue):D46-D55; doi:10.1093/nar/gkj031

Nucleic Acids Research, 2006, Vol. 34, Database issue D46-D55
© The Author 2006. Published by Oxford University Press. All rights reserved
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions{at}oxfordjournals.org


Article

ASD: a bioinformatics resource on alternative splicing

Stefan Stamm1, Jean-Jack Riethoven, Vincent Le Texier, Chellappa Gopalakrishnan, Vasudev Kumanduri, Yesheng Tang1, Nuno L. Barbosa-Morais2 and Thangavel Alphonse Thanaraj*

European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
1University of Erlangen, Institute for Biochemistry, Fahrstrasse 17, 91054 Erlangen, Germany
2Faculty of Medicine, Institute of Molecular Medicine, University of Lisbon, 1649-028 Lisbon, Portugal

*To whom correspondence should be addressed. Tel: +44 1223 494650; Fax: +44 1223 494468; Email: thanaraj{at}ebi.ac.uk

Received August 23, 2005. Revised September 22, 2005. Accepted September 22, 2005.


    ABSTRACT
 
Alternative splicing is an important regulatory mechanism of mammalian gene expression. The alternative splicing database (ASD) consortium is systematically collecting and annotating data on alternative splicing. We present the continuation and upgrade of the ASD [T. A. Thanaraj, S. Stamm, F. Clark, J. J. Riethoven, V. Le Texier, J. Muilu (2004) Nucleic Acids Res. 32, D64–D69] that consists of computationally and manually generated data. Its largest part is AltSplice, a value-added database of computationally delineated alternative splicing events. Its data include alternatively spliced introns/exons, events, isoform splicing patterns and isoform peptide sequences. AltSplice data are generated by examining gene-transcript alignments. The data are annotated for various biological features including splicing signals, expression states, single nucleotide polymorphism (SNP)-mediated splicing and cross-species conservation. AEdb forms the manually curated component of ASD. It is a literature-based data set containing sequence and properties of alternatively spliced exons, functional enumeration of observed splicing events, characterization of observed splicing regulatory elements, and a collection of experimentally clarified minigene constructs. ASD includes a workbench, which is an analysis tool that enables users to carry out splicing-related analyses such as characterization of introns for various splicing signals, identification of splicing regulatory elements on a given RNA sequence, prediction of putative exons and prediction of putative translation start codons. The different ASD modules are integrated and can be accessed through user-friendly interfaces and visualization tools. ASD data have been integrated with the Ensembl genome annotation project as a Distributed Annotation System (DAS) resource and can be viewed on the Ensembl genome browser. The ASD resource is available at http://www.ebi.ac.uk/asd.


    INTRODUCTION
 
Alternative pre-mRNA splicing is emerging as one of the most important mechanisms to control eukaryotic gene expression. Recent array data indicate that as much as 76% of genes generate alternatively spliced products (1). Alternative splicing regulates numerous aspects of protein function, such as binding properties, intracellular localization, enzymatic activity, stability and post-translational modifications. Reports in literature indicate that protein isoforms generated by alternative splicing show in most cases only subtle differences. However, in some cases alternative splicing can lead to large functional differences, e.g. by generating dominant negative isoforms (2). Finally, 10–15% of genes could be switched off due to the coupling between nonsense-mediated decay and alternative splicing (3). This indicates that alternative splicing controls both transcript composition and abundance.

Despite intense research, the mechanisms leading to splice site selection are not fully understood. Currently, it is not possible to accurately predict alternative exons from genomic sequences. It is not at all possible to predict the tissue or developmental expression profile of an alternative exon. The major obstacle to accurate prediction is the lack of conservation in the regulatory sequences of the pre-mRNA, which can only be described by consensus sequences or expectation matrices. However, in vivo, alternative exons are recognized and regulated with high fidelity, because numerous proteins bind to pre-mRNA and help in exon recognition. Due to the combination of multiple weak protein–protein and protein–RNA interactions, alternative exons can be faithfully recognized (4,5). The importance of proper splice site recognition is underlined by an increasing number of diseases that are caused by or associated with the selection of wrong splice sites (6,7).

Alternative splicing events have been compiled previously in different databases [reviewed in (8)]. Here, we present the continuation and upgrade of the alternative splicing database (ASD) [the previous version was reported earlier in (9)] as a bioinformatics resource that integrates data on alternative splicing, derived from computational as well as literature based approaches, and bioinformatics analysis tools.


    ASD
 
The different component data sets of ASD
Data generated from computational approaches and data reported in the literature are the two major sources for databases on alternative splicing. The major advantage of data from computational approaches, such as EST comparison, is the large size of the data sets. However, these data sets lack biological information about the alternative splicing events. In contrast, data sets derived from the literature contain biologically relevant information, but these data sets are smaller. In order to combine these two approaches, we carried out two activities namely, (i) we developed a computational pipeline (AltSplice) that generates genome-wide value-added data on alternative exons, and (ii) we developed a procedure (AEdb) that manually collects data on alternative exons from literature. In addition, data on motifs, functions and minigenes described in the literature were collected in databases. We then built an integrated database (Figure 1) from these heterogeneous data resources. The integrated database is named ASD for which we developed query interfaces that are flexible enough to handle the heterogeneity. A single-query bar provides a quick access to all of ASD data, allowing retrieval of data using keyword search or sequence comparison searches. In addition, each data set can be queried using a data set-specific interface. Finally, all data sets can be downloaded as flat file distributions.



 
Figure 1 Structure of ASD. We used computational pipelines and manual curation (top, pink) to create the modular databases of ASD (middle, blue). The individual databases are integrated and cross-linked, and are available through a variety of interface tools (bottom, sky blue). Currently the databases are the computer-generated AltSplice and the manually curated AEdb-Sequence, AEdb-Motif, AEdb-Function and AEdb-Minigene databases. The ASD data are integrated with the Ensembl genome annotation system and are visible from the Ensembl genome browser. Publicly available databases on alternative splicing are accessible from ASD interfaces (top right, dark green). The databases are connected to analysis tools that are collected in the ASD workbench (middle right, grey).

 
The statistics on different data sets of the ASD are listed in Table 1.


 
Table 1 Statistics on ASD data

 
AltSplice
AltSplice data are generated by an automated computational pipeline. The basal data include transcript-confirmed introns/exons, alternative splicing events and isoform splicing patterns. The basal data are generated through computational comparison of EST/mRNA alignments with genomic sequences [see (9,10) for details on the computational methods]. The data are annotated for biological features (such as splice site characteristics, expression state of the isoforms, allele usage at SNP positions, conservation of intron/exon events across species and peptide isoforms) by various computational modules that are part of the AltSplice pipeline. AltSplice data (Table 1) indicate that up to 61% of human genes (and 50% of mouse genes) undergo alternative splicing. The available transcriptome data indicate that, on average, 3.9 isoform splicing patterns can be expressed from a single human gene. Cassette exon events outnumber the other event types, and one in every four cassette exon events is accompanied by extension/truncation of the flanking exons. The number of human-mouse orthologous gene pairs (with data on alternative splicing) present in AltSplice is around 5200, and such a large data set is valuable for studies on the evolution of alternative splicing. Finally, AltSplice presents data on isoform peptide sequences for around 4000 human genes, and such a large data set of variant peptides is valuable for studies aimed at deciphering splicing-mediated functional and structural changes in proteins.
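As a rough illustration of how a cassette exon event can be read off isoform splicing patterns, the sketch below represents each pattern as a list of (start, end) exon coordinates and reports internal exons whose two flanking exons are joined directly in some other pattern. This is a simplified assumption made for illustration, not the AltSplice pipeline itself; the function name and the coordinates are hypothetical.

```python
from typing import List, Set, Tuple

Exon = Tuple[int, int]  # (start, end) in gene coordinates


def cassette_exons(patterns: List[List[Exon]]) -> Set[Exon]:
    """Return internal exons that one pattern includes and another skips.

    An exon counts as a cassette exon here when some other pattern joins
    its two flanking exons directly (they are adjacent in that pattern).
    """
    found: Set[Exon] = set()
    for pattern in patterns:
        # Walk over every internal exon together with its two neighbours.
        for left, middle, right in zip(pattern, pattern[1:], pattern[2:]):
            for other in patterns:
                # Adjacent exon pairs (junctions) of the other pattern.
                junctions = set(zip(other, other[1:]))
                if (left, right) in junctions:
                    found.add(middle)
    return found


# Hypothetical gene with two isoform splicing patterns.
isoform_a = [(1, 100), (201, 300), (401, 500)]
isoform_b = [(1, 100), (401, 500)]
print(cassette_exons([isoform_a, isoform_b]))
```

Because the flanking exons must match exactly, this sketch would miss the cassette exon events that are accompanied by extension or truncation of the flanking exons; handling those would require comparing exon boundaries with some tolerance.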

AEdb-Sequence
AEdb-Sequence is a literature based, manually curated database of alternative exons. We used ‘alternative splicing’ as a keyword to search PubMed bibliography data and collected information on the following features from the resultant research articles: organism, splicing mechanism, tissue-specificity, regulation during development stages, disease association, regulatory features of the exon and the sequence of the alternatively spliced exon as well as its flanking constitutive exons. It is seen that more than half the number of AEdb-Sequence entries are from human (Table 1). As is in the case of AltSplice data, cassette exon events outnumber other event types. The data set reports splicing events that are specific to cellular states, such as tissue type, development stage and disease state. Roughly 10% of the entries report events that introduce premature stop codons and this data set can serve the studies on nonsense mediated decay of transcripts. Finally, 10% of the reported exons are from non-coding regions of the genes.

AEdb-Function
The function database is a literature based, manually curated database of known functions of the alternative exons. Functional differences between the protein isoforms generated by alternative splicing are enumerated from the literature and are organized into 11 well-defined categories, such as ‘Modulation of protein interaction’ or ‘Internal structural change’ (Table 1). An analysis of the function of alternative exons based on this data set has been published previously (2).

AEdb-Motif
Alternative splicing site selection is partially regulated by weak binding of proteins to highly degenerate regulatory sequences. As a first attempt to understand the combinatorial control behind this regulation, we collected splicing regulatory motifs described in literature and expanded upon the previous collections of intronic regulatory sequences (11), exonic regulatory sequences (12,13) and disease-causing mutations (6). The collection reports 153 enhancer sequences and 81 silencer sequences (Table 1). The entries are annotated with value-added information, such as the experimental technique used, the nucleotide sequence of the motif, mutations that are studied and the protein that binds at the motif.

AEdb-Minigenes
A minigene is a genomic fragment that includes the alternative exon and the surrounding introns as well as the flanking constitutively spliced exons. Constructs derived by cloning the insert in a eukaryotic expression vector are increasingly used to study alternative splicing (14,15). We compiled all minigenes described in the literature. The splicing patterns and deduced regulatory sequences are represented in a graphic format. The minigene collection includes 82 entries to which a total of 97 regulatory sequences are ascribed. The reported minigene constructs representing cassette exon events outnumber those for other event types (Table 1). The minigene entries are linked to the appropriate entries in the AEdb-Sequence data set, which allows the user to quickly identify experimentally useful minigenes by searching the database.


    DATA INTEGRATION
 
Integration of data across the different data sets of ASD
AltSplice and AEdb-Sequence have been extensively integrated. Alternative exons and events that are common to AltSplice and AEdb-Sequence are identified and annotated in both databases. This allows the manually collected annotations to be associated with the computationally generated data in AltSplice. Related entries among AEdb-Sequence, -Function and -Minigenes are also identified and annotated. Table 1 shows that of the 1700 AEdb-Sequence entries, as many as 1200 are associated with AltSplice entries, and as many as 78 of the 82 entries from the AEdb-Minigenes data set are associated with the AEdb-Sequence data set.

Integration with other resources
Ensembl (16) and UniProt (17) are among the most important resources on sequence data. Both include significant data relating to alternative splicing, e.g. Ensembl reports alternative transcripts while UniProt reports curated data on isoform peptide sequences. AltSplice uses Ensembl genes as the starting gene set for deriving splicing patterns. Therefore, the AltSplice data are intrinsically associated with Ensembl annotation of alternate transcripts and related information. Data on peptide variants collected in UniProt are integrated with AltSplice and they complement the set of AltSplice-derived peptide isoform data.


    ACCESS TO DATABASES
 
ASD interfaces
ASD contains heterogeneous data sets—AltSplice is created through computational analysis of gene-transcript alignments, and AEdb is created by manual curation of literature data. Furthermore, there are differences in the extent of annotation and in the adopted vocabularies. We therefore generated interfaces that are flexible enough to handle this heterogeneity and that allow an easy retrieval of information. Different layers of interfaces are available and they provide either a single-point access to all the data sets or advanced searches of individual data sets.

Single-query bar and wrapper interfaces
Both these interfaces provide a quick access to all of ASD data. The single-query bar accepts commonly-used search terms, such as keywords, gene symbols (or their synonyms) and database cross-references. The wrapper interface queries all of ASD data against a given search term; these terms include the above-mentioned commonly used terms and splicing event type. Queries can be selectively restricted to specific sets of gene entries, such as set of human-mouse orthologous gene pairs or set of gene entries for which data on isoform peptide sequences is available or integrated set of AltSplice-AEdb entries.

Advanced search query interfaces
The individual data sets differ in terms of the type of data and annotation—e.g. AltSplice reports splicing events, splicing patterns and introns/exons while AEdb-Motif reports splicing regulatory sequences. Thus specialized query interfaces have been built for individual data sets.

Interface for AltSplice
Genes can be queried by chromosomal location, gene names and synonyms, protein keywords and database cross-references [such as EMBL (18) and UniProt accession nos (17), HUGO gene symbols (19), Gene Ontology identifiers (20) and protein identifiers], types of splicing events and allele specificity at SNP positions. Browsers allowing selection of eVOC standard vocabularies for EST library annotation (21) and GO classifiers facilitate querying through expression states and through protein function/process/location. Queries can be selectively restricted to specific sets of gene entries, such as the set of human-mouse orthologous gene pairs, the set of gene entries for which data on isoform peptide sequences are available, or the integrated set of AltSplice-AEdb entries. A particularly useful query for experimentalists is the ‘Library Subtraction Tool’, which lets users retrieve gene entries with splicing patterns that are differentially expressed in different cell states.
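The idea behind a library subtraction query can be sketched as a simple set filter: keep the splicing patterns whose supporting transcripts come from one cell state and not from another. The snippet below is only an illustrative assumption about the logic; the function name, identifiers and annotations are hypothetical and do not reflect how the ASD tool is implemented.

```python
from typing import Dict, List, Set


def subtract_libraries(
    pattern_states: Dict[str, Set[str]], present_in: str, absent_from: str
) -> List[str]:
    """List splicing patterns observed in one cell state but not another."""
    return sorted(
        pattern
        for pattern, states in pattern_states.items()
        if present_in in states and absent_from not in states
    )


# Hypothetical annotation: splicing pattern id -> expression states of its ESTs.
pattern_states = {
    "SP1": {"brain", "liver"},
    "SP2": {"brain"},
    "SP3": {"liver"},
}
print(subtract_libraries(pattern_states, "brain", "liver"))
```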

Interface for AEdb-Sequence
The data can be queried by gene names and synonyms, database cross-references, type of splicing events and type of regulatory roles, such as introducing premature termination codons or frameshifts. Further, the data can be queried for disease association and developmental specificity.

Interface for AEdb-Function
The data can be queried by gene names, protein keywords and database cross-references. Further, queries based on the functional enumeration of the isoform peptide sequence can be raised by selecting from a predefined list of functional categories (for the list of functional categories see Table 1).

Interface for AEdb-Motif
The interface allows free-text search. The search items include gene names, sequence of the regulatory motifs and type of regulatory sequence (enhancer or silencer).

BLAST and FASTA searches to ASD
The nucleotide and peptide sequences from ASD can be searched through both BLAST (WU-BLAST2) (http://blast.wustl.edu) and FASTA utilities (22). The objective of these search programs is to identify sequence similarities between novel sequences and alternatively spliced sequences collected in ASD. BLAST reports regions of high similarity. FASTA can be very useful to identify long regions of low similarity between highly diverged sequences. Further, FASTA is helpful when the query sequence is short, since BLAST usually fails to report results for short query sequences.

One-stop query system to access publicly available databases on alternative splicing
Several databases on alternative splicing are publicly available. To facilitate extraction of all known information about splicing of a gene, we generated a single interface that queries various databases simultaneously. Presently, seven alternative splicing databases namely, ASD, ASG, PALS, SpliceInfo, MAASE and HASDB (23–27) are made available from this interface. The interface accepts typical search terms (such as keywords, gene names and cross-references) and queries all these databases. The results that are obtained from the individual database servers are presented with hyperlinks to the individual databases.

Example of data search and of data content
Figures 2 and 3 illustrate the results of searching the database for all entries of tra2-alpha and tra2-beta, two important splicing regulators. Querying the ASD using the wrapper query interface with ‘tra2a* | tra2b*’ as keyword produces an output page (Figure 2) that lists entries from the different component data sets. As can be seen from the figure, related data entries across the different data sets are hyperlinked to one another. Figure 3a and b illustrate the presentation of some of the data items from AltSplice for tra2-beta; Figure 3a shows sections on gene information, on evidence for alternative splicing of tra2-beta, and on the observed splicing events in AltSplice. Figure 3b shows splicing patterns presented in textual form (as Splice Pattern Table) and in graphical form (as Splice Pattern view). Individual data items in the display page are hyperlinked to pages that list detailed information. For example, the splicing pattern entry number in the table is linked to a page that lets the user perform multiple alignments on the sequences of all the observed isoform splicing patterns or peptides. Figure 3c shows the display page of the tra2-beta entry from the AEdb-Function data set; it illustrates the wealth of information that is captured from published literature.
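A query such as ‘tra2a* | tra2b*’ can be read as a set of wildcard terms combined with OR. The sketch below shows one way such a term could be matched against keywords, assuming that ‘*’ matches any run of characters and ‘|’ separates alternatives; these semantics and the example keywords are assumptions for illustration, not a description of the ASD query parser.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def match_query(query: str, names: Iterable[str]) -> List[str]:
    """Return names matching any '|'-separated wildcard term (case-insensitive)."""
    terms = [term.strip().lower() for term in query.split("|") if term.strip()]
    return [
        name for name in names
        if any(fnmatchcase(name.lower(), term) for term in terms)
    ]


# Hypothetical keywords attached to database entries.
keywords = ["TRA2A", "tra2b-1", "sfrs1"]
print(match_query("tra2a* | tra2b*", keywords))
```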



 
Figure 2 Result page of query to all of ASD data. The ASD was queried using the wrapper interface with the term ‘tra2a* | tra2b*’ and this resulted in the retrieval of data entries from AltSplice-Human (two entries), AEdb-Sequence (seven entries), AEdb-Function (one entry) and AEdb-Motif (five entries). This figure illustrates the integration among the different data sets of ASD—(i) the AltSplice-Human entries are seen associated with entries from AEdb-Sequence, and from AltSplice-Mouse; (ii) the AEdb-Sequence entries are seen associated with entries from AltSplice-Human. These associations are hyperlinked.

 



 
Figure 3 Display of data on alternative splicing of human tra2-beta gene as seen in AltSplice and AEdb-Function data sets. (a) This figure presents a display of data sections on Gene Information, Evidences and Splicing events as seen in AltSplice. Gene information section provides hyperlinks to a page listing the gene entry from HUGO Gene Nomenclature database and to a page which lists the sequence of the gene. The evidences section provides hyperlinks to the associated entries from AEdb-Sequence, to pages that list variant peptide sequences for the gene from UniProt or to pages that list the Ensembl transcript sequences for the gene. The events section lists all the splicing events that AltSplice has identified for the gene. Column 1 lists the gene coordinates of alternatively spliced exons/introns. Column 2 indicates whether the event involves modifications in the flanking exons as well; entries are hyperlinked to pages listing detailed information on the event. Column 3 indicates the identifier of the associated entry from AEdb-Sequence (if any) and the entry is hyperlinked. Column 4 indicates the identifier of the orthologous gene (if any) and the coordinates of the exon orthologous to the one presented in column 1; the entry is hyperlinked to the orthologous gene entry. (b) This figure presents the textual and graphical display of observed splicing patterns for tra2-beta gene as seen in AltSplice data. Splice Pattern Table: Entry in column 1 is hyperlinked to a page listing the sequence of the splicing pattern. Entry in column 2 gives the coding start and end positions on the gene and the length of the translated peptide sequence and is hyperlinked to a page listing the peptide sequence. Entry in column 3 lists the structure of the splicing pattern as a string of exons (exon boundaries are presented in gene coordinates). Entry in column 4 is hyperlinked to pages listing detailed information on the confirming transcript sequences. 
Entry in column 5 is hyperlinked to pages listing expression states. Entry in column 6 is hyperlinked to pages listing allele specificity of the splicing pattern. Splice Pattern View: Exons are indicated by boxes and introns by lines. Exons/introns that are variants are indicated in blue. Moving the cursor over the various elements of the pattern displays pop-ups giving detailed information on the elements. The displayed pop-up in this example shows information on the expression state of Splicing Pattern 6. (c) This figure presents data on the functional changes due to alternative splicing in tra2-beta as seen in the AEdb-Function data set. The data are organised into three sections, namely gene information, bibliography information and functional information. This figure illustrates the wealth of knowledge captured from literature.

 

    ASD WORKBENCH
 
The workbench provides a set of online tools that enable users to carry out analysis of pre-mRNA sequences. It includes tools for intron analysis, scoring ATG-context sequence, finding exons and identifying splicing regulatory sequences. These tools are accessed either through a single wrapper interface or through interfaces that are specialised for individual tools.

Intron analysis
The tool examines intron sequences (as provided by the user) for putative branch point (BP) sites and polypyrimidine tracts (PPT). It further calculates the strength of the donor and acceptor sites. The methods are described elsewhere (10). The user has a choice of weight matrices for donor and acceptor sites tailored for different intron types, such as U2-type GT-AG and GC-AG and U12-type GT-AG and AT-AC.
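To give a feel for one of the signals examined, the sketch below locates a candidate polypyrimidine tract as the longest uninterrupted run of C/T within a window at the 3' end of an intron. This is a deliberately simple heuristic assumed for illustration; the workbench methods are described in (10) and are not reproduced here. The function name, the window size and the example intron are hypothetical.

```python
import re
from typing import Optional, Tuple


def longest_pyrimidine_run(intron: str, window: int = 50) -> Optional[Tuple[int, str]]:
    """Find the longest uninterrupted C/T run in the last `window` bases.

    Returns (0-based start within the intron, run sequence), or None
    when the window contains no pyrimidine at all.
    """
    seq = intron.upper().replace("U", "T")
    offset = max(0, len(seq) - window)
    best = None
    for match in re.finditer(r"[CT]+", seq[offset:]):
        # Keep the first run of maximal length.
        if best is None or len(match.group()) > len(best.group()):
            best = match
    if best is None:
        return None
    return offset + best.start(), best.group()


# Hypothetical short intron: GT donor, branch-point-like region, PPT, AG acceptor.
print(longest_pyrimidine_run("GTAAGTACTAACATTTCTTTCCTTCAG"))
```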

Scoring ATG-context sequence
This tool examines each occurrence of ATG in a given transcript sequence for its ability to act as a translation start codon. Each ATG is scored for Kozak's ATG-context sequence (28) using a weight matrix that we built from experimentally confirmed translation initiation sites. The sequence of the translated peptide from each occurrence of ATG is presented along with the ATG-context score. FASTA/BLAST searches against UniProt sequence data can be launched for each of the translated peptide sequences.
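Weight-matrix scoring of an ATG context can be sketched as a sum of log-odds terms over the positions around each ATG. The example below uses a toy matrix covering only positions -3 and +4 (counting the A of ATG as +1); the frequencies are illustrative assumptions and are not the matrix built for the workbench, and the function name is hypothetical.

```python
import math
from typing import Dict, List, Tuple

# Toy position-specific base frequencies for positions -3 and +4 relative
# to the A of ATG (+1). Illustrative values only, not the ASD matrix.
KOZAK_FREQS: Dict[int, Dict[str, float]] = {
    -3: {"A": 0.45, "C": 0.10, "G": 0.35, "T": 0.10},
    4: {"A": 0.20, "C": 0.15, "G": 0.50, "T": 0.15},
}
BACKGROUND = 0.25  # uniform background base frequency


def score_atg_contexts(transcript: str) -> List[Tuple[int, float]]:
    """Return (0-based ATG position, log-odds context score) pairs."""
    seq = transcript.upper()
    results = []
    pos = seq.find("ATG")
    while pos != -1:
        score = 0.0
        for rel, freqs in KOZAK_FREQS.items():
            # Position +1 is the A of ATG, so +4 is index pos + 3;
            # there is no position 0, so -3 is index pos - 3.
            idx = pos + rel - 1 if rel > 0 else pos + rel
            if 0 <= idx < len(seq) and seq[idx] in freqs:
                score += math.log2(freqs[seq[idx]] / BACKGROUND)
        results.append((pos, score))
        pos = seq.find("ATG", pos + 1)
    return results


# Hypothetical transcript with a strong-context ATG followed by a weak one.
for position, score in score_atg_contexts("GCCACCATGGCTTTATGC"):
    print(position, round(score, 3))
```

A higher score indicates a context closer to the assumed consensus; positions that fall outside the sequence simply contribute nothing in this sketch.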

MZEF-SPC exon finder
This tool identifies potential exons in a given nucleotide sequence. It integrates Michael Zhang's Exon Finder (29) and Thanaraj's SpliceProximalCheck (30). MZEF identifies putative exons using quadratic discriminant analysis. SPC is a decision tree implementation of splicing signals that differentiate genuine human splicing sites from the proximal false sites and thus specialises in validating the predicted exon boundary for exactness.

Detection of short regulatory sequences
Exons are regulated by short, degenerate sequences that bind to interacting splicing factors and proteins. These sequences are collected in the AEdb-Motif database. The Regulatory Sequence tool uses these motifs to examine a given nucleotide sequence for their presence. Users can specify the number of allowed mismatches. The identified motifs are hyperlinked to the corresponding entries in the AEdb-Motif database (see Figure 4 for an illustration of identified motifs in the tra2-beta gene). The splicing rainbow is a visualization tool that colour-codes the presence of different regulatory motifs in a user-supplied sequence.
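Scanning a sequence for short motifs while tolerating a user-chosen number of mismatches can be sketched with a sliding window and a Hamming-distance check. The code below is a minimal illustration under that assumption; the motif names and sequences are hypothetical placeholders rather than AEdb-Motif entries, and the actual matching procedure of the Regulatory Sequence tool may differ.

```python
from typing import Dict, List, Tuple


def find_motifs(
    sequence: str, motifs: Dict[str, str], max_mismatches: int = 0
) -> List[Tuple[str, int, int]]:
    """Return (motif name, 0-based start, mismatches) for every hit."""
    seq = sequence.upper().replace("U", "T")
    hits = []
    for name, motif in motifs.items():
        pattern = motif.upper().replace("U", "T")
        # Slide a window of the motif's length along the sequence.
        for start in range(len(seq) - len(pattern) + 1):
            window = seq[start:start + len(pattern)]
            mismatches = sum(a != b for a, b in zip(window, pattern))
            if mismatches <= max_mismatches:
                hits.append((name, start, mismatches))
    return hits


# Hypothetical motif entries; real ones would come from AEdb-Motif.
motifs = {"enhancer_1": "GAAGAA", "silencer_1": "TAGGGT"}
print(find_motifs("CCGAAGAACCTAGCGTCC", motifs, max_mismatches=1))
```

Raising `max_mismatches` increases sensitivity for degenerate motifs at the cost of more spurious hits, which is the trade-off the user-selectable mismatch setting exposes.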



 
Figure 4 Example output from the workbench tool that detects splicing regulatory sequences. The figure displays a portion of the page that reports splicing regulatory sequences in the tra2-beta gene. The names of identified motifs are hyperlinked to the appropriate entries in the AEdb-Motif database. Matches against tra2-beta regulatory sequences are also seen. It is known that tra2-beta1 autoregulates its protein concentration by influencing alternative splicing of its pre-mRNA (31).

 

    SUMMARY OF UPDATES AND ENHANCEMENTS
 
The current release of ASD includes a large number of improvements over that reported earlier (9). AltSplice (the production pipeline) supersedes AltExtron (the research and development pipeline) in functionalities, data content, integrations and data presentations. As a result, AltExtron has become redundant and is no longer maintained; however, for archival purposes, the earlier versions of AltExtron data are still presented in ASD web pages. Examples of enhancements in AltSplice data content include data on evidence for alternative splicing and isoform peptide sequences; those in integrations include related data from UniProt and Ensembl; those in query tools include the differential library expression profiler; and those in presentation of data include a complete redesign of both the query and data (textual and graphical) results pages and a multiple alignment view of isoform splice pattern and peptide sequences. In addition to the AEdb-Sequence data set presented in (9), further AEdb data sets (namely AEdb-Function, AEdb-Motif and AEdb-Minigenes) are presented. The ASD workbench is a new addition that complements the data content. The ASD sequence data are now available for search through BLAST and FASTA tools. The current version of ASD provides three levels of search facilities, namely the single-query bar, the wrapper and the data set-specific advanced search query pages. Further, a one-stop query system that enables users to query many publicly available databases (including all the data sets of ASD) on alternative splicing is now provided. Integrations with splice variant data from UniProt and Ensembl enhance the value of ASD resources; AltSplice data are presented on Ensembl genome annotation browsers as DAS (Distributed Annotation Server) tracks. Since the last report, the ASD pipeline has matured to high production standards. The pipeline and the database are now handled by the EBI database-production team, which is committed to making regular data releases, to expanding the repertoire of organisms for which the data are made available, and to providing seamless integration and hyperlinks both among the different data sets of ASD and to various other external databases.


    CONCLUSIONS
 
We present here ASD, a bioinformatics resource for alternative splicing. The individual components of the resource are (i) AltSplice, a value-added data set generated by our AltSplice pipeline, (ii) AEdb, a manually curated data set of alternatively spliced exons and their properties, and (iii) ASD Workbench, a collection of tools that carry out various analyses on pre-mRNA sequences. These individual components are integrated with one another and with related data from UniProt and Ensembl. The resource also provides a one-stop query system that accesses various other publicly available databases on alternative splicing. The integrated resource is available to the community (from http://www.ebi.ac.uk/asd) through user-friendly interfaces. Future releases will contain data that are being generated by other members of the splicing community, e.g. array expression data on alternative splicing of genes.

General enquiries about the ASD database can be e-mailed to asd-ebi{at}ebi.ac.uk.


    ACKNOWLEDGEMENTS
 
T.A.T. and S.S. thank the European Commission for the ASD grant (QLRT-CT-2001-02062). The involvement of Francis Clark and Juha Muilu in the earlier stages of the project is acknowledged. NLB-M acknowledges the Portugal Foundation for Science and Technology for financial support (Fellowship SFRH/BD/2914/2000). Juan Valcarcel is acknowledged for mentoring NLB-M in developing the Splicing Rainbow tool. Funding to pay the Open Access publication charges for this article was provided by the ASD grant from the European Commission.

Conflict of interest statement. None declared.


    REFERENCES
 

  1. Johnson, J.M., Castle, J., Garrett-Engele, P., Kan, Z., Loerch, P.M., Armour, C.D., Santos, R., Schadt, E.E., Stoughton, R., Shoemaker, D.D. (2003) Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science, 302, 2141–2144.

  2. Stamm, S., Ben-Ari, S., Rafalska, I., Tang, Y., Zhang, Z., Toiber, D., Thanaraj, T.A., Soreq, H. (2005) Function of alternative splicing. Gene, 344, 1–20.

  3. Lewis, B.P., Green, R.E., Brenner, S.E. (2003) Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans. Proc. Natl Acad. Sci. USA, 100, 189–192.

  4. Smith, C.W. and Valcarcel, J. (2000) Alternative pre-mRNA splicing: the logic of combinatorial control. Trends Biochem. Sci., 25, 381–388.

  5. Caceres, J.F. and Kornblihtt, A.R. (2002) Alternative splicing regulation: multiple control mechanisms and involvement in human diseases. Trends Genet., 18, 186–193.

  6. Stoilov, P., Meshorer, E., Gencheva, M., Glick, D., Soreq, H., Stamm, S. (2002) Defects in pre-mRNA processing as causes and predisposition to diseases. DNA Cell Biol., 21, 803–818.

  7. Faustino, N.A. and Cooper, T.A. (2003) Pre-mRNA splicing and human disease. Genes Dev., 17, 419–437.

  8. Thanaraj, T.A. and Stamm, S. (2003) Prediction and statistical analysis of alternatively spliced exons. Prog. Mol. Subcell. Biol., 31, 1–31.

  9. Thanaraj, T.A., Stamm, S., Clark, F., Riethoven, J.J., Le Texier, V., Muilu, J. (2004) ASD: the Alternative Splicing Database. Nucleic Acids Res., 32, D64–D69.

  10. Clark, F. and Thanaraj, T.A. (2002) Categorization and characterization of transcript-confirmed constitutively and alternatively spliced introns and exons from human. Hum. Mol. Genet., 11, 451–464.

  11. Ladd, A.N. and Cooper, T.A. (2002) Finding signals that regulate alternative splicing in the post-genomic era. Genome Biol., 3, 8.1–8.16.

  12. Zheng, Z.M. (2004) Regulation of alternative RNA splicing by exon definition and exon sequences in viral and mammalian gene expression. J. Biomed. Sci., 11, 278–294.

  13. Bourgeois, C.F., Lejeune, F., Stevenin, J. (2004) Broad specificity of SR (serine/arginine) proteins in the regulation of alternative splicing of pre-messenger RNA. Prog. Nucleic Acid Res. Mol. Biol., 78, 37–88.

  14. Tang, Y., Novoyatleva, T., Benderska, N., Kishore, S., Thanaraj, T.A., Stamm, S. (2005) Analysis of alternative splicing in vivo using minigenes. In Westhof, E., Bindereif, A., Schön, A., Hartmann, K. (Eds.), Handbook of RNA Biochemistry. Wiley-VCH Verlag, Weinheim, Vol. 2, pp. 755–782.

  15. Stoss, O., Stoilov, P., Hartmann, A.M., Nayler, O., Stamm, S. (1999) The in-vivo minigene approach to analyse tissue-specific splicing. Brain Res. Protoc., 4, 383–394.

  16. Hubbard, T., Andrews, D., Caccamo, M., Cameron, G., Chen, Y., Clamp, M., Clarke, L., Coates, G., Cox, T., Cunningham, F., et al. (2005) Ensembl 2005. Nucleic Acids Res., 33, D447–D453.

  17. Bairoch, A., Apweiler, R., Wu, C.H., Barker, W.C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., Magrane, M., et al. (2005) The universal protein resource (UniProt). Nucleic Acids Res., 33, D154–D159.

  18. Kanz, C., Aldebert, P., Althorpe, N., Baker, W., Baldwin, A., Bates, K., Browne, P., van den Broek, A., Castro, M., Cochrane, G., et al. (2005) The EMBL Nucleotide Sequence Database. Nucleic Acids Res., 33, D29–D33.

  19. Wain, H.M., Lush, M., Ducluzeau, F., Povey, S. (2002) Genew: the human gene nomenclature database. Nucleic Acids Res., 30, 169–171.

  20. Ashburner, M., Ball, C.A., Blake, J.A., Butler, H., Cherry, J.M., Corradi, J., Dolinski, K., Janan, T., Eppig, J.T., Harris, M., et al. (2001) Creating the Gene Ontology resource: design and implementation. Genome Res., 11, 1425–1433.

  21. Kelso, J., Visagie, J., Theiler, G., Christoffels, A., Bardien-Kruger, S., Smedley, D., Otgaar, D., Greyling, G., Jongeneel, V., McCarthy, M.I., et al. (2003) eVOC: a controlled vocabulary for gene expression data. Genome Res., 13, 1222–1230.

  22. Pearson, W.R. and Lipman, D.J. (1988) Improved tools for biological sequence comparison. Proc. Natl Acad. Sci. USA, 85, 2444–2448.

  23. Leipzig, J., Pevzner, P., Heber, S. (2004) The alternative splicing gallery (ASG): bridging the gap between genome and transcriptome. Nucleic Acids Res., 32, 3977–3983.

  24. Huang, Y.H., Chen, Y.T., Lai, J.J., Yang, S.T., Yang, U.C. (2002) PALS db: putative alternative splicing database. Nucleic Acids Res., 30, 186–190.

  25. Huang, H.D., Horng, J.T., Lin, F.M., Chang, Y.C., Huang, C.C. (2005) SpliceInfo: an information repository for mRNA alternative splicing in human genome. Nucleic Acids Res., 33, D80–85.

  26. Zheng, C.L., Nair, T.M., Gribskov, M., Kwon, Y.S., Li, H.R., Fu, X.D. (2004) A database designed to computationally aid an experimental approach to alternative splicing. Pac. Symp. Biocomput., 78–88.

  27. Modrek, B., Resch, A., Grasso, C., Lee, C. (2001) Genome-wide detection of alternative splicing in expressed sequences of human genes. Nucleic Acids Res., 29, 2850–2859.

  28. Kozak, M. (1984) Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs. Nucleic Acids Res., 12, 857–872.

  29. Zhang, M.Q. (2003) Using MZEF to find internal coding exons. In Baxevanis, A.D. and Davison, D.B. (Eds.), Current Protocols in Bioinformatics. John Wiley & Sons, Inc., New York, Vol. 1, pp. 4.2.1–18.

  30. Thanaraj, T.A. and Robinson, A. (2000) Prediction of exact boundaries of exons. Brief Bioinform., 1, 343–356.

  31. Stoilov, P., Daoud, R., Nayler, O., Stamm, S. (2004) Human tra2-beta1 autoregulates its protein concentration by influencing alternative splicing of its pre-mRNA. Hum. Mol. Genet., 13, 509–524.



