Nucleic Acids Research Advance Access published online on November 13, 2007

Nucleic Acids Research, doi:10.1093/nar/gkm988

© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.


Database Issue

Ensembl 2008

P. Flicek1,*, B. L. Aken2, K. Beal1, B. Ballester1, M. Caccamo1, Y. Chen1, L. Clarke2, G. Coates2, F. Cunningham2, T. Cutts2, T. Down2, S. C. Dyer2, T. Eyre2, S. Fitzgerald1, J. Fernandez-Banet2, S. Gräf1, S. Haider1, M. Hammond1, R. Holland1, K. L. Howe2, K. Howe2, N. Johnson1, A. Jenkinson1, A. Kähäri1, D. Keefe1, F. Kokocinski2, E. Kulesha1, D. Lawson1, I. Longden1, K. Megy1, P. Meidl1, B. Overduin1, A. Parker2, B. Pritchard2, A. Prlic2, S. Rice2, D. Rios1, M. Schuster1, I. Sealy2, G. Slater1, D. Smedley1, G. Spudich1, S. Trevanion2, A. J. Vilella1, J. Vogel2, S. White2, M. Wood2, E. Birney1, T. Cox2, V. Curwen2, R. Durbin2, X. M. Fernandez-Suarez1, J. Herrero1, T. J. P. Hubbard2, A. Kasprzyk1, G. Proctor1, J. Smith2, A. Ureta-Vidal1 and S. Searle2

1European Bioinformatics Institute (EMBL-EBI) and 2Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK

*To whom correspondence should be addressed. Tel: +44 1223 492581; Fax: +44 1223 494468; Email: flicek{at}ebi.ac.uk

Received September 15, 2007. Revised October 18, 2007. Accepted October 19, 2007.


    ABSTRACT
 
The Ensembl project (http://www.ensembl.org) is a comprehensive genome information system featuring an integrated set of genome annotation, databases and other information for chordate and selected model organism and disease vector genomes. As of release 47 (October 2007), Ensembl fully supports 35 species, with preliminary support for six additional species. New species in the past year include platypus and horse. Major additions and improvements to Ensembl since our previous report include extensive support for functional genomics data in the form of a specialized functional genomics database, genome-wide maps of protein–DNA interactions and the Ensembl regulatory build; support for customization of the Ensembl web interface through the addition of user accounts and user groups; and increased support for genome resequencing. We have also introduced new comparative genomics-based data mining options and report on the continued development of our software infrastructure.


    INTRODUCTION
 
The availability of complete genome sequences for an increasing number of chordates has had a dramatic impact on biomedical research in the 21st century. Now 7 years beyond the initial publications of the draft human genome sequence (1,2), both the number of sequenced genomes and the total amount of genome-wide data that can be naturally organized on the genome sequence continue to rapidly increase. The Ensembl project provides a comprehensive genome information system consisting of data storage, integration, analysis and visualization of a wide variety of biological data. In comparison to similar projects based at the University of California Santa Cruz (3) and the National Center for Biotechnology Information (4) the distinguishing characteristics of the Ensembl project include:

– The Ensembl genome browser available at http://www.ensembl.org providing visualization for our own and collaborators' genome annotations, alignments, variation and functional genomics data and supporting additional data integration through the DAS protocol.
– Ensembl gene sets created using an automated analysis pipeline that has been significantly optimized based on the completeness of the genome sequence and the availability of species-specific supporting data.
– The Ensembl application programming interface (API) that allows programmatic access to all of our data sets including annotations, genomic alignments and variation data.
– BioMart data mining tools, which support sophisticated Ensembl-specific queries and federated queries with other BioMart-compliant data resources.
– An entirely open resource with all of our code and data freely available to all users.
Ensembl data is organized into several species-specific and multi-species MySQL databases. Each database is named using the format <species>_<database type>_<release number>_<data version>. For each supported species, a core database contains the DNA sequences, gene annotations, external references, etc. Databases of type ‘otherfeatures’ are provided for each supported species (except for the low-coverage genomes) and include EST genes, external annotation sets and other data. Variation databases that include dbSNP (5) and resequencing data (see subsequently), are provided for 10 species. This year, we introduced a functional genomics database, initially released for human and mouse, to support functional data types assayed by whole-genome tiling arrays or high-throughput sequencing (see subsequently). Comparative genomics data and the supporting data for the Ensembl BioMart datamining tool (6) are provided in multi-species databases.
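The naming format above can be taken apart mechanically. The following is a minimal sketch, not part of the Ensembl software: the function and class names are hypothetical, the example database names are illustrative, and it assumes the database type is a single token without underscores. Because species names themselves contain underscores, the name is split from the right.

```python
from typing import NamedTuple


class EnsemblDbName(NamedTuple):
    species: str
    db_type: str
    release: int
    data_version: str


def parse_db_name(name: str) -> EnsemblDbName:
    """Split '<species>_<database type>_<release number>_<data version>'."""
    # Species names contain underscores, so take the last three fields
    # from the right and treat everything before them as the species.
    parts = name.rsplit("_", 3)
    if len(parts) != 4 or not parts[2].isdigit():
        raise ValueError(f"not a species-specific database name: {name!r}")
    species, db_type, release, data_version = parts
    return EnsemblDbName(species, db_type, int(release), data_version)


# Illustrative name only; check the server for the databases actually present.
print(parse_db_name("homo_sapiens_core_47_36i"))
```

Names that do not carry all four fields, such as those of multi-species databases, are rejected with a ValueError rather than guessed at.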

Ensembl generally releases updates six times each year in February, April, June, August, October and December. Specific data updates are driven by the availability of new or updated genome sequence assemblies, significant increases in supporting evidence for genome annotations, updated releases of major external data sets [such as dbSNP (5)] that are incorporated into Ensembl, and new biological data resources such as protein–DNA interaction maps based on genome-wide ChIP-chip and ChIP-seq data sets. Each new Ensembl release may also include new data visualization options and improvements to the underlying software infrastructure.

This report lists only some of the new features, new data and other improvements that we have added to Ensembl since our last report (7). Users interested in the most up-to-date details of the Ensembl project should visit the Ensembl main page (http://www.ensembl.org) and follow the ‘What's new’ link and/or subscribe to the low-volume ‘Ensembl announce’ mailing list by sending an email with ‘subscribe ensembl-announce’ as the message body to majordomo{at}ebi.ac.uk. Other information about Ensembl features is available on the Ensembl help pages or by email at helpdesk{at}ensembl.org.


    RESULTS
 
Ensembl regulatory build
The Ensembl regulatory build is designed to automatically annotate all of the functional regulatory regions in the genome and assign putative functions to as many of these regions as possible. The initial release of the Ensembl regulatory build in June 2007 integrated eight genome-wide data sets, mainly in pre-publication ‘resource’ status, to identify ~110 000 regulatory features across the human genome. Briefly, the integration procedure starts with likely regulatory regions (such as DNase I hypersensitive sites) and seeks to identify the function of each site by analysing specific patterns of histone modification immediately adjacent to the region. We identified a number of patterns highly enriched for gene starts, genic regions and distal regions. Ensembl regulatory features are displayed on ContigView (Figure 1).


Figure 1. The Ensembl regulatory build and GERP conservation track. A 90 kb region of human chromosome 10 showing Ensembl regulatory features in blue, green and grey on the bottom track and the GERP conservation track at the top. Note the overlap of gene-associated regulatory features with the start regions of both Ensembl transcripts and EST transcripts suggesting a complex transcriptional environment. The conservation track is a composite track that displays the constrained elements by default and both constrained elements and the GERP scores when expanded.

 
Functional genomics database
As noted above, the Ensembl Functional Genomics Database is the fourth species-specific database that is part of the standard Ensembl release. The Functional Genomics Database and its associated API provide a platform for the storage, analysis and visualization of array-based functional genomics data. We have created an initial infrastructure for analysis of these data based on the Ensembl analysis pipeline (8). This structure supports the modular incorporation of analysis tools dedicated to various aspects of tiling array analysis such as normalization and platform-specific hit identification.

The database is currently used to support the Ensembl regulatory build (see above) and the display of ChIP-chip data and analysis within Ensembl (Figure 2). The database and API feature a fully automated data import structure, an extensible array model and support for the Tab2MAGE metadata format (9). Additionally, the database is designed for deployment in external research laboratories and supports local data processing and visualization through DAS.


Figure 2. ChIP-chip display. Histone 3 lysine trimethylation data from mouse embryonic fibroblast cells (29) on mouse chromosome 17 in the region of the Q3UNB7 (ENSMUSG00000073442) and A630033E08Rik (ENSMUSG00000059142) genes. The Histone modifications display is a composite track that combines raw enrichment values and peak identifications. Displays that encompass large regions of the genome include only the identified peak regions.

 
Ensembl customization: user accounts and groups
The major new Ensembl website functionality over the past year is the addition of user and group accounts. These accounts enable users to create bookmarks, customize their Ensembl interface and share their bookmarks and configurations with other users in an Ensembl group. We note, importantly, that all Ensembl data is equally accessible to users whether or not they create a user account.

Ensembl user accounts are designed to personalize the Ensembl interface. As the number of data tracks in Ensembl has grown, the default visualization settings are not ideal for every user. For example, some users may be interested in displaying only the Ensembl genes track together with mapping of gene expression arrays and SNP locations, while other users may want a display consisting of constrained elements, RNA genes, the underlying clone tilepath, or any of more than one hundred available data tracks. These personalized interfaces can now be saved and shared through Ensembl accounts.

Ensembl Groups have several functions. The primary function is to share configurations, bookmarks, or notes with other members of the group. Single users can also create groups as virtual folders to organize bookmarks, configurations and notes based on separate projects. Groups may be created and administered by any user with an Ensembl account. Group administrators can invite anyone to join their group and users can be members of several groups simultaneously. All group members must also have Ensembl accounts.

Notes are currently supported on GeneView pages and allow users to create their own annotations and have these integrated into the web display. Notes will be added to other pages in the future.

New species and improved gene annotations
The Ensembl website currently displays data for 41 species. In the past year, we have added data for seven new high-coverage genomes and generated updated gene sets for eight species. Previously, we reported that four low-coverage (2x) genome gene sets were available with five more underway (7). During this year we have finished both the gene sets in progress and sets for an additional five species [Spermophilus tridecemlineatus (squirrel), Tupaia belangeri (tree shrew), Cavia porcellus (guinea pig), Microcebus murinus (mouse lemur), Ochotona princeps (pika)]. This set of 14 low-coverage annotated genome sequences provides an extensive resource for mammalian comparative genomics.

We have continued the CCDS (Consensus Coding Sequence) collaboration with the Sanger Institute's Havana group (http://www.sanger.ac.uk/HGP/havana/), UCSC (3) and NCBI (4). CCDS is a stable set of protein-coding gene structures on which all consortium members agree to the base pair. We have released an update to the set that includes 18 290 CDSs from 16 003 genes. This is a substantial improvement in gene coverage over the previous set, which contained 14 795 CDSs from 13 142 genes. A CCDS set has also been generated for mouse, which includes 13 374 CDSs from 13 014 genes. Further updates to CCDS sets are in progress based on new human and mouse Ensembl gene builds, RefSeq (10) builds and Havana annotation. Additional details regarding the CCDS project are available from http://www.ncbi.nlm.nih.gov/CCDS/.

The Ensembl gene build process is based on alignments of protein and cDNA sequences. To produce a high-quality gene set, it is crucial to maximize the value of species-specific sequence data and ensure the suitability of all input sequences. In light of this, we have made improvements to several stages of the automatic annotation process. Improved use of species-specific sequences primarily addressed gene models characterized by a short first CDS exon followed by a long (>10 000 bp) intron as well as those with non-GT–AG splice sites. Using standard GeneWise (11) parameters, neither case was predicted well by the Ensembl pipeline. To address these cases, we now run GeneWise with two different parameter sets and also run exonerate (12), a faster alignment algorithm more suited to the longer genomic sequences required for accurate long intron prediction. The results of these three analyses for each protein are compared and the best gene prediction is chosen on the basis of a set of rules including percentage identity of the model to the original protein. Using this improved method, the percentage of RefSeq genes for which we produce at least one identical CDS model increased from 78% to 88% and for Havana genes from 79% to 88%. We have also improved the quality of the input sequence data by a careful filtering process that identifies anomalous sequences such as chimeric cDNAs, cDNAs with retained introns and viral proteins, and protein sequences derived from repeats. For example, we remove from our input sequence data all of the cDNAs annotated as chimeric by the Mammalian Gene Collection (13). Removing these protein and cDNA sequences from the Ensembl gene build input reduced artefactual gene merging and overprediction.
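The comparison of the three analyses can be pictured as a small selection step. The sketch below is only an illustration of choosing one model per protein: percentage identity is the one criterion named in the text, while the coverage tie-break, the identity cut-off, the class and function names and the example scores are all hypothetical and do not reproduce the actual Ensembl rule set.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class GeneModel:
    method: str              # which analysis produced the model
    percent_identity: float  # identity of the model to the original protein
    coverage: float          # fraction of the protein covered (hypothetical rule)


def choose_best_model(models: Iterable[GeneModel],
                      min_identity: float = 0.0) -> Optional[GeneModel]:
    """Return the preferred model, or None if none passes the cut-off."""
    candidates = [m for m in models if m.percent_identity >= min_identity]
    if not candidates:
        return None
    # Rank by identity first; coverage only breaks ties (assumed rule).
    return max(candidates, key=lambda m: (m.percent_identity, m.coverage))


# Made-up scores for one protein aligned by the three analyses.
models = [
    GeneModel("genewise_default", 91.0, 0.80),
    GeneModel("genewise_alternative", 95.5, 0.90),
    GeneModel("exonerate", 95.5, 0.97),
]
print(choose_best_model(models))
```

Keeping the rules in a single sort key makes it easy to see, and to change, the order in which the criteria are applied.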

Two other notable gene build improvements represent incorporation of information not previously used by Ensembl. The first development concerns UTRs that are added from cDNAs, when the cDNA exon boundaries match those from the protein model. Often there is a choice of possible cDNAs with differing UTRs. We are now prioritizing these cDNA choices on whether they match the boundaries of paired end tags (ditags) experimentally derived from the starts and ends of cDNAs, providing a second source of evidence to accurately determine UTR boundaries. We are mapping ditag sequences from the Genome Institute of Singapore and from the Fantom project for human and mouse (14–16). The second enhancement is specific to immunoglobulin segments, which present problems for standard gene prediction methods because the somatic rearrangements of gene segment clusters make complete cDNAs difficult to align. We now align annotated segments from the IMGT database (17) for mouse and human. The predictions based on these replace any overlapping gene models produced by the standard Ensembl pipeline in the immunoglobulin gene clusters.

New gene builds in 2007 included updates to both human and mouse, which both benefited from the methodological improvements described above. For the case of mouse, the new gene build was in support of the newly released NCBI build 37 genome assembly, while the updated human gene build incorporates the latest Havana manual annotation set.

Resequencing data: new resources and visualization
New sequencing technologies are expected to make whole genome resequencing feasible on a large scale (18,19). The genome sequence for a single individual is already available using previous generation sequencing technology (20). We recently reported on TranscriptSNPView, a transcript-based visualization for resequencing data and our SSAHA-based (21) alignment of resequencing reads to the mouse genome (22). We have extended this technique and TranscriptSNPView over the past year to include resequenced human individuals and rat strains. This year we have developed additional resources for analysis and visualization of resequencing data. The new SequenceAlignView (Figure 3) displays the reference genome sequence together with the genome sequence of individuals (or strains in the case of mouse and rat). With this view, the exact sequence of the individual can be quickly determined and the differences between the sequenced individual and the reference genome assembly highlighted. Resequencing data is also provided in structured EMF (Ensembl Multi-Format) text files. On our FTP site, users doing comparative genomics will also find EMF files available for multiple sequence alignments.


Figure 3. SequenceAlignView. A full screen shot of a region on mouse chromosome 8 displaying available resequencing data from the 129S1/SvImJ, 129X1/SvJ and A/J laboratory mouse strains (the 129S1/SvImJ strain is marked as having no data in the region). Numerous display options are in the top panel on the page, which allow the user to choose any region of the genomes, highlight Ensembl annotations, locations of known SNPs and other information. The resequencing alignment in the bottom panel identifies exons in red and SNPs in yellow. Links to individual variations are provided to the right of the resequencing alignment.

 
DAS extensions
Ensembl continues to make extensive use of the DAS protocol (23). During this year, we have released two new DAS resources. Previously, we extended the Ensembl genome browser with DAS client functionality, which allows researchers around the world to remotely host data sources and view these on major Ensembl displays including CytoView, ContigView, GeneView and ProtView (24). This year, we extended our client visualization support through DAS to include a colour gradient, histogram and tiling array ‘wiggle’ format (Figure 4). These new visualization options are particularly applicable to dense genome data such as that produced by whole-genome tiling array experiments. We now also serve current Ensembl data for integration into other DAS clients. Data available for integration into our DAS clients includes transcripts, ditag data, markers, karyotype information, repeats and DNA and protein align features including cDNA alignments and UniProt alignments. DAS sources set up by Ensembl are also automatically registered with the DAS registry (25). Instructions for using DAS with Ensembl are available from http://www.ensembl.org/info/data/external_data/das/index.html.


Figure 4. DAS Visualizations. A 19 Mb region of human chromosome 11 showing identical data displayed with (from top to bottom) the colour gradient, histogram and tiling array ‘wiggle’ format. The colour gradient format transitions from yellow (low values) to blue (high values). The histogram display format supports merged data in bins across the genome; the display value is selectable to be either the average of the bin (shown here) or the maximum value in the bin to achieve greater data contrast. In the histogram format, the lowest value in the data set becomes the baseline. The tiling array format allows for the display of both positive and negative values with overlapping data points resulting in the maximum data point being displayed. All three display formats support in-line data normalization. The ideal format will depend on the data to be displayed. These example data are P-values from a genome-wide association study (30).
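The histogram behaviour described in the caption (merging data into bins, showing either the average or the maximum of each bin, and drawing bars from the lowest value) can be sketched as follows. This is an illustration rather than the Ensembl drawing code: the function names, the half-open coordinate convention, the use of None for empty bins and the choice of the lowest raw value as the baseline are assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def bin_values(points: Sequence[Tuple[int, float]], start: int, end: int,
               n_bins: int, mode: str = "average") -> List[Optional[float]]:
    """Merge (position, value) points in [start, end) into n_bins bins."""
    if mode not in ("average", "maximum"):
        raise ValueError("mode must be 'average' or 'maximum'")
    if end <= start or n_bins <= 0:
        raise ValueError("need end > start and n_bins > 0")
    span = end - start
    bins: List[List[float]] = [[] for _ in range(n_bins)]
    for pos, value in points:
        if start <= pos < end:
            bins[(pos - start) * n_bins // span].append(value)
    merged: List[Optional[float]] = []
    for values in bins:
        if not values:
            merged.append(None)  # nothing to draw in this bin
        elif mode == "average":
            merged.append(sum(values) / len(values))
        else:
            merged.append(max(values))
    return merged


def bar_heights(merged: Sequence[Optional[float]],
                baseline: float) -> List[Optional[float]]:
    """Express each bin relative to the baseline value."""
    return [None if v is None else v - baseline for v in merged]


points = [(5, 1.0), (10, 3.0), (30, 2.0), (99, 8.0)]
baseline = min(value for _, value in points)  # lowest value in the data set
for mode in ("average", "maximum"):
    print(mode, bar_heights(bin_values(points, 0, 100, 4, mode), baseline))
```

With these toy points the first bin holds the values 1.0 and 3.0, so the maximum setting draws a taller bar there than the average setting, which is the gain in contrast the caption refers to.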

 
Ensembl software infrastructure
The Ensembl core software system (26) provides an efficient way of representing genome data in a relational database and providing access to it via an object-oriented API. This API is used by our computational pipelines to generate and store genome annotation, and by the Ensembl website to retrieve information that is to be displayed to the user. Bioinformaticians can use the API to access Ensembl databases remotely (Ensembl databases are available at mysql://ensembldb.ensembl.org:3306; Ensembl BioMart databases use mysql://martdb.ensembl.org:3316) or local databases containing their own data. We maintain full unit test coverage for the API.
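For readers who want to reach the public servers from their own scripts, the two addresses above can be turned into connection settings. The sketch below only parses the URLs with the Python standard library; the function name is hypothetical, the read-only user name "anonymous" is an assumption that is not stated in the text, and the resulting settings would still have to be passed to a MySQL client of the reader's choice.

```python
from typing import Dict, Union
from urllib.parse import urlsplit

# Server addresses as given in the text.
ENSEMBL_DB_URL = "mysql://ensembldb.ensembl.org:3306"
ENSEMBL_MART_URL = "mysql://martdb.ensembl.org:3316"


def connection_settings(url: str,
                        user: str = "anonymous") -> Dict[str, Union[str, int]]:
    """Turn a mysql://host:port URL into keyword-style connection settings."""
    parts = urlsplit(url)
    if parts.scheme != "mysql" or parts.hostname is None or parts.port is None:
        raise ValueError(f"expected mysql://host:port, got {url!r}")
    # The user name is an assumed default, not taken from the article.
    return {"host": parts.hostname, "port": parts.port, "user": user}


print(connection_settings(ENSEMBL_DB_URL))
print(connection_settings(ENSEMBL_MART_URL))
```

Keeping the host and port in one place makes it straightforward to point the same script at a local database containing the user's own data.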

The database representation and API are being continuously developed to address bottlenecks affecting website and pipeline performance and increase flexibility. While most of this development is incremental in nature, two significant improvements over the past year merit special mention. First, the mechanism that links the identifiers between Ensembl genes, transcripts and translations and their counterparts in external databases has been significantly improved and extended, including a new configuration system allowing us to appropriately address specific data types and relationships between external and Ensembl data. Second, we have expanded the automatic data quality checks that are vital to ensuring that the billions of individual pieces of Ensembl data are as accurate as possible. There are now nearly 300 such tests that run in advance of each Ensembl release.

Comparative genomics
The protein tree calculation pipeline has evolved since last year with closer collaboration with the TreeFam project (http://www.treefam.org). TreeBeST software (http://treesoft.sourceforge.net) is used to both build a protein tree and reconcile it with the species tree. This reconciliation step allows us to call duplication and speciation events in the tree. Next, we check for dubious duplication events. These correspond to predictions where a duplication event is followed by a large number of gene loss events. Finally, we can infer paralogy and orthology relationships between the genes using the resulting protein tree.
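The final inference step follows a standard rule: once every internal node of the reconciled tree is labelled as a speciation or a duplication, two genes are orthologues if their most recent common ancestor is a speciation node and paralogues if it is a duplication node. The sketch below illustrates that rule on a toy tree. It is not the TreeBeST or Ensembl Compara implementation, it omits the check for dubious duplications, and the class, function and gene names are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    label: str                   # gene name for leaves
    event: Optional[str] = None  # "speciation" or "duplication" (internal nodes)
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[str]:
    """Gene names below a node."""
    if not node.children:
        return [node.label]
    return [gene for child in node.children for gene in leaves(child)]


def classify_pairs(root: Node) -> Dict[Tuple[str, str], str]:
    """Label every gene pair by the event at its most recent common ancestor."""
    relations: Dict[Tuple[str, str], str] = {}

    def visit(node: Node) -> None:
        if not node.children:
            return
        if node.event not in ("speciation", "duplication"):
            raise ValueError(f"internal node {node.label!r} has no event label")
        kind = "orthologue" if node.event == "speciation" else "paralogue"
        # Genes in different child subtrees have this node as their
        # most recent common ancestor.
        for left, right in combinations([leaves(c) for c in node.children], 2):
            for a in left:
                for b in right:
                    relations[tuple(sorted((a, b)))] = kind
        for child in node.children:
            visit(child)

    visit(root)
    return relations


# Toy tree: an ancestral duplication followed by a speciation in each copy.
tree = Node("root", "duplication", [
    Node("copy_A", "speciation", [Node("human_A"), Node("mouse_A")]),
    Node("copy_B", "speciation", [Node("human_B"), Node("mouse_B")]),
])
for pair, kind in sorted(classify_pairs(tree).items()):
    print(pair, kind)
```

In the toy tree, human_A and mouse_A come out as orthologues, while human_A and mouse_B are paralogues because their lineages separated at the ancestral duplication.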

Multiple genomic alignments are now calculated using Pecan (http://www.ebi.ac.uk/~bjp/pecan/) as it has been shown to be one of the best algorithms in terms of specificity and sensitivity (27). The new set of alignments includes the platypus genome. Each position in these alignments is further analysed to evaluate the level of evolutionary constraint using GERP as previously described (28). GERP also defines stretches of the Pecan alignments with a high level of conservation called constrained elements (Figure 1).

Data mining for comparative genomics
ComparaMart is a new data mining tool created to allow researchers to create intuitive queries against the Ensembl Compara multi-species database. ComparaMart uses the BioMart (6) data federation technology and provides a powerful, flexible tool to access a subset of the Compara data including predictions of homologous proteins and whole genome alignments.

As noted above, the Compara database stores results of genome-wide species comparisons calculated for each release. The ComparaMart database includes three main data sets: Ensembl homology, Ensembl pair-wise alignments and Ensembl multiple alignments. Through the ComparaMart interface, users may access the Ensembl homology data set to retrieve orthology or paralogy information for two species including various identifiers, homology descriptions, DNA/peptide sequences and peptide alignments. Additionally, the Ensembl homology data can be linked to any Ensembl species-specific data sets to build more complex queries such as a list of all SNPs in human and mouse one-to-one orthologues. Specific data mining for pair-wise and multi-species whole-genome alignments is accessible through their respective data sets, although the multiple alignments data set includes only the constrained elements defined by GERP (28) from the Pecan alignments of 10 amniote vertebrates.

Outreach
Ensembl continuously tries to enhance the user experience and for this purpose we are in touch with our user community. This year we added video tutorials at http://www.ensembl.org/info/helpdesk/tutorials/index.html and continue to provide on-site courses on request. In an effort to gather information from Ensembl users and better understand how people use Ensembl, we recently conducted our second major user survey. More than 450 people responded, primarily from Europe and North America. The results show overall satisfaction with Ensembl's tools and resources. For example, respondents rated the most important aspects of Ensembl as accurate information (60% of respondents), followed by high-quality data visualization (41%), constant availability (36%), and good data mining tools (33%). Interestingly, the most common user concern was also related to data visualization, specifically the complexity of the Ensembl web interface. We have already responded to several aspects of the survey and plan to make significant improvements to the web interface in 2008 to address the concerns raised.


    FUTURE DIRECTIONS
 
The success of massively parallel sequencing technologies is a significant challenge for bioinformatics resources, although one that has been at least partially anticipated by Ensembl. We envision many ways this new technology will impact Ensembl over the coming year. We expect that resequencing data will be a significant part of Ensembl development over the next year and are working to scale our resequencing and variation resources appropriately. The sequencing technologies have likely made whole genome tiling array analysis obsolete (at least for ChIP) and we are adapting our functional genomics database for ChIP-seq analysis support. We anticipate continued enhancements of the Ensembl regulatory build as new genome-wide data sets become available through projects such as ENCODE. Finally we expect that new transcriptomics data sets will help us guide the Ensembl gene build both in terms of improving currently supported species and mapping transcription in newly sequenced genomes.


    ACKNOWLEDGEMENTS
 
The Ensembl project receives primary funding from the Wellcome Trust. Additional funding is provided by EMBL, NHGRI, NIH-NIAID, BBSRC, MRC and the European Union. We acknowledge those researchers and organizations (especially Greg Crawford, Martin Hirst and the STAR Consortium) that have provided data to Ensembl prior to publication under the understandings of the Fort Lauderdale meeting discussing Community Resource Projects. We thank all of the users of our website and other resources, and those who have provided useful feedback through our mailing list. Funding to pay the Open Access publication charges for this article was provided by the Wellcome Trust.

Conflict of interest statement. None declared.


    REFERENCES
 

  1. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature (2001) 409:860–921.[CrossRef][Medline]

  2. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, et al. The sequence of the human genome. Science (2001) 291:1304–1351.[Abstract/Free Full Text]

  3. Kuhn RM, Karolchik D, Zweig AS, Trumbower H, Thomas DJ, Thakkapallayil A, Sugnet CW, Stanke M, Smith KE, et al. The UCSC genome browser database: update 2007. Nucleic Acids Res. (2007) 35:D668–D673.[Abstract/Free Full Text]

  4. Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Edgar R, et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. (2007) 35:D5–D12.[Abstract/Free Full Text]

  5. Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, Sirotkin K. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. (2001) 29:308–311.[Abstract/Free Full Text]

  6. Kasprzyk A, Keefe D, Smedley D, London D, Spooner W, Melsopp C, Hammond M, Rocca-Serra P, Cox T, et al. EnsMart: a generic system for fast and flexible access to biological data. Genome Res. (2004) 14:160–169.[Abstract/Free Full Text]

  7. Hubbard TJ, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, et al. Ensembl 2007. Nucleic Acids Res. (2007) 35:D610–D617.

  8. Potter SC, Clarke L, Curwen V, Keenan S, Mongin E, Searle SM, Stabenau A, Storey R, Clamp M. The Ensembl analysis pipeline. Genome Res. (2004) 14:934–941.

  9. Rayner TF, Rocca-Serra P, Spellman PT, Causton HC, Farne A, Holloway E, Irizarry RA, Liu J, Maier DS, et al. A simple spreadsheet-based, MIAME-supportive format for microarray data: MAGE-TAB. BMC Bioinformatics (2006) 7:489.

  10. Pruitt KD, Tatusova T, Maglott DR. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. (2007) 35:D61–D65.

  11. Birney E, Clamp M, Durbin R. GeneWise and Genomewise. Genome Res. (2004) 14:988–995.

  12. Slater GS, Birney E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics (2005) 6:31.

  13. MGC Project Team. The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC). Genome Res. (2004) 14:2121–2127.

  14. Ng P, Wei CL, Sung WK, Chiu KP, Lipovich L, Ang CC, Gupta S, Shahab A, Ridwan A, et al. Gene identification signature (GIS) analysis for transcriptome characterization and genome annotation. Nat. Methods (2005) 2:105–111.

  15. Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, Oyama R, Ravasi T, Lenhard B, et al. The transcriptional landscape of the mammalian genome. Science (2005) 309:1559–1563.

  16. Ruan Y, Ooi HS, Choo SW, Chiu KP, Zhao XD, Srinivasan KG, Yao F, Choo CY, Liu J, et al. Fusion transcripts and transcribed retrotransposed loci discovered through comprehensive transcriptome analysis using Paired-End diTags (PETs). Genome Res. (2007) 17:828–838.

  17. Lefranc MP, Giudicelli V, Kaas Q, Duprat E, Jabado-Michaloud J, Scaviner D, Ginestoux C, Clément O, Chaume D, et al. IMGT, the international ImMunoGeneTics information system. Nucleic Acids Res. (2005) 33:D593–D597.

  18. Mardis ER. Anticipating the 1,000 dollar genome. Genome Biol. (2006) 7:112.

  19. Bentley DR. Whole-genome re-sequencing. Curr. Opin. Genet. Dev. (2006) 16:545–552.

  20. Levy S, Sutton G, Ng PC, Feuk L, Halpern AL, Walenz BP, Axelrod N, Huang J, Kirkness EF, et al. The diploid genome sequence of an individual human. PLoS Biol. (2007) 5:e254.

  21. Ning Z, Cox AJ, Mullikin JC. SSAHA: a fast search method for large DNA databases. Genome Res. (2001) 11:1725–1729.

  22. Cunningham F, Rios D, Griffiths M, Smith J, Ning Z, Cox T, Flicek P, Marin-Garcin P, Herrero J, et al. TranscriptSNPView: a genome-wide catalog of mouse coding variation. Nat. Genet. (2006) 38:853.

  23. Dowell RD, Jokerst RM, Day A, Eddy SR, Stein L. The distributed annotation system. BMC Bioinformatics (2001) 2:7.

  24. Birney E, Andrews D, Caccamo M, Chen Y, Clarke L, Coates G, Cox T, Cunningham F, Curwen V, et al. Ensembl 2006. Nucleic Acids Res. (2006) 34:D556–D561.

  25. Prlic A, Down TA, Kulesha E, Finn RD, Kahari A, Hubbard TJ. Integrating sequence and structural biology with DAS. BMC Bioinformatics (2007) 8:333.

  26. Stabenau A, McVicker G, Melsopp C, Proctor G, Clamp M, Birney E. The Ensembl core software libraries. Genome Res. (2004) 14:929–933.

  27. Margulies EH, Cooper GM, Asimenos G, Thomas DJ, Dewey CN, Siepel A, Birney E, Keefe D, Schwartz AS, et al. Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res. (2007) 17:760–774.

  28. Cooper GM, Stone EA, Asimenos G, NISC Comparative Sequencing Program, Green ED, Batzoglou S, Sidow A. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. (2005) 15:901–913.

  29. Regha K, Sloane MA, Huang R, Pauler FM, Warczok KE, Melikant B, Radolf M, Martens JH, Schotta G, et al. Active and repressive chromatin are interspersed without spreading in an imprinted gene cluster in the mammalian genome. Mol. Cell (2007) 27:353–366.

  30. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature (2007) 447:661–678.





