Nucleic Acids Research Advance Access published online on December 5, 2006
Nucleic Acids Research, doi:10.1093/nar/gkl996
Database Issue
Ensembl 2007
Wellcome Trust Sanger Institute Hinxton, Cambridgeshire CB10 1SA, UK 1 European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus Hinxton, Cambridgeshire CB10 1SA, UK
*To whom correspondence should be addressed. Tel: +44 1223 496886; Fax: +44 1223 496802; Email: th{at}sanger.ac.uk
Received September 23, 2006. Revised October 27, 2006. Accepted October 30, 2006.
| ABSTRACT |
|---|
The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of chordate genome sequences. Over the past year the number of genomes available from Ensembl has increased from 15 to 33, with the addition of sites for the mammalian genomes of elephant, rabbit, armadillo, tenrec, platypus, pig, cat, bush baby, common shrew, microbat and European hedgehog; the fish genomes of stickleback and medaka; and second examples of sea squirt (Ciona savignyi) and mosquito (Aedes aegypti) genomes. Some of the major features added during the year include the first complete gene sets for genomes with low-sequence coverage, the introduction of new strain variation data and the introduction of new orthology/paralogy annotations based on gene trees.
| INTRODUCTION |
|---|
The genome sequence of an organism provides a natural index for organizing and understanding biological data. Ensembl is a software system to store, analyze, use and display genomic information. Ensembl's primary focus is on providing gene annotation and comparative genome integration for chordate genomes, the vast majority of which are vertebrates. Ensembl concentrates particularly on mammalian genomes, having developed initially around the human genome sequence. Some major distinguishing features of the project, compared to the other major sites providing access to these genomes at UCSC (1) and NCBI (2), are that Ensembl creates a gene set, using an automatic gene build pipeline, for each species for which no manually curated gene set exists, and that Ensembl makes all its data and software source code available to all users to encourage reuse and programmatic access. As discussed below, for the relatively mature vertebrate genomes of human and mouse, where there are active efforts to curate gene structures by the Havana group through Vega (3), the RefSeq group (2) and UniProt (4), Ensembl is actively collaborating with these groups to converge on the reference gene set.
The genomes of 28 chordates are currently available through Ensembl, from mammals such as human and mouse through to the primitive chordates Ciona intestinalis and Ciona savignyi. Ensembl sites for the genomes of three key eukaryote model organisms, yeast (Saccharomyces cerevisiae), fruitfly (Drosophila melanogaster) and nematode (Caenorhabditis elegans), are provided to allow integration, primarily around predicted ortholog/paralog relationships, between these organisms and chordates within a common database environment. For these organisms, no automatic gene build is performed and instead gene annotation is imported from their respective model organism databases. Finally, a number of insect genomes are also available through Ensembl due to Ensembl's participation in the Vectorbase consortium (http://www.vectorbase.org/). Vectorbase is an NIH-NIAID Bioinformatics Resource Center for Invertebrate Vectors of Human Pathogens and is using the Ensembl platform to present key vector genomes. It is likely that in the future these vector genomes will only be available on the Vectorbase site, though Vectorbase will continue to use Ensembl software. This year's increase in the number of genomes provided by Ensembl is the largest so far (Figure 1). More than half the genomes in Ensembl are mammalian. In keeping with the focus on chordates, Ensembl has this year stopped providing a site for the honeybee (Apis mellifera) genome, despite having been involved in the initial analysis and having produced an initial gene set (5), just as the previous year it stopped providing a site for the second nematode Caenorhabditis briggsae. This is because access to these genomes is now provided by the dedicated Beebase (http://racerx00.tamu.edu/bee_resources.html) and Wormbase (6) model organism databases. In both cases the old Ensembl databases for these genomes remain accessible via Ensembl archive sites.
Ensembl provides a variety of ways to access these data to suit different audiences and types of use. The majority of researchers using Ensembl use the website (http://www.ensembl.org/) and can rapidly locate individual items of interest either by entering keywords or from the built-in sequence similarity search interface. For cases where researchers are working with sets of items, such as a particular class of genes, Ensembl provides data mining tools via the BioMart system (7). For bioinformaticians Ensembl provides access to all the data behind the Ensembl website both as downloadable datasets and by allowing programmatic access to databases hosted on the Ensembl site (ensembldb.ensembl.org). The latter is growing in popularity as complete database dumps become too large to download conveniently. Increasingly bioinformaticians are carrying out their own custom data analysis by accessing the databases remotely via the Perl language application programming interfaces (APIs) that Ensembl provides. Extensive documentation and tutorials are provided to help researchers get started programming with the APIs, as well as describing the database schemas (http://www.ensembl.org/info/software). Ensembl runs courses and training around the world, and has a full-time helpdesk and online tutorial materials (http://www.ensembl.org/info). Over the year courses have been held in the following countries: UK (16x), Austria (2x), Belgium (4x), Finland (2x), France (2x), Germany (2x), Hungary, Italy (6x), Spain (4x), USA (3x), Brazil (2x), South Africa (2x), Australia (2x) and Singapore. There are many practical details concerning the data processing algorithms developed and used by Ensembl and the overall system's design and operation. For detailed descriptions, researchers are referred to the series of papers published in 2004 that describe both technical aspects of the software implementation and the scientific aspects of the genome annotation system (7–16).
Whilst the system has evolved considerably since these articles were published, they provide a background to the technical documentation maintained on the Ensembl website and distributed with each software release. As an open data, open source software project, Ensembl encourages participation and discussion on development issues, mostly via the development email list (send "subscribe ensembl-dev" to majordomo{at}ebi.ac.uk to join). This email list is increasingly being used to exchange advice on API usage.
Ensembl continues to improve both in terms of the analysis of genome information and in its usability, via programmatic means and via web browsers. This paper details only some of the major improvements since the last report (17). For more comprehensive information about new features and data contained in the bi-monthly updates of Ensembl, researchers are also recommended to read the "What's New" pages accompanying every release (http://www.ensembl.org/Multi/newsview) and/or subscribe to the announce email list by sending "subscribe ensembl-announce" to majordomo{at}ebi.ac.uk.
| RESULTS |
|---|
Improvements to protein-coding genes
Providing expressed gene sets which are as accurate as possible is one of the major goals in Ensembl. Ensembl gene sets are all based on evidence from alignments of protein and cDNA sequences to genomic sequence. The completeness of each gene set depends on the amount of transcript data, either specific for the genome in question, or evolutionarily close enough to be reliably aligned. Gene set accuracy depends on alignment quality and being able to reconcile evidence, including detecting erroneous data from truncated and chimeric transcripts and identifying pseudogenes. This year's major improvements and changes to the Ensembl gene build systems and strategy cover the following three different situations for gene building: (i) the new class of low-sequence coverage mammalian genomes; (ii) high-coverage genomes which have little organism-specific transcript data; and (iii) the high-quality reference genomes of human and mouse.
New projection build pipeline for low-sequence coverage genomes
Ensembl has this year incorporated the first nine genomes that have been sequenced at low coverage [2x whole genome shotgun (WGS)]. These are elephant (Loxodonta africana), rabbit (Oryctolagus cuniculus), armadillo (Dasypus novemcinctus), tenrec (Echinops telfairi), cat (Felis catus), bush baby (Otolemur garnettii), common shrew (Sorex araneus), microbat (Myotis lucifugus) and European hedgehog (Erinaceus europaeus). In total 16 mammals will be sequenced at this coverage as part of the Mammalian Genome Project, funded by the National Institutes of Health (NIH). Currently, automatically generated gene sets are provided for the first four of these genomes and gene builds are in progress for the remaining five. The low sequence coverage of these genomes means that all of the normal problems experienced when predicting gene structures in draft genome assemblies (missing sequence, fragmentation, misassemblies, misplacements, small insertions/deletions/substitutions) are exacerbated. In particular, many genes will be represented only partially (or not at all) in the assembly, and many others (particularly those with large genomic extent) will be found in pieces, distributed across more than one scaffold. The standard Ensembl gene build pipeline (11), which relies on aligning expressed transcript sequences onto the genome sequence, is unsuitable as the main approach for annotating such a low-coverage genome.
To address this, a new gene building methodology for low-coverage genomes has been developed that relies on a whole genome alignment (WGA) to an annotated reference genome. The WGA underlying each annotated gene structure in the reference genome is used to infer gene-scaffolds: assemblies of scaffolds in the target genome that contain complete gene structures. WGAs are generated in-house using BLASTz (18), with the resulting set of local alignments processed into a form suitable for the above method using the Axt tools (19). The protein-coding transcripts of the reference gene structures are then projected through the WGA onto the implied gene-scaffolds in the target genome. These projections frequently contain small insertions/deletions with respect to the protein-coding transcript of the reference gene structure. In some cases this raw projection would result in a frame shift in the translation. In the vast majority of such cases the apparent frame shift is most likely the result of a sequence/assembly error in the target genome, resulting from the low sequence coverage. To correct for these apparent errors, the gene build introduces an artificial intron, 1 or 2 bp long, into the predicted gene structure at each point where a frame shift would otherwise be introduced. In this way the reading frame of the predicted transcripts is preserved. We refer to these introns as frame-shift introns. When the WGA implies that the sequence contains an internal exon missing from the assembly, and the location is consistent with an intra- or inter-scaffold gap, the exon is placed on the gap sequence. This results in a run of X's of the correct length in the translation. An example of this situation in the elephant (L.africana) genome is shown in Figure 2.
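The frame-shift correction described above can be illustrated with a minimal sketch. It assumes that each indel in the projection is reported as a signed length (positive for extra bases in the target, negative for bases missing from the target) together with the target position of the first base after the event. The function names, the coordinate convention and the choice to place the artificial intron directly at the indel are illustrative assumptions, not the Ensembl implementation.

```python
from typing import List, Tuple


def frameshift_intron_length(indel: int) -> int:
    """Length (0, 1 or 2 bp) of the artificial intron that restores the frame.

    indel > 0: extra bases in the target relative to the reference CDS.
    indel < 0: bases missing from the target.
    Python's modulo is non-negative for a positive divisor, so -1 % 3 == 2:
    one missing base is compensated by skipping two more, losing a whole codon.
    """
    return indel % 3


def split_exon(start: int, end: int,
               indels: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Split exon [start, end] (1-based, inclusive) at frame-breaking indels.

    indels holds (position, signed_length) pairs, where position is the first
    target base after the event (hypothetical convention for this sketch).
    The gaps between the returned pieces are the frame-shift introns.
    """
    pieces = []
    piece_start = start
    for pos, indel in sorted(indels):
        gap = frameshift_intron_length(indel)
        if gap == 0:
            continue  # indel is a multiple of 3: frame is already preserved
        pieces.append((piece_start, pos - 1))
        piece_start = pos + gap  # skip the 1 or 2 bp artificial intron
    pieces.append((piece_start, end))
    return pieces


if __name__ == "__main__":
    print(split_exon(100, 200, [(150, 1)]))
    print(split_exon(100, 200, [(150, -1)]))
```

In this sketch a single extra base produces a 1 bp intron and a single missing base produces a 2 bp intron, which matches the "1 or 2 bp long" description in the text.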
This strategy has been developed and piloted on the initial 3x coverage WGS assembly of the cow (Bos taurus) genome. The current cow assembly is 6x coverage and its gene set has been generated using the standard Ensembl pipeline. A new higher quality 7.1x coverage cow assembly, which incorporates sequence data from BAC libraries, has just been released. When the gene build on this assembly is complete it will be possible to carry out a more detailed assessment of the quality and value of gene builds on low-coverage genomes by comparison to standard gene builds on high-quality assemblies. In the meantime some statistics for exon coverage for the gene builds on the 3x and 2x assemblies are shown in Table 1. For all of these gene builds the human genome has been used as the reference, making side-by-side comparison possible. Since all these genomes are mammals, it is anticipated that the vast majority of human coding exons should be found in each. The number of exons annotated therefore gives an idea of the quality of gene builds possible given the genome assembly quality. The first two columns of Table 1 show the percentage of base pairs of human exons that can be aligned to the sequence of the target genome by the WGA BLASTz step and the percentage completely missed. These values are a few percentage points less than the theoretical expected percentage coverage for 2x and 3x WGS sequencing of 88% and 95%, respectively, but broadly in line with expectations (20). The last two columns show the equivalent figures for the final gene set after it has been filtered as a result of the gene build process and are significantly lower. In the current gene build process, after raw alignments are chained into gene-scaffolds and the best-in-genome match found, predicted gene structures are removed if >50% of the original exons are missing. 
This criterion is adopted because, in our view, such partial genes carry an unacceptable likelihood of error and are better discarded. Most likely, many of these partial genes result from assembly artifacts; however, this will be better understood after the cow gene set comparisons are carried out. In the meantime it is worth noting the significant difference between the final exon coverage for the initial 3x cow genome and the other four 2x genomes. It appears that dropping from 3x to 2x sequence coverage roughly doubles the fraction of exons missing from the resulting gene sets when the filtering we believe necessary to achieve acceptable gene set quality is applied.
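Two quantities from this discussion can be made concrete with a short sketch. The first function evaluates the simplest Poisson form of the Lander and Waterman model (20), in which the expected fraction of bases covered by at least one read at c-fold redundancy is 1 - exp(-c). This is a textbook approximation and is not necessarily the exact calculation behind the 88% and 95% quoted above. The second function encodes the filter stated in the text, under which a predicted gene is removed if more than 50% of the reference exons are missing. Both function names are hypothetical.

```python
import math


def expected_covered_fraction(fold_coverage: float) -> float:
    """Expected fraction of bases covered by at least one read.

    Simplest Poisson approximation: 1 - exp(-c) for c-fold coverage.
    """
    return 1.0 - math.exp(-fold_coverage)


def keep_gene(reference_exons: int, projected_exons: int) -> bool:
    """Apply the filter from the text: discard if >50% of exons are missing."""
    missing_fraction = (reference_exons - projected_exons) / reference_exons
    return missing_fraction <= 0.5


for c in (2, 3):
    print(f"{c}x coverage: {expected_covered_fraction(c):.1%} of bases expected")
```

Under this approximation the expected values are roughly 86% at 2x and 95% at 3x, so the 3x figure agrees with the text and the 2x figure is slightly below the quoted 88%.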
The code developed for the projection build has also been applied in the gene build for the new chimpanzee (Pan troglodytes) genome assembly, which has greatly improved the gene set. For chimpanzee the gene-scaffold generation step is skipped as the assembly is high coverage.
Improved pipeline for genomes that are evolutionarily distant from main sources of transcript data
The completeness of each gene set depends on the quantity of transcript data either specific for the genome in question, or evolutionarily close enough to be reliably aligned. Although there are a number of large chordate genomes with little organism-specific transcript data, most are mammals and so evolutionarily close to the huge amounts of transcript data from human and mouse. The current major exceptions are chicken (Gallus gallus) and opossum (Monodelphis domestica). Two gene sets have been produced on the chicken genome assembly and made available through the Ensembl website. For both gene builds the standard Ensembl gene build pipeline was used; however, for the most recent gene build (December 2005) the pipeline was significantly customized to improve gene set quality. Investigations showed that mapping distantly related cDNAs and ESTs onto the chicken genome using the standard pipeline was creating very extended gene structures, incorrectly linking transcripts of some adjacent genes. Gene building on vertebrate-sized genomes is computationally very expensive, so the pipeline contains optimizations to reduce the CPU cost. One such strategy is the creation of mini sequences (11), which remove regions thought to be intronic from the final alignment step. In the case of chicken, because of the low similarity of transcripts from other organisms being mapped to the genome sequence, this step was removing some exons and causing gene artefacts in some regions. Removing the mini sequence step and optimizing other parameters resulted in a greatly improved gene set. This process was in part developed in parallel for the opossum genome. Although opossum is nearer to other mammals than chicken, adopting this approach led to similar quality improvements. The downside is a 5-fold increase in the CPU cost of the gene build; however, this has been at least partly offset by optimizations to the genewise program (9) itself, which is used for this final alignment step.
Converging to the reference gene set for the high-quality reference genomes of human and mouse
Analysis showing the progressive and significant improvement in the quality of human and mouse gene sets generated by the Ensembl system was presented in last year's report (17). More recently a blind test assessment of gene set quality took place under the ENCODE project (21). The EGASP gene prediction competition (22) was held to compare a variety of automatic gene prediction methods with a curated and experimentally validated human gene set generated by the Sanger Havana group (3) as part of the Gencode consortium (23). The results confirmed the high accuracy of Ensembl gene predictions, with Ensembl ranked as the best or close to the best over a variety of different evaluation criteria. However, the best predicted transcript for each gene still differed from the annotated Gencode reference in 30% of cases. When all annotated alternative transcripts were considered, transcript accuracy was only 40–50%. Even allowing for errors in the Gencode set, this is a significant gap. For further details of the evaluation the reader is referred to the special issue of Genome Biology devoted to evaluation of the EGASP experiment, especially (22). While the EGASP evaluation will lead to some further improvements to the Ensembl gene build system for human and mouse, we recognize that there are likely to be limits on how much more the accuracy of automatic methods on these very high-quality genomes can be improved. Human and mouse already have extensive species-specific experimental data, and there are diminishing returns from each increase in the complexity of the gene build logic to handle the remaining hard classes of gene, such as dense clusters of duplicated genes.
For the human genome, Ensembl and Havana have for some time been collaborating as part of the CCDS consortium, which includes the RefSeq group at NCBI (2) and the UCSC genome group (1). CCDS is a stable set of protein-coding gene structures on which all consortium members agree to the base pair. The initial CCDS set contained 13 142 loci, which corresponded to 60% of the 22 000 protein-coding loci annotated in Ensembl being accepted (for further details and statistics concerning CCDS sets, see http://www.ncbi.nlm.nih.gov/CCDS/). The CCDS process is being extended to the mouse genome now that the assembly (NCBI build 36) is largely composed of finished sequence. While CCDS only addresses the CDS region of gene structures, the Gencode experience illustrates the value of making greater use of the full Havana curated annotation including UTRs, noncoding transcripts and pseudogenes. As a result the Havana and Ensembl gene build groups are now working towards a single integrated gene set that combines automatic and curated annotation, the first version of which was released in Ensembl 38 (April 2006). In this first iteration 12 000 Havana curated full length protein-coding transcripts were incorporated into Ensembl gene entries directly. Because of the problem of reconciling differences between slightly different annotations that probably represent the same transcript, this initial process led to some transcript duplication. We are working to progressively refine the merging process to address this. The process is providing improvements in both directions. In many cases Havana transcripts are more complete, as it is possible to set lower alignment thresholds since manual review filters out false positives. However, because it is very slow, manual annotation always risks being out of date, so Ensembl transcripts can also be longer when they have used more recent transcript evidence. Ensembl is also identifying regions where the quality of annotation from the automatic gene build pipeline is low so that they can be prioritized for manual curation by the Havana group. There is still a long way to go in this process as Havana annotation is only available for 50% of the human genome and 20% of the mouse genome; however, it is hoped that with this system of prioritization it should be possible to more rapidly improve the overall quality of human and mouse gene sets.
IDHistoryView and stable identifier discovery
A key utility of all gene sets, regardless of how they are generated, is stability of identifiers between releases. Ensembl manages to map the vast majority of stable identifiers between gene builds; however, changing assemblies, evidence, algorithms and logic inevitably lead to previously predicted genes being absent from a subsequent release, or to gene split and merge events. The Ensembl core schema now fully supports an archive of obsolete transcript entries, and a new view, IDHistoryView, has been introduced to allow the history of an identifier to be viewed. Using this interface it should be possible to discover the fate of any Ensembl gene stable identifier back to Ensembl 1.2 (2001) and see how the sequence of the gene structure has changed (the version of transcript and exon stable identifiers is increased if the sequence of that feature changes). The page contains links to Ensembl archive versions to allow users to view older gene structures as they were previously presented.
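The versioning rule stated above (a version increases only when the sequence of the feature changes, and features that disappear are kept in an archive) can be sketched as follows. This is a simplified illustration that assumes the mapping of identifiers between releases has already been done; the data layout, the function name `carry_forward` and the example identifiers are hypothetical and do not reflect the Ensembl core schema.

```python
from typing import Dict, Tuple

# stable identifier -> (version, sequence); a hypothetical in-memory layout
Release = Dict[str, Tuple[int, str]]


def carry_forward(previous: Release,
                  new_sequences: Dict[str, str]) -> Tuple[Release, Release]:
    """Build the next release and the archive of retired entries.

    An identifier keeps its version if the sequence is unchanged, gains one
    version if the sequence changed, and starts at version 1 if it is new.
    Identifiers absent from the new release are returned as the archive.
    """
    current: Release = {}
    for stable_id, sequence in new_sequences.items():
        if stable_id in previous:
            old_version, old_sequence = previous[stable_id]
            version = old_version if sequence == old_sequence else old_version + 1
        else:
            version = 1
        current[stable_id] = (version, sequence)
    archive = {sid: rec for sid, rec in previous.items()
               if sid not in new_sequences}
    return current, archive


if __name__ == "__main__":
    prev = {"T_EXAMPLE_1": (1, "ACGT"), "T_EXAMPLE_2": (3, "GGCC")}
    new = {"T_EXAMPLE_1": "ACGT", "T_EXAMPLE_3": "CCCC"}
    print(carry_forward(prev, new))
```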
Variation resources
In the previous report on the Ensembl project (17), major improvements to handle large scale variation data were described. One component of this was a software infrastructure to efficiently store resequencing data. This year we have exploited this system to take advantage of the extensive collection of mouse DNA sequence reads, including those recently released by Celera, data from dbSNP and resequencing data generated by Perlegen Sciences for the US National Institute of Environmental Health Sciences (NIEHS). These were processed at the Sanger Institute using the well-established SNP calling algorithm ssahaSNP (24) to compute >50 million SNPs from common laboratory Mus musculus strains, which were then merged with dbSNP release 126. The resulting data were incorporated into the Ensembl variation schema and a new transcript-centric display, TranscriptSNPView, was developed to show this variation in a strain-specific way (25). Figure 3 shows an example of this display. TranscriptSNPView is also available in dog (Canis familiaris) Ensembl to provide access to SNPs determined from 16 different strains as part of the dog sequencing project (26). As well as providing an organized view of these data through a web interface, the underlying data storage structure and variation software API make it easy for bioinformaticians to write custom programs to analyze these complex data. For example, the API makes it possible to view variation from the point of view of any strain, with all necessary coordinate transformations being carried out transparently. With resequencing data being generated at an ever increasing rate we anticipate incorporating strain-specific data for a variety of species in the future. Specifically, rat (Rattus norvegicus) and chicken (G.gallus) variation data will be available through TranscriptSNPView before the end of 2006.
Comparative genomics
The major improvement in comparative genomics has been the switch, in the June 2006 release, of the ortholog/paralog prediction pipeline from one based on best reciprocal similarity relationships to one based on protein tree calculations. This is a major change and has required a completely new pipeline, schema, API and display. In the new pipeline maximum-likelihood phylogenetic unrooted gene trees are built using the algorithm PHYML (27) from multiple protein sequence alignments generated using MUSCLE (28) for each gene family containing sequences from all species. Gene families are generated by calculating best reciprocal relationships between translations of all genes followed by single linkage clustering. Finally each gene tree is reconciled with the species tree using the RAL algorithm (29) to call duplication events on internal nodes and to root the tree. The advantage of a gene tree based pipeline is that, unlike best reciprocal methods, it is able to identify complex one-to-many and many-to-many relationships between genes resulting from ancient duplication events. The new structure of orthology/paralogy relationships permeates the entire Ensembl site; however, the biggest visual change is the new GeneTreeView display, as shown in Figure 4. Predicted gene trees of course suffer from the prediction algorithm's parameters not being ideal for all cases and from bad sequence alignments introducing errors, just as automatic gene annotation does: in a proportion of cases the predicted tree will be worse than could be obtained by manual curation. As a result a similar relationship between Ensembl automation and curation is developing for gene trees as for gene sets. Ensembl has started to collaborate with the TreeFam project (30), a curated resource of gene trees, and is currently investigating ways to integrate available curated data into the automatic pipeline.
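The family-building step mentioned above, single linkage clustering over best reciprocal relationships, amounts to finding connected components in a graph whose edges are the reciprocal pairs. The sketch below does this with a union-find structure. It is an illustration of the general technique only: the gene names are invented, and the pairs are assumed to have been computed beforehand.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def single_linkage_families(genes: Iterable[str],
                            links: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group genes into families: any chain of links joins two genes."""
    parent: Dict[str, str] = {g: g for g in genes}

    def find(g: str) -> str:
        parent.setdefault(g, g)  # tolerate genes seen only in links
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving keeps trees shallow
            g = parent[g]
        return g

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    families: Dict[str, Set[str]] = defaultdict(set)
    for g in list(parent):
        families[find(g)].add(g)
    # Sort for a deterministic result
    return sorted(families.values(), key=lambda fam: sorted(fam))


if __name__ == "__main__":
    genes = ["hsA", "mmA", "ggA", "hsB", "mmB", "hsC"]  # invented names
    reciprocal_pairs = [("hsA", "mmA"), ("mmA", "ggA"), ("hsB", "mmB")]
    for family in single_linkage_families(genes, reciprocal_pairs):
        print(sorted(family))
```

Note that hsA and ggA end up in the same family even though they are not directly linked, which is the defining property of single linkage. Each resulting family would then be the input to the alignment and tree-building steps.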
One of the hidden consequences of switching to this more robust predictive model is the ability to use the more reliable implied functional relationships between genes to propagate functional labeling between them. Previously Ensembl genes were described as "Known" if there was some known functional description attached to supporting organism-specific transcription sequence (such as from UniProt). If the only annotated supporting evidence was from another organism, the gene would be labeled as "Novel". From the February 2006 release, a third category of genes was introduced called "Known by projection". For these genes, a functional description has been projected from GO terms via the orthology mapping. Although these are predictions, we are confident that the annotation is sufficiently accurate to be very useful.
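The three labels described above form a simple decision rule, sketched below. The function name and the two boolean inputs are hypothetical, and the order of the checks (organism-specific evidence taking precedence over a projected description) is an assumption made for this illustration.

```python
def gene_status(has_species_specific_description: bool,
                has_projected_description: bool) -> str:
    """Label a gene following the three categories described in the text."""
    if has_species_specific_description:
        return "Known"
    if has_projected_description:
        # Description carried over from an ortholog via the orthology mapping
        return "Known by projection"
    return "Novel"


if __name__ == "__main__":
    print(gene_status(False, True))
```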
Web usability and web service integration
Following the major redesign of the website reported last year (17), this has been a year of consolidation for the website: adding completely new views, such as TranscriptSNPView, GeneTreeView and IDHistoryView, and working on backend infrastructure that will allow new functionality, such as user logins, to be added over the coming year.
A major ongoing issue for the website is improving interactivity. One improvement that at least partly addresses this is a drag and zoom functionality that has been added to ContigView using JavaScript, as shown in Figure 5. This greatly improves the interactivity of the display; however, full scrolling functionality, as popularized by sites such as maps.google.com, will require the adoption of new web technologies for which extensive redevelopment is required. As a step towards this, some of the information displayed in the popup menus on the ContigView pages is now fetched asynchronously using the AJAX (Asynchronous JavaScript and XML) technique. Previously, descriptions were not included at all in these popups since they would have made the pages too large, so this is added functionality made possible by the new approach.
| FUTURE DIRECTIONS |
|---|
There are a number of areas in which Ensembl anticipates developments over the coming year. The increase in the number of 2x genomes will require continuous development of both gene building and comparative genomics pipelines. In comparative genomics we are also concentrating on providing other clade-specific multiple alignments, starting with the teleosts. We will be collaborating with the TreeFam group (30) to provide better gene level comparative genomics across the genomes we handle. We envisage a steady growth of whole genome assays of DNA-binding proteins using techniques such as chromatin immunoprecipitation on DNA microarrays (ChIP/chip) and other functional studies of genome sequence over the next year. We see ArrayExpress (31) as a natural archive of experimental results such as ChIP/chip, but Ensembl as the display and integration engine; our goal is to move beyond just the display of the ChIP/chip results towards providing a Regulatory Build integrating appropriate information. Finally we foresee growth in the variation data both in human and in other species, in particular resequencing data. We hope to integrate more of the resequencing information currently available in the trace archive (http://trace.ensembl.org/, http://www.ncbi.nlm.nih.gov/Traces/) with many of the species in Ensembl and present it in a friendly manner.
| ACKNOWLEDGEMENTS |
|---|
The Ensembl project is principally funded by the Wellcome Trust with additional funding from EMBL, NIH-NIAID and BBSRC. We are grateful to Gudmundur Arni Thorisson, to users of our website and to the developers on our mailing lists for much useful feedback and discussion. Funding to pay the Open Access publication charges for this article was provided by the Wellcome Trust.
Conflict of interest statement. None declared.
| REFERENCES |
|---|
- Hinrichs, A.S., Karolchik, D., Baertsch, R., Barber, G.P., Bejerano, G., Clawson, H., Diekhans, M., Furey, T.S., Harte, R.A., Hsu, F., et al. (2006) The UCSC Genome Browser Database: update 2006 Nucleic Acids Res, . 34, D590D598
- Wheeler, D.L., Barrett, T., Benson, D.A., Bryant, S.H., Canese, K., Chetvernin, V., Church, D.M., DiCuccio, M., Edgar, R., Federhen, S., et al. (2006) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 34, D173-D180.
- Ashurst, J.L., Chen, C.K., Gilbert, J.G., Jekosch, K., Keenan, S., Meidl, P., Searle, S.M., Stalker, J., Storey, R., Trevanion, S., et al. (2005) The Vertebrate Genome Annotation (Vega) database. Nucleic Acids Res., 33, D459-D465.
- Wu, C.H., Apweiler, R., Bairoch, A., Natale, D.A., Barker, W.C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., et al. (2006) The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res., 34, D187-D191.
- The Honeybee Genome Sequencing Consortium. (2006) Insights into social insects from the genome of the honeybee Apis mellifera. Nature, 443, 931-949.
- Schwarz, E.M., Antoshechkin, I., Bastiani, C., Bieri, T., Blasiar, D., Canaran, P., Chan, J., Chen, N., Chen, W.J., Davis, P., et al. (2006) WormBase: better software, richer content. Nucleic Acids Res., 34, D475-D478.
- Kasprzyk, A., Keefe, D., Smedley, D., London, D., Spooner, W., Melsopp, C., Hammond, M., Rocca-Serra, P., Cox, T., Birney, E. (2004) EnsMart: a generic system for fast and flexible access to biological data. Genome Res., 14, 160-169.
- Birney, E., Andrews, D.T., Bevan, P., Caccamo, M., Chen, Y., Clarke, L., Coates, G., Cuff, J., Curwen, V., Cutts, T., et al. (2004) An overview of Ensembl. Genome Res., 14, 925-928.
- Birney, E., Clamp, M., Durbin, R. (2004) GeneWise and genomewise. Genome Res., 14, 988-995.
- Cuff, J.A., Coates, G.M., Cutts, T.J., Rae, M. (2004) The Ensembl computing architecture. Genome Res., 14, 971-975.
- Curwen, V., Eyras, E., Andrews, T.D., Clarke, L., Mongin, E., Searle, S.M., Clamp, M. (2004) The Ensembl automatic gene annotation system. Genome Res., 14, 942-950.
- Eyras, E., Caccamo, M., Curwen, V., Clamp, M. (2004) ESTGenes: alternative splicing from ESTs in Ensembl. Genome Res., 14, 976-987.
- Potter, S.C., Clarke, L., Curwen, V., Keenan, S., Mongin, E., Searle, S.M., Stabenau, A., Storey, R., Clamp, M. (2004) The Ensembl analysis pipeline. Genome Res., 14, 934-941.
- Searle, S.M., Gilbert, J., Iyer, V., Clamp, M. (2004) The otter annotation system. Genome Res., 14, 963-970.
- Stabenau, A., McVicker, G., Melsopp, C., Proctor, G., Clamp, M., Birney, E. (2004) The Ensembl core software libraries. Genome Res., 14, 929-933.
- Stalker, J., Gibbins, B., Meidl, P., Smith, J., Spooner, W., Hotz, H.R., Cox, A.V. (2004) The Ensembl website: mechanics of a genome browser. Genome Res., 14, 951-955.
- Birney, E., Andrews, D., Caccamo, M., Chen, Y., Clarke, L., Coates, G., Cox, T., Cunningham, F., Curwen, V., Cutts, T., et al. (2006) Ensembl 2006. Nucleic Acids Res., 34, D556-D561.
- Schwartz, S., Kent, W.J., Smit, A., Zhang, Z., Baertsch, R., Hardison, R.C., Haussler, D., Miller, W. (2003) Human-mouse alignments with BLASTZ. Genome Res., 13, 103-107.
- Kent, W.J., Baertsch, R., Hinrichs, A., Miller, W., Haussler, D. (2003) Evolution's cauldron: duplication, deletion and rearrangement in the mouse and human genomes. Proc. Natl Acad. Sci. USA, 100, 11484-11489.
- Lander, E.S. and Waterman, M.S. (1988) Genomic mapping by fingerprinting random clones: a mathematical analysis. Genomics, 2, 231-239.
- The ENCODE Project Consortium. (2004) The ENCODE (ENCyclopedia Of DNA Elements) Project. Science, 306, 636-640.
- Guigó, R., Flicek, P., Abril, J.F., Reymond, A., Lagarde, J., Denoeud, F., Antonarakis, S., Ashburner, M., Bajic, V.B., Birney, E., et al. (2006) EGASP: the human ENCODE Genome Annotation Assessment Project. Genome Biol., 7, S2.
- Harrow, J., Denoeud, F., Frankish, A., Reymond, A., Chen, C.K., Chrast, J., Lagarde, J., Gilbert, J.G., Storey, R., Swarbreck, D., et al. (2006) GENCODE: producing a reference annotation for ENCODE. Genome Biol., 7, Suppl. 1, S4.1-S4.9.
- Ning, Z., Cox, A.J., Mullikin, J.C. (2001) SSAHA: a fast search method for large DNA databases. Genome Res., 11, 1725-1729.
- Cunningham, F., Rios, D., Griffiths, M., Smith, J., Ning, Z., Cox, T., Flicek, P., Marin-Garcia, P., Herrero, J., Rogers, J., et al. (2006) TranscriptSNPView: a genome-wide catalog of mouse coding variation. Nature Genet., 38, 853.
- Lindblad-Toh, K., Wade, C.M., Mikkelsen, T.S., Karlsson, E.K., Jaffe, D.B., Kamal, M., Clamp, M., Chang, J.L., Kulbokas, E.J., III, Zody, M.C., et al. (2005) Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature, 438, 803-819.
- Guindon, S. and Gascuel, O. (2003) A simple, fast and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol., 52, 696-704.
- Edgar, R.C. (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics, 5, 113.
- Dufayard, J.F., Duret, L., Penel, S., Gouy, M., Rechenmann, F., Perriere, G. (2005) Tree pattern matching in phylogenetic trees: automatic search for orthologs or paralogs in homologous gene sequence databases. Bioinformatics, 21, 2596-2603.
- Li, H., Coghlan, A., Ruan, J., Coin, L.J., Heriche, J.K., Osmotherly, L., Li, R., Liu, T., Zhang, Z., Bolund, L., et al. (2006) TreeFam: a curated database of phylogenetic trees of animal gene families. Nucleic Acids Res., 34, D572-D580.
- Brazma, A., Kapushesky, M., Parkinson, H., Sarkans, U., Shojatalab, M. (2006) Data storage and analysis in ArrayExpress. Meth. Enzymol., 411, 370-386.