Nucleic Acids Research, 2001, Vol. 29, No. 1 159-164
© 2001 Oxford University Press
The TIGR Gene Indices: analysis of gene transcript sequences in highly sampled eukaryotic species
The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA
Received October 12, 2000; Accepted October 17, 2000.
| ABSTRACT |
|---|
While genome sequencing projects are advancing rapidly, EST sequencing and analysis remains a primary research tool for the identification and categorization of gene sequences in a wide variety of species and an important resource for annotation of genomic sequence. The TIGR Gene Indices (http://www.tigr.org/tdb/tgi.shtml) are a collection of species-specific databases that use a highly refined protocol to analyze EST sequences in an attempt to identify the genes represented by that data and to provide additional information regarding those genes. Gene Indices are constructed by first clustering, then assembling EST and annotated gene sequences from GenBank for the targeted species. This process produces a set of unique, high-fidelity virtual transcripts, or Tentative Consensus (TC) sequences. The TC sequences can be used to provide putative genes with functional annotation, to link the transcripts to mapping and genomic sequence data, to provide links between orthologous and paralogous genes and as a resource for comparative sequence analysis.
| INTRODUCTION |
|---|
The sequencing of eukaryotic genomes is progressing at an astonishing rate. The genome of the fruit fly, Drosophila melanogaster, was published in the spring of 2000; that of Arabidopsis thaliana, a plant model organism, has recently been completed; a draft-quality human sequence is now available; and the sequencing of mouse, rat and rice is well under way. However, for many organisms of scientific, economic or agricultural interest, genomic sequencing is unlikely to be completed in the foreseeable future, and the sequencing of Expressed Sequence Tags (ESTs) (1) remains the primary tool for genomic exploration and for functional genomics projects. There are nearly 5 000 000 ESTs in GenBank (nearly half of which are human), and the number of species represented by 50 000 or more ESTs has increased dramatically in the past year (http://www.ncbi.nlm.nih.gov/dbEST/dbEST_summary.html).
Even for completed genomes, EST data remain a crucial tool for gene identification, genomic annotation and comparative genomics. Regardless of how ESTs are ultimately used, their value can be significantly enhanced if the data are used to reconstruct a high-fidelity set of non-redundant transcripts. A number of publicly available databases attempt to provide such analysis for some species, including UniGene (2) and STACK (3). However, the TIGR Gene Indices (4) are unique in the number of species surveyed, in the approach used to construct the individual species-specific databases and in the manner in which they can be used.
The TIGR Gene Indices provide an analysis for humans, experimental models of human disease such as mouse and rat, valuable crop plants and other important experimental organisms sampled extensively by EST sequencing. TIGR Gene Indices are maintained for 21 species, including the 15 most heavily sampled organisms, potato and five parasitic eukaryotes that are currently the subject of genomic sequencing projects. The current state of EST sequencing and a summary of the currently available TIGR Gene Indices can be found in Table 1.
To create the TIGR Gene Indices, we have developed a highly refined, rigorously tested protocol for cleaning, clustering and assembling ESTs and gene sequences to produce high-fidelity consensus sequences for the represented genes while eliminating low quality, misclustered or chimaeric sequences (5). This has several advantages over competing approaches: it separates closely related genes into distinct consensus sequences, it separates splice variants and it produces longer representations of the underlying gene sequences. The resulting Tentative Consensus (TC) sequences can be used for eukaryotic genome sequence annotation (6,7), integration of complex mapping data and identification of orthologous genes.
| CONSTRUCTION OF THE GENE INDICES |
|---|
Each Gene Index is assembled using an identical process. For each species, EST sequences are downloaded from dbEST and trimmed to remove vector, polyA/T tails, adaptor sequences and contaminating bacterial sequences. Gene sequences (NP sequences) are parsed through Entrez from CDS and CDS-join features in GenBank records; additional Expressed Transcript (ET) sequences are obtained from the TIGR EGAD database (http://www.tigr.org/tdb/egad/egad.html).
EST and gene sequences are then compared using FLAST, a rapid sequence comparison program based on DDS (8), in which query sequences are first concatenated and then searched against a nucleotide database. Sequences sharing a minimum of 95% identity over a 40 nt or longer region with <20 bases of mismatched sequence at either end are grouped into a cluster. Each cluster is then assembled separately. For each cluster, component EST, NP and ET sequences are downloaded and these sequences are then assembled using CAP3 (9) to produce TCs. Assembly produces one or more consensus sequences for each cluster and rejects any chimeric, low-quality and non-overlapping sequences. Each cluster is assembled in the same fashion until the entire set of clusters has been exhausted. The resulting set of TCs is loaded into the appropriate species-specific Gene Index database for annotation.
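The clustering criterion above can be illustrated with a short sketch. It is not the FLAST-based pipeline itself: the `Overlap` record, its field names and the use of transitive (single-linkage) grouping via union-find are assumptions made for illustration, and only the three thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Overlap:
    """One pairwise match between two sequences (hypothetical record layout)."""
    seq_a: str
    seq_b: str
    identity: float       # percent identity over the matched region
    length: int           # length of the matched region in nucleotides
    overhang_start: int   # mismatched bases before the matched region
    overhang_end: int     # mismatched bases after the matched region


def passes(o, min_identity=95.0, min_length=40, max_overhang=20):
    """Apply the thresholds quoted in the text to one overlap."""
    return (o.identity >= min_identity
            and o.length >= min_length
            and o.overhang_start < max_overhang
            and o.overhang_end < max_overhang)


def cluster(seq_ids, overlaps):
    """Group sequences linked, directly or transitively, by passing overlaps.

    Transitive grouping is an assumption of this sketch.
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving keeps trees shallow
            x = parent[x]
        return x

    for o in overlaps:
        if passes(o):
            root_a, root_b = find(o.seq_a), find(o.seq_b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for s in seq_ids:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(g) for g in groups.values())


# Made-up identifiers and match statistics, for illustration only.
ids = ["EST1", "EST2", "EST3", "NP1"]
hits = [
    Overlap("EST1", "EST2", 98.0, 120, 0, 5),
    Overlap("EST2", "NP1", 96.5, 60, 10, 0),
    Overlap("EST3", "NP1", 90.0, 200, 0, 0),   # fails the identity threshold
]
clusters = cluster(ids, hits)
```

Here EST1, EST2 and NP1 end up in one cluster because EST2 links the other two, while EST3 remains a singleton. Each resulting cluster would then be passed to the assembler separately.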
Following assembly, TCs are annotated to provide a provisional functional assignment. A TC containing a known gene is assigned the function of that gene; TCs without assigned functions are searched using DPS (8) against a non-redundant protein database; high-scoring hits are assigned a putative function. For the Human, Mouse and Rat Gene Indices, mapping locations are assigned by using e-PCR (10). The resulting Gene Index is released through the TIGR web site (http://www.tigr.org/tbd/tdb.html); an example THC from the Human Gene Index is shown in Figure 1.
Gene Indices can be searched by TC number, the GenBank Accession number of any EST contained within the dataset or any ET used to build the Index. Users can perform a tissue-based search in which the library information in EST records is used to generate an electronic northern blot, identifying the tissue-specificity of expression based on the relative EST abundance. DNA and protein sequences can also be used to search the Gene Indices using WU-BLAST (http://www.tigr.org/cgi-bin/BlastSearch/blast_tgi.cgi), a gapped BLAST program developed by Warren Gish (Washington University, St Louis, MO).
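The idea behind the electronic northern can be sketched as follows. The text says only that tissue specificity is inferred from relative EST abundance; normalizing each tissue's count by the total number of ESTs sampled from that tissue is an assumption of this sketch, as are the function name and the example numbers.

```python
from collections import Counter


def electronic_northern(est_tissues, tissue_totals):
    """Return {tissue: fraction of that tissue's ESTs that belong to one TC}.

    est_tissues   -- tissue label of each EST in the TC (from library records)
    tissue_totals -- total number of ESTs sampled per tissue (assumed known)
    """
    counts = Counter(est_tissues)
    profile = {}
    for tissue, n in counts.items():
        total = tissue_totals.get(tissue, 0)
        if total > 0:               # skip tissues with no sampling information
            profile[tissue] = n / total
    return profile


# Made-up numbers: a TC built from three liver ESTs and one brain EST.
profile = electronic_northern(
    ["liver", "liver", "liver", "brain"],
    {"liver": 1000, "brain": 4000},
)
top_tissue = max(profile, key=profile.get)
```

Without the normalization step, heavily sequenced tissues would dominate every profile regardless of expression level, which is why the sketch divides by library size.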
The TIGR Gene Indices and the component TC assemblies are maintained within Sybase relational databases that allow versioning and heritability to be maintained. Each time a new version of the database is created, novel assemblies, caused by either the joining or splitting of previous TCs, are assigned a new, unique TC identifier. Previously-used identifiers are never reused and information regarding previous assemblies is never lost. Database queries using a TC identifier from a previous build return the most current version of that assembly. This allows assemblies to evolve as more data are available while providing tracking from build to build and maintaining functional assignments across multiple releases.
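The tracking behaviour described above can be mimicked with a small in-memory registry. This is only a sketch under stated assumptions: the class, its method names and the identifier format are hypothetical and do not describe the Sybase schema. It shows identifiers that are never reused and old identifiers that resolve to the current assemblies.

```python
class TCRegistry:
    """Toy registry (hypothetical) tracking TC identifiers across builds."""

    def __init__(self, prefix="TC"):
        self._prefix = prefix
        self._next = 1
        self._current = set()     # identifiers valid in the latest build
        self._successors = {}     # retired identifier -> replacing identifiers

    def new_id(self):
        tc = f"{self._prefix}{self._next}"
        self._next += 1           # the counter only grows, so ids are never reused
        self._current.add(tc)
        return tc

    def _retire(self, old, new_ids):
        self._current.discard(old)
        self._successors[old] = list(new_ids)

    def join(self, old_ids):
        """Merge several TCs into one novel assembly with a fresh identifier."""
        new = self.new_id()
        for old in old_ids:
            self._retire(old, [new])
        return new

    def split(self, old_id, parts):
        """Split one TC into `parts` novel assemblies, each with a fresh identifier."""
        new_ids = [self.new_id() for _ in range(parts)]
        self._retire(old_id, new_ids)
        return new_ids

    def resolve(self, tc):
        """Follow successor links from any identifier to the current assemblies."""
        if tc in self._current:
            return [tc]
        resolved = []
        for successor in self._successors.get(tc, []):
            for current in self.resolve(successor):
                if current not in resolved:
                    resolved.append(current)
        return resolved


registry = TCRegistry()
a, b = registry.new_id(), registry.new_id()   # TC1, TC2 in the first build
merged = registry.join([a, b])                # TC3 replaces both in the next build
parts = registry.split(merged, 2)             # TC4 and TC5 replace TC3 later on
```

A query with the retired identifier `TC1` resolves to `TC4` and `TC5`, so annotation attached to an old identifier can still be traced to the current assemblies.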
| IDENTIFICATION OF ORTHOLOGS AND PARALOGS |
|---|
The pending completion of the human and Arabidopsis genome sequences represents a significant scientific achievement and sets the stage for the sequencing of other plant and animal genomes, including mouse, rat and rice. These data promise unprecedented opportunities for functional and evolutionary studies, including the identification and functional annotation of genes and non-coding regulatory regions. The utility of such analysis depends on the identification of homologous genes across species and the integration of data from a wide range of organisms. Homologous genes can be separated into two classes, orthologs and paralogs (11). Orthologs are homologous genes that perform the same biological function in different species but that have diverged in sequence due to evolutionary separation; paralogs are homologous genes within a species that are the result of a gene duplication event within the lineage. The study of orthologs is of particular importance because it is assumed that these genes play similar developmental or physiological roles and, consequently, should share conserved functional and regulatory domains.
While genome sequencing will provide a significant quantity of data, for many species, ESTs provide the primary source of gene sequence data. We have developed two separate approaches to the identification and representation of orthologous gene sequence data: the TIGR Orthologous Gene Alignment (TOGA) database and sequence-based genome alignments.
The TOGA database was introduced in January 2000 and represents the first attempt to identify orthologs using the gene and EST sequence resources. At present, TOGA is divided into separate sections for mammals and plants; the mammalian section consists of orthologs from human, mouse, rat and cattle while the plant section includes Arabidopsis, rice, tomato, potato, Medicago, soybean and maize. While the comparison of millions of ESTs from these species represents a significant computational challenge, this task is vastly simplified through the use of the TCs that comprise the TIGR Gene Indices.
For each species to be included in TOGA, the TCs contained within the respective Gene Indices are compared pairwise. Tentative Ortholog Groups (TOGs) are identified by requiring reciprocal best hits across three or more species with a minimum of 75% identity over a length of 400 bp or more for any single sequence match. High-scoring hits that did not meet the reciprocal best hit criteria, but which matched members of existing TOGs using the same criteria, were classified as Tentative Paralogs. Using these criteria, 8300 TOGs were identified containing TCs from three or more of the four mammalian species and 3074 from among the eight plant species surveyed. The distribution of species represented in TOGA is summarized in Table 2. An example mammalian TOG can be seen in Figure 2.
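The reciprocal best hit criterion can be sketched in a few lines. The identity and length thresholds and the three-species requirement come from the text; the hit tuple layout, the use of an alignment score to rank hits and the merging of reciprocal pairs into groups by connected components are assumptions made for illustration.

```python
from collections import defaultdict


def best_hits(hits, min_identity=75.0, min_length=400):
    """Map (query, subject_species) to the best-scoring qualifying subject.

    Each hit is (query, subject, identity, length, score), where query and
    subject are (species, tc_id) tuples (hypothetical layout).
    """
    best = {}
    for query, subject, identity, length, score in hits:
        if identity < min_identity or length < min_length:
            continue
        if query[0] == subject[0]:
            continue                        # only cross-species comparisons
        key = (query, subject[0])
        if key not in best or score > best[key][1]:
            best[key] = (subject, score)
    return {key: value[0] for key, value in best.items()}


def reciprocal_pairs(best):
    """Keep pairs in which each TC is the other's best hit in its species."""
    pairs = set()
    for (query, _species), subject in best.items():
        if best.get((subject, query[0])) == query:
            pairs.add(frozenset((query, subject)))
    return pairs


def tentative_ortholog_groups(hits, min_species=3):
    """Merge reciprocal pairs into groups and keep those spanning enough species."""
    neighbours = defaultdict(set)
    for pair in reciprocal_pairs(best_hits(hits)):
        a, b = tuple(pair)
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:                        # depth-first walk of one component
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component
        if len({species for species, _ in component}) >= min_species:
            groups.append(sorted(component))
    return groups


# Made-up hits: H1, M1 and R1 are mutual best hits; H2 matches M1 one-way only.
H1, H2, M1, R1 = ("human", "H1"), ("human", "H2"), ("mouse", "M1"), ("rat", "R1")
example_hits = [
    (H1, M1, 88.0, 900, 1500), (M1, H1, 88.0, 900, 1500),
    (H1, R1, 86.0, 850, 1400), (R1, H1, 86.0, 850, 1400),
    (M1, R1, 93.0, 950, 1700), (R1, M1, 93.0, 950, 1700),
    (H2, M1, 80.0, 500, 700),
]
togs = tentative_ortholog_groups(example_hits)
```

In this example H2 passes the identity and length thresholds against M1 but is not M1's best human hit, so it is left out of the group; under the scheme described above, such a sequence would be a candidate Tentative Paralog.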
Like the TIGR Gene Indices, TOGA is a relational database that maintains the TOGs as accessionable objects that can be tracked across subsequent releases. TOGs can be searched either by name, which allows users to enter a gene name and look for approximate matches, or by sequence using WU-BLAST (12). TOGA can be found at http://www.tigr.org/tdb/toga/toga.shtml. More information regarding WU-BLAST can be found at http://blast.wustl.edu.
Additional interspecies information can be gained by examining the alignment of the EST and gene sequence data in the TIGR Gene Indices with reference to plant and animal genomes. Using the completed genome of Arabidopsis we tabulated the alignment of the TCs from the various TIGR Plant Gene Indices with the chromosomal sequence (http://www.tigr.org/tdb/at/alignTC.html). An example alignment in the region of an annotated gene on Chromosome II can be seen in Figure 3. We have completed a similar analysis using the recently published sequences of the long arms of human chromosomes 21 and 22.
| USING THE TIGR GENE INDICES |
|---|
Effective use of genomic resources for functional, comparative and evolutionary studies will rely on developing an accurate catalog of the genes encoded within each species as well as tools for cross-referencing various genomes of interest. The TIGR Gene Indices and the TOGA database represent an effort to provide such a resource by first attempting to identify and annotate the genes in a variety of organisms and then providing mechanisms to link to candidate orthologs in other species.
There are a variety of means by which a user might gain entry to the TIGR Gene Indices. For example, the radiation hybrid mapping data allow users to search for TC sequences that map to a candidate genomic region. Other users may search for TCs that appear to be expressed in a tissue-specific fashion or that contain ESTs from a particular disease state. However, the most common entry point is the sequence search page (http://www.tigr.org/cgi-bin/BlastSearch/blast_tgi.cgi). Both BLASTN and TBLASTN versions of the WU-BLAST package have been implemented, allowing DNA and protein queries to be used. Alignments to high-scoring TCs and singleton ESTs in the organism searched are returned, and clicking on a TC number or EST ID brings the user to a display of the corresponding sequence, similar to that in Figure 1. These TCs can be used to identify TOGs in the TOGA database or to search the genomic sequence alignments.
In addition to the Web interface, the TIGR Gene Indices are available as flat files. The TC consensus sequences are provided in a FASTA format file; the ESTs comprising each TC are specified in a separate file. Many users involved in the annotation of genomic sequence and in analysis of cDNA microarray data have found these to be particularly useful.
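For users working with the flat files, a minimal reader for FASTA-formatted consensus sequences might look like the sketch below. It relies only on the general FASTA convention of a `>` header line followed by sequence lines; the record identifiers and description text in the example are invented, and the layout of the separate file listing component ESTs is not described here, so it is not parsed.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue                         # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)              # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


# Invented records standing in for a TC consensus file.
example = ">TC0001 hypothetical description\nACGTAC\nGTTA\n>TC0002\nGGCC\n"
records = list(read_fasta(io.StringIO(example)))
lengths = {header.split()[0]: len(seq) for header, seq in records}
```

The same generator works on an open file object, so a large consensus file can be streamed record by record rather than loaded into memory at once.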
| CONCLUSIONS |
|---|
An increasing number of species are being subjected to genomic analysis, rapidly increasing the pace of gene discovery and accelerating functional genomics applications. For most species, EST sequencing remains the primary method of genomic sequence analysis. The TIGR Gene Indices, which represent the most comprehensive, publicly available analysis of EST sequences, have expanded significantly in the past year, adding 10 additional species-specific databases. In addition, we have expanded both the scope and utility of the resources we provide for cross-species comparisons through the TOGA database and genomic sequence alignments.
The TIGR Gene Indices have proven invaluable for annotation of genomic sequence and for functional analysis of ESTs. They are available via a free license for academic and non-profit use; commercial licenses are available for a fee. Parties interested in obtaining a license should visit http://www.tigr.org/tdb/license.html or email license@tigr.org.
| ACKNOWLEDGEMENTS |
|---|
The authors are indebted to A. Glodek for database development. The authors also wish to thank M. Heaney and S. Lo for database support, and V. Sapiro, B. Lee, S. Gregory, R. Kramchedu, C. Irwin, M. Sengamalay and E. Arnold for computer system support. This work was supported by the US Department of Energy, grant DE-FG02-99ER62852 and the US National Science Foundation, grant DBI-9983070. Additional support was provided by the US National Science Foundation through grants DBI-9813392 and DBI-9975866. J.Q. was supported in part by NSF grant KDI-9980088.
| FOOTNOTES |
|---|
* To whom correspondence should be addressed. Tel: +1 301 838 3528; +1 301 838 0208; Email: johnq@tigr.org
Present address: Feng Liang, Life Technologies, Rockville, MD 20850, USA
| REFERENCES |
|---|
1 Adams,M.D., Kelley,J.M., Gocayne,J.D., Dubnick,M., Polymeropoulos,M.H.M., Xiao,H., Merril,C.R., Wu,A., Olde,B., Moreno,R.F. et al. (1991) Complementary DNA sequencing: expressed sequence tags and human genome project. Science, 252, 1651-1656.
2 Boguski,M.S. and Schuler,G.D. (1995) ESTablishing a human transcript map. Nature Genet., 10, 369-371.
3 Burke,J., Wang,H., Hide,W. and Davison,D.B. (1998) Alternative gene form discovery and candidate gene selection from gene indexing projects. Genome Res., 8, 276-290.
4 Quackenbush,J., Liang,F., Holt,I., Pertea,G. and Upton,J. (2000) The TIGR Gene Indices: reconstruction and representation of expressed gene sequences. Nucleic Acids Res., 28, 141-145.
5 Liang,F., Holt,I., Pertea,G., Karamycheva,S., Salzberg,S.L. and Quackenbush,J. (2000) An optimized protocol for analysis of EST sequences. Nucleic Acids Res., 28, 3657-3665.
6 Lin,X., Kaul,S., Rounsley,S., Shea,T.P., Benito,M.I., Town,C.D., Fujii,C.Y., Mason,T., Bowman,C.L., Barnstead,M. et al. (1999) Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana. Nature, 402, 761-768.
7 Liang,F., Holt,I., Pertea,G., Karamycheva,S., Salzberg,S.L. and Quackenbush,J. (2000) Gene index analysis of the human genome estimates approximately 120,000 genes. Nature Genet., 25, 239-240.
8 Huang,X., Adams,M.D., Zhou,H. and Kerlavage,A.R. (1997) A Tool for Analyzing and Annotating Genomic Sequence. Genomics, 46, 37-45.
9 Huang,X. and Madan,A. (1999) CAP3: A DNA sequence assembly program. Genome Res., 9, 868-877.
10 Schuler,G.D. (1997) Sequence mapping by electronic PCR. Genome Res., 7, 541-550.
11 Fitch,W.M. (1970) Distinguishing homologous from analogous proteins. Syst. Zool., 19, 99-113.
12 Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403-410.
||||
![]() |
E. L. Long, T. S. Sonstegard, J. A. Long, C. P. Van Tassell, and K. A. Zuelke Serial Analysis of Gene Expression in Turkey Sperm Storage Tubules in the Presence and Absence of Resident Sperm Biol Reprod, August 1, 2003; 69(2): 469 - 474. [Abstract] [Full Text] [PDF] |
||||
![]() |
V. Brusic, R. S. Pillai, D. G. Silva, N. Petrovsky, RIKEN GER Group, GSL Members, and C. Schonbach Cytokine-Related Genes Identified From the RIKEN Full-Length Mouse cDNA Data Set Genome Res., June 1, 2003; 13(6): 1307 - 1317. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Kasukawa, M. Furuno, I. Nikaido, H. Bono, D. A. Hume, C. Bult, D. P. Hill, R. Baldarelli, J. Gough, A. Kanapin, et al. Development and Evaluation of an Automated Annotation Pipeline and cDNA Annotation System Genome Res., June 1, 2003; 13(6): 1542 - 1551. [Abstract] [Full Text] [PDF] |
||||
![]() |
W. Zhu, S. D. Schlueter, and V. Brendel Refined Annotation of the Arabidopsis Genome by Complete Expressed Sequence Tag Mapping Plant Physiology, June 1, 2003; 132(2): 469 - 484. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Mergaert, K. Nikovics, Z. Kelemen, N. Maunoury, D. Vaubert, A. Kondorosi, and E. Kondorosi A Novel Family in Medicago truncatula Consisting of More Than 300 Nodule-Specific Genes Coding for Small, Secreted Polypeptides with Conserved Cysteine Motifs Plant Physiology, May 1, 2003; 132(1): 161 - 173. [Abstract] [Full Text] [PDF] |
||||
![]() |
M.-T. Navarro-Gochicoa, S. Camut, A. Niebel, and J. V. Cullimore Expression of the Apyrase-Like APY1 Genes in Roots of Medicago truncatula Is Induced Rapidly and Transiently by Stress and Not by Sinorhizobium meliloti or Nod Factors Plant Physiology, March 1, 2003; 131(3): 1124 - 1136. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. Sorek and H. M. Safer A novel algorithm for computational identification of contaminated EST libraries Nucleic Acids Res., February 1, 2003; 31(3): 1067 - 1074. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. M. Ronning, S. S. Stegalkina, R. A. Ascenzi, O. Bougri, A. L. Hart, T. R. Utterbach, S. E. Vanaken, S. B. Riedmuller, J. A. White, J. Cho, et al. Comparative Analyses of Potato Expressed Sequence Tag Libraries Plant Physiology, February 1, 2003; 131(2): 419 - 429. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Rudd, H.-W. Mewes, and K. F.X. Mayer Sputnik: a database platform for comparative plant genomics Nucleic Acids Res., January 1, 2003; 31(1): 128 - 132. [Abstract] [Full Text] [PDF] |
||||
![]() |
E.-P. Journet, D. van Tuinen, J. Gouzy, H. Crespeau, V. Carreau, M.-J. Farmer, A. Niebel, T. Schiex, O. Jaillon, O. Chatagnier, et al. Exploring root symbiotic programs in the model legume Medicago truncatula using EST analysis Nucleic Acids Res., December 15, 2002; 30(24): 5579 - 5592. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. S. Tanaka, T. Kunath, W. L. Kimber, S. A. Jaradat, C. A. Stagg, M. Usuda, T. Yokota, H. Niwa, J. Rossant, and M. S.H. Ko Gene Expression Profiling of Embryo-Derived Stem Cells Reveals Candidate Genes Associated With Pluripotency and Lineage Specificity Genome Res., December 1, 2002; 12(12): 1921 - 1928. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. H. Ware, P. Jaiswal, J. Ni, I. V. Yap, X. Pan, K. Y. Clark, L. Teytelman, S. C. Schmidt, W. Zhao, K. Chang, et al. Gramene, a Tool for Grass Genomics Plant Physiology, December 1, 2002; 130(4): 1606 - 1613. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Ayoubi, X. Jin, S. Leite, X. Liu, J. Martajaja, A. Abduraham, Q. Wan, W. Yan, E. Misawa, and R. A. Prade PipeOnline 2.0: automated EST processing and functional data sorting Nucleic Acids Res., November 1, 2002; 30(21): 4761 - 4769. [Abstract] [Full Text] [PDF] |
||||
![]() |
V. Bourdon, F. Naef, P. H. Rao, V. Reuter, S. C. Mok, G. J. Bosl, S. Koul, V. V. V. S. Murty, R. S. Kucherlapati, and R. S. K. Chaganti Genomic and Expression Analysis of the 12p11-p12 Amplicon Using EST Arrays Identifies Two Novel Amplified and Overexpressed Genes Cancer Res., November 1, 2002; 62(21): 6218 - 6223. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Rep, H. L. Dekker, J. H. Vossen, A. D. de Boer, P. M. Houterman, D. Speijer, J. W. Back, C. G. de Koster, and B. J.C. Cornelissen Mass Spectrometric Identification of Isoforms of PR Proteins in Xylem Sap of Fungus-Infected Tomato Plant Physiology, October 1, 2002; 130(2): 904 - 917. [Abstract] [Full Text] [PDF] |
||||
![]() |
N. Osato, M. Itoh, H. Konno, S. Kondo, K. Shibata, P. Carninci, T. Shiraki, A. Shinagawa, T. Arakawa, S. Kikuchi, et al. A Computer-Based Method of Selecting Clones for a Full-Length cDNA Project: Simultaneous Collection of Negligibly Redundant and Variant cDNAs Genome Res., July 1, 2002; 12(7): 1127 - 1134. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Lee, R. Sultana, G. Pertea, J. Cho, S. Karamycheva, J. Tsai, B. Parvizi, F. Cheung, V. Antonescu, J. White, et al. Cross-Referencing Eukaryotic Genomes: TIGR Orthologous Gene Alignments (TOGA) Genome Res., March 1, 2002; 12(3): 493 - 502. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. Skrabanek and F. Campagne TissueInfo: high-throughput identification of tissue expression profiles and specificity Nucleic Acids Res., November 1, 2001; 29(21): e102 - e102. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Hegde, R. Qi, R. Gaspard, K. Abernathy, S. Dharap, J. Earle-Hughes, C. Gay, N. U. Nwokekeh, T. Chen, A. I. Saeed, et al. Identification of Tumor Markers in Models of Human Colorectal Cancer Using a 19,200-Element Complementary DNA Microarray Cancer Res., November 1, 2001; 61(21): 7792 - 7797. [Abstract] [Full Text] [PDF] |
||||