Nucleic Acids Research Advance Access published online on October 8, 2008

Nucleic Acids Research, doi:10.1093/nar/gkn709

© 2008 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.


Database Issue

AthaMap, integrating transcriptional and post-transcriptional data

Lorenz Bülow1, Stefan Engelmann2, Martin Schindler2 and Reinhard Hehl1,*

1Institut für Genetik, Technische Universität Braunschweig, Spielmannstr. 7, D-38106 Braunschweig and 2Software Systems Engineering Institute, Technische Universität Braunschweig, Mühlenpfordtstr. 23, D-38106 Braunschweig, Germany

*To whom correspondence should be addressed. Tel: +49 531 391 5772; Fax: +49 531 391 5765; Email: r.hehl@tu-braunschweig.de

Received September 1, 2008. Revised September 29, 2008. Accepted September 29, 2008.


    ABSTRACT
 
The AthaMap database generates a map of predicted transcription factor binding sites (TFBS) for the whole Arabidopsis thaliana genome. AthaMap has now been extended to include data on post-transcriptional regulation. A total of 403 173 genomic positions of small RNAs have been mapped in the A. thaliana genome. These identify 5772 putative post-transcriptionally regulated target genes. AthaMap tools have been modified to improve the identification of common TFBS in co-regulated genes by subtracting post-transcriptionally regulated genes from such analyses. Furthermore, AthaMap was updated to the TAIR7 genome annotation, a graphic display of gene analysis results was implemented, and the TFBS data content was increased. AthaMap is freely available at http://www.athamap.de/.


    INTRODUCTION
 
A large number of databases are available for database-assisted gene-expression analysis (1). The first level of gene-expression regulation is transcription, which is controlled by the synchronized binding of transcription factors (TFs) to adjacent cis-regulatory sequences. The bioinformatic identification of cis-regulatory sequences is an important tool for predicting target genes of specific TFs (2). To this end, the AthaMap database was developed. AthaMap generates a genome-wide map of predicted transcription factor binding sites (TFBS) and cis-regulatory elements for Arabidopsis thaliana (3,4). Compared to similar databases such as AGRIS, Athena and ATTED-II (5–8), AthaMap covers the whole-genome sequence and includes predicted TFBS that were identified with positional weight matrices. Recently, plant-related contents of the transcription and promoter databases TRANSFAC and TRANSPRO (9,10) were integrated with plant proteome and pathway data into the platform BKL Plant (BIOBASE Knowledge Library). This was combined with the previously reported ExPlain tool, which screens promoter regions with positional weight matrices for TFBS and evaluates the results using the ‘Composite Module Analyst’ (CMA) as its core component (11,12). This commercial product integrates promoter and pathway analysis of gene-expression data (BIOBASE, Wolfenbüttel, Germany).

In contrast, AthaMap is in the public domain and provides online tools to display TFBS in user-selected genes or at specific genomic positions (3). The detection of combinatorial elements and their target genes allows the prediction of co-regulated genes (13). The gene analysis function detects common TFBS in user-provided genes (14). A short user manual has been published recently (15), and all tools are also explained on the ‘Description’ page of the AthaMap website. AthaMap has been linked with PathoPlant, a database on plant–pathogen interactions (16). Arabidopsis thaliana microarray experiments in PathoPlant can be screened for co-regulated genes that respond to up to three different stimuli (17). A list of co-regulated genes can be exported directly to AthaMap for identification of common TFBS. However, not all differentially expressed genes are transcriptionally regulated (18). One important factor in post-transcriptional regulation is the expression of small RNAs such as miRNA, siRNA and ta-siRNA (19). Although there are distinct pathways to generate these types of small RNAs, the resulting molecules are very similar in size and represent the small RNA transcriptome of the organism (20). Using a massively parallel sequencing approach, small RNA transcriptome data became available for seedlings and inflorescence tissue of A. thaliana (21). The genome-wide nature of AthaMap and the availability of small RNA data provide a unique opportunity to combine transcriptional and post-transcriptional data in a single database. This may add significantly to the quality of the identification of cis-regulatory sequences involved in transcriptional regulation.


    ANNOTATION OF GENOMIC POSITIONS OF SMALL RNAS
 
Sequence signatures (17-mers) derived from a small RNA transcriptome analysis of A. thaliana inflorescence tissue and seedlings were used for genomic screenings (21). The complete lists of screening sequences (accession numbers GSM65747 and GSM65750) were downloaded from NCBI's Gene Expression Omnibus (GEO) repository (22). Genomic positions were determined using a Perl script that screens for perfect matches of all 109 590 small RNA 17-mer screening sequences within the five chromosomes of A. thaliana. Absolute positions and orientations of small RNA matches from inflorescence tissue and seedlings were annotated to AthaMap, resulting in a total of 403 173 genomic matches. For screening sequences yielding more than one genomic match, the corresponding loci were determined. A total of 5772 genes were predicted to be post-transcriptionally regulated by small RNAs because their transcribed regions are targets of at least one small RNA in antisense orientation. A text file with the genome identifiers of the 5772 predicted target genes of small RNAs can be downloaded from the documentation page at AthaMap.
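The perfect-match screening described above can be illustrated with a minimal Python sketch. This is not the authors' Perl script; the function names, the 1-based coordinate convention and the '+'/'-' strand labels are assumptions made for illustration only. The sketch looks for exact occurrences of a signature on the given strand and, via its reverse complement, on the opposite strand.

```python
from typing import Iterator, Tuple

# Translation table for complementing DNA bases (hypothetical helper).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_perfect_matches(chromosome: str, signature: str) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start, strand) for every exact occurrence of the signature.

    '+' means the signature itself occurs in the given sequence; '-' means its
    reverse complement occurs, i.e. the signature matches the opposite strand.
    """
    chromosome = chromosome.upper()
    signature = signature.upper()
    for strand, query in (("+", signature), ("-", reverse_complement(signature))):
        start = chromosome.find(query)
        while start != -1:
            yield start + 1, strand
            # Advance by one base so that overlapping occurrences are reported.
            start = chromosome.find(query, start + 1)
```

Calling list(find_perfect_matches(sequence, signature)) for each chromosome and each 17-mer would give the kind of position and orientation list that the text describes. For 109 590 signatures an indexed approach (for example a dictionary of all genomic 17-mers) would scale better than repeated string searches.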

Genomic positions of small RNAs are displayed in AthaMap analogous to TFBSs and are symbolized as xxxxx>. The arrow head gives the orientation of the small RNA. A tool tip box appears when moving over the arrow indicating the absolute genomic position and screening library of the small RNA. Selecting the name adjacent to this symbol will open a new window giving additional information. Figure 1 shows a partial screen shot of position 11 911 on chromosome 1 with a small RNA from the inflorescence library, the tool tip box and the associated pop-up window. This new window shows the screening sequence, corresponding genomic positions for this particular small RNA and the reference.


 
Figure 1. Small RNA binding sites in the Arabidopsis thaliana genome. Partial screen shot of the sequence display window with a small RNA binding site at position 11 911 on chromosome 1. The tool tip box indicates the absolute genomic position and screening library. A pop-up window with additional information on the small RNA is also shown.

 
Putative post-transcriptionally regulated genes are identified within the Colocalization and Gene Analysis functions. These genes are tagged on the result pages with an italicized genome identifier. They can be subtracted in the Colocalization and Gene Analysis functions by activating the checkbox ‘exclude genes regulated by smallRNA’ in order to restrict the analyses exclusively to transcriptionally regulated genes.
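The target prediction and the exclusion option can be sketched as follows. The record layouts, field names and gene identifiers are hypothetical, all records are assumed to lie on one chromosome, and a match is assumed to count only if it falls entirely within the transcribed region on the strand opposite to the gene; AthaMap's own criteria may differ in detail.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Gene:
    identifier: str
    start: int   # first base of the transcribed region (1-based, inclusive)
    end: int     # last base of the transcribed region (inclusive)
    strand: str  # '+' or '-'


@dataclass(frozen=True)
class SmallRnaMatch:
    start: int        # 1-based start of the genomic match
    strand: str       # strand on which the small RNA sequence matches
    length: int = 17  # signature length used in the screening


def predicted_targets(genes: Iterable[Gene], matches: Iterable[SmallRnaMatch]) -> Set[str]:
    """Return identifiers of genes hit by at least one antisense small RNA match."""
    matches = list(matches)
    targets: Set[str] = set()
    for gene in genes:
        for match in matches:
            match_end = match.start + match.length - 1
            antisense = match.strand != gene.strand
            inside = gene.start <= match.start and match_end <= gene.end
            if antisense and inside:
                targets.add(gene.identifier)
                break  # one antisense match is enough
    return targets


def exclude_small_rna_targets(gene_ids: Iterable[str], targets: Set[str]) -> List[str]:
    """Keep only genes that are not predicted small RNA targets."""
    return [gene_id for gene_id in gene_ids if gene_id not in targets]
```

The second function mirrors the effect of the ‘exclude genes regulated by smallRNA’ checkbox: the remaining gene list is the one passed on to the search for common TFBS.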


    UPDATE TO TAIR7
 
The recent publication of the TAIR7 A. thaliana genome release motivated the implementation of this genome annotation in AthaMap (23). The annotation of the gene structure is based on five chromosomal XML flatfiles downloaded from the TAIR web site (release 7). These files were parsed using a Perl script, and positional information for 5'- and 3'-UTRs, exons and introns was annotated to AthaMap. These regions are displayed in AthaMap with a colour code similar to the one used by TAIR. Due to the significantly increased number of genes with an annotated transcription start site (TSS) in TAIR7, the Gene Analysis and Colocalization functions of AthaMap have been changed to show positions of TFBS relative to the TSS of the nearest gene. This applies to 23 222 (73.1%) genes, while for the remaining 8540 (26.9%) genes results are still displayed relative to the translation start site. In earlier versions of AthaMap, all positions were shown relative to the translation start site as point of reference. Compared to TAIR5, the previous version annotated to AthaMap, the nucleotide sequence of the A. thaliana genome in TAIR7 was not changed. Therefore, the positional information of all previously determined TFBS remained constant, except for TATA boxes. The number of annotated TATA boxes decreased from 16 277 (13) to currently 15 955 because, for genes lacking a TSS, a larger upstream region was screened for putative TATA boxes than for genes with an annotated TSS (3), and more genes now have an annotated TSS. The lower number of TATA boxes therefore results from the elimination of false positives.
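The choice of reference point can be sketched in Python as below. The function and parameter names are hypothetical, and the sketch assumes that the reference position itself is counted as 0 and that upstream positions are negative; the exact numbering convention used by AthaMap is not stated here.

```python
from typing import Optional, Tuple


def relative_position(
    site_pos: int,
    gene_strand: str,
    tss: Optional[int],
    translation_start: int,
) -> Tuple[int, str]:
    """Return (offset, reference label) for a binding site.

    The TSS is used when it is annotated; otherwise the translation start
    site serves as the fallback reference. Negative offsets are upstream.
    """
    if tss is not None:
        reference, label = tss, "TSS"
    else:
        reference, label = translation_start, "translation start"
    offset = site_pos - reference
    if gene_strand == "-":
        # On the reverse strand, upstream lies at higher genomic coordinates.
        offset = -offset
    return offset, label
```

For example, a site at genomic position 930 upstream of a forward-strand gene with a TSS at 1000 is reported as –70 relative to the TSS, while the same site for a gene without an annotated TSS is reported relative to the translation start site.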


    GRAPHIC DISPLAY OF GENE ANALYSIS RESULTS
 
The Gene Analysis function of AthaMap generates long lists with positional information on TFBSs in all genes investigated (14). Although overviews or summaries of the data can be displayed, the positional information is difficult to perceive. Therefore, a graphic display of TFBS in the analysed gene region was implemented that enables easy comparison between genes and visual identification of common binding site patterns. Every TF family as well as the small RNAs and combinatorial elements are identified with a different colour and their display can be selected individually. Figure 2 shows the web interface with the buttons to select the TF families and a graphic display of TFBS for selected TF family members in the Arabidopsis genes At2g42530 and At2g42540. Also shown is a tool tip box that opens when the mouse pointer moves over the colour-coded TFBS. The tool tip box gives additional information for the TF that identified this particular TFBS. Factor (RAV1) and factor family (AP2/EREBP) are identified as well as the position relative to the TSS (–70). For TFBS identified with positional weight matrices, threshold score, maximum score and score of the binding site are given (3).
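The idea of comparing binding site patterns between genes can be expressed compactly in code. The following Python sketch is only an illustration with a hypothetical data layout (a mapping from gene identifier to a list of TF family and relative position pairs); it does not reproduce the AthaMap implementation or its graphic display.

```python
from typing import Dict, List, Set, Tuple

# (TF family, position relative to the reference point of the gene)
Site = Tuple[str, int]


def common_families(sites_by_gene: Dict[str, List[Site]]) -> Set[str]:
    """Return the TF families that have at least one site in every gene."""
    family_sets = [{family for family, _ in sites} for sites in sites_by_gene.values()]
    if not family_sets:
        return set()
    return set.intersection(*family_sets)
```

Families returned by common_families are the candidates a user would look for visually in the colour-coded display, where each family is drawn in its own colour across the analysed gene regions.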


 
Figure 2. Graphic display of transcription factor and small RNA binding sites. Partial screen shot of the gene analysis tool with the checkboxes for TF families included in a graphic display and the graphic display of the upstream region of the genes At2g42530 and At2g42540. A tool tip box with additional information on one of the TFBS is also shown.

 

    DATA INCREASE
 
Recently published binding sites for the Arabidopsis TFs TAC1, RAP2.2 and MYB98 were annotated to AthaMap (24–26). These factors belong to the C2H2(Zn), AP2/EREBP and MYB TF families, respectively. Detection and annotation of single binding sites were performed as described earlier (4). Binding sites for two TFs for which positional weight matrices could be generated were annotated as well. These are the factors STF1 and SPL1, which belong to the bZIP and SBP TF families, respectively (27,28). Detection and annotation of matrix-based binding sites were performed as described earlier (3). AthaMap now harbours 9 998 736 predicted TFBSs.
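Matrix-based detection reports a score for each site together with a threshold and a maximum score. The Python sketch below shows a generic additive weight-matrix scan under simplifying assumptions: the matrix is a list of per-position weight dictionaries, only the given strand is scanned, and the toy weights are invented. The actual matrices, scoring scheme and thresholds used by AthaMap are described in reference 3 and are not reproduced here.

```python
from typing import Dict, List, Tuple

# One {base: weight} dictionary per motif position (hypothetical layout).
Matrix = List[Dict[str, float]]


def max_score(matrix: Matrix) -> float:
    """Highest score any sequence can reach with this matrix."""
    return sum(max(column.values()) for column in matrix)


def scan(sequence: str, matrix: Matrix, threshold: float) -> List[Tuple[int, float]]:
    """Return (1-based start, score) for windows scoring at or above threshold."""
    sequence = sequence.upper()
    width = len(matrix)
    hits: List[Tuple[int, float]] = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        # Skip windows containing symbols the matrix does not cover (e.g. N).
        if any(base not in column for base, column in zip(window, matrix)):
            continue
        score = sum(column[base] for base, column in zip(window, matrix))
        if score >= threshold:
            hits.append((i + 1, score))
    return hits
```

With a toy three-column matrix, scan(sequence, matrix, threshold) returns the start position and score of every window that reaches the threshold, and max_score(matrix) gives the upper bound against which each score can be compared.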


    FUNDING
 
German Federal Ministry for Education and Research through GABI-ADVANCIS (BMBF 0315037B). Funding for open access charge: Technical University of Braunschweig.

Conflict of interest statement. None declared.


    ACKNOWLEDGEMENTS
 
We would like to thank Anne-Kareen Blechert for help implementing the TAIR7 genome annotation and for TFBS screenings.


    REFERENCES
 

  1. Hehl R, Bülow L. Internet resources for gene expression analysis in Arabidopsis thaliana. Curr. Genomics (2008) 9:375–380.

  2. Hehl R, Wingender E. Database-assisted promoter analysis. Trends in Plant Sci. (2001) 6:251–255.

  3. Steffens NO, Galuschka C, Schindler M, Bülow L, Hehl R. AthaMap: an online resource for in silico transcription factor binding sites in the Arabidopsis thaliana genome. Nucleic Acids Res. (2004) 32:D368–D372.

  4. Bülow L, Steffens NO, Galuschka C, Schindler M, Hehl R. AthaMap: from in silico data to real transcription factor binding sites. In Silico Biol. (2006) 6:0023.

  5. Davuluri RV, Sun H, Palaniswamy SK, Matthews N, Molina C, Kurtz M, Grotewold E. AGRIS: Arabidopsis Gene Regulatory Information Server, an information resource of Arabidopsis cis-regulatory elements and transcription factors. BMC Bioinformatics (2003) 4:25.

  6. O'Connor TR, Dyreson C, Wyrick JJ. Athena: a resource for rapid visualization and systematic analysis of Arabidopsis promoter sequences. Bioinformatics (2005) 21:4411–4413.

  7. Palaniswamy SK, James S, Sun H, Lamb RS, Davuluri RV, Grotewold E. AGRIS and AtRegNet. a platform to link cis-regulatory elements and transcription factors into regulatory networks. Plant Physiol. (2006) 140:818–829.

  8. Obayashi T, Kinoshita K, Nakai K, Shibaoka M, Hayashi S, Saeki M, Shibata D, Saito K, Ohta H. ATTED-II: a database of co-expressed genes and cis elements for identifying co-regulated gene groups in Arabidopsis. Nucleic Acids Res. (2007) 35:D863–D869.

  9. Matys V, Fricke E, Geffers R, Gossling E, Haubrock M, Hehl R, Hornischer K, Karas D, Kel AE, Kel-Margoulis OV, et al. TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic Acids Res. (2003) 31:374–378.

  10. Chen X, Wu JM, Hornischer K, Kel A, Wingender E. TiProD: the Tissue-specific Promoter Database. Nucleic Acids Res. (2006) 34:D104–D107.

  11. Kel A, Voss N, Jauregui R, Kel-Margoulis O, Wingender E. Beyond microarrays: finding key transcription factors controlling signal transduction pathways. BMC Bioinformatics (2006) 7(Suppl 2):S13.

  12. Kel A, Konovalova T, Waleev T, Cheremushkin E, Kel-Margoulis O, Wingender E. Composite Module Analyst: a fitness-based tool for identification of transcription factor binding site combinations. Bioinformatics (2006) 22:1190–1197.

  13. Steffens NO, Galuschka C, Schindler M, Bülow L, Hehl R. AthaMap web tools for database-assisted identification of combinatorial cis-regulatory elements and the display of highly conserved transcription factor binding sites in Arabidopsis thaliana. Nucleic Acids Res. (2005) 33:W397–W402.

  14. Galuschka C, Schindler M, Bülow L, Hehl R. AthaMap web-tools for the analysis and identification of co-regulated genes. Nucleic Acids Res. (2007) 35:D857–D862.

  15. Hehl R. The Handbook of Plant Functional Genomics: Concepts and Protocols. Kahl G, Meksem K, eds. (2008) Weinheim, Germany: Wiley and Sons Ltd. 337–346.

  16. Bülow L, Schindler M, Choi C, Hehl R. PathoPlant®: a database on plant-pathogen interactions. In Silico Biol. (2004) 4:529–536.

  17. Bülow L, Schindler M, Hehl R. PathoPlant®: a platform for microarray expression data to analyze co-regulated genes involved in plant defense responses. Nucleic Acids Res. (2007) 35:D841–D845.

  18. Cheadle C, Fan J, Cho-Chung YS, Werner T, Ray J, Do L, Gorospe M, Becker KG. Control of gene expression during T cell activation: alternate regulation of mRNA transcription and mRNA stability. BMC Genomics (2005) 6:75.

  19. Jones-Rhoades MW, Bartel DP, Bartel B. MicroRNAs and their regulatory roles in plants. Annu. Rev. Plant Biol. (2006) 57:19–53.

  20. Vaucheret H. Post-transcriptional small RNA pathways in plants: mechanisms and regulations. Genes Dev. (2006) 20:759–771.

  21. Lu C, Tej SS, Luo S, Haudenschild CD, Meyers BC, Green PJ. Elucidation of the small RNA component of the transcriptome. Science (2005) 309:1567–1569.

  22. Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Edgar R. NCBI GEO: mining tens of millions of expression profiles—database and tools update. Nucleic Acids Res. (2007) 35:D760–D765.

  23. Swarbreck D, Wilks C, Lamesch P, Berardini TZ, Garcia-Hernandez M, Foerster H, Li D, Meyer T, Muller R, Ploetz L, et al. The Arabidopsis Information Resource (TAIR): gene structure and function annotation. Nucleic Acids Res. (2008) 36:D1009–D1014.

  24. Ren S, Mandadi KK, Boedeker AL, Rathore KS, McKnight TD. Regulation of telomerase in Arabidopsis by BT2, an apparent target of TELOMERASE ACTIVATOR1. Plant Cell (2007) 19:23–31.

  25. Welsch R, Maass D, Voegel T, Dellapenna D, Beyer P. Transcription factor RAP2.2 and its interacting partner SINAT2: stable elements in the carotenogenesis of Arabidopsis leaves. Plant Physiol. (2007) 145:1073–1085.

  26. Punwani JA, Rabiger DS, Drews GN. MYB98 positively regulates a battery of synergid-expressed genes encoding filiform apparatus localized proteins. Plant Cell (2007) 19:2557–2568.

  27. Song YH, Yoo CM, Hong AP, Kim SH, Jeong HJ, Shin SY, Kim HJ, Yun DJ, Lim CO, Bahk JD, et al. DNA-binding study identifies C-box and hybrid C/G-box or C/A-box motifs as high-affinity binding sites for STF1 and LONG HYPOCOTYL5 proteins. Plant Physiol. (2008) 146:1862–1877.

  28. Liang X, Nazarenus TJ, Stone JM. Identification of a consensus DNA-binding site for the Arabidopsis thaliana SBP domain transcription factor, AtSPL14, and binding kinetics by surface plasmon resonance. Biochemistry (2008) 47:3645–3653.

