
Nucleic Acids Research, 2004, Vol. 32, Database issue D78-D81
© 2004 Oxford University Press

DBTSS, DataBase of Transcriptional Start Sites: progress report 2004

Yutaka Suzuki*,1, Riu Yamashita1,2, Sumio Sugano1 and Kenta Nakai1,2

1 Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan, and 2 Undergraduate Program for Bioinformatics and Systems Biology, Faculty of Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan

*To whom correspondence should be addressed. Tel: +81 3 5449 5343; Fax: +81 3 5449 5416; Email: ysuzuki{at}ims.u-tokyo.ac.jp
Received September 15, 2003; Revised and Accepted October 1, 2003

DDBJ/EMBL/GenBank accession nos BP192706–BP383670


    ABSTRACT
 
DBTSS (http://dbtss.hgc.jp) was originally constructed as a collection of experimentally determined transcriptional start sites (TSSs) of human genes. Since its first release in 2002, it has been updated several times. First, the amount of stored data has increased significantly: e.g. the number of clones that match both the RefSeq mRNA set and the genome sequence has increased from 111 382 to 190 964, now covering 11 234 genes. Second, the positions of SNPs in dbSNP are now displayed on the upstream regions of the human genes in the database. Third, DBTSS now covers other species such as mouse and the human malaria parasite. It is intended to become a central database containing data for many more species obtained with oligo-capping and related methods. Lastly, the database now supports comparative promoter analyses: in the current version, comparative views of potentially orthologous promoters from human and mouse are presented, with an additional function for searching potential transcription-factor binding sites that are either conserved or diverged between the species.


    INTRODUCTION
 
Knowledge of the exact transcriptional start sites (TSSs) of genes is valuable in many ways: it makes the prediction of translational start sites more accurate; it can be used for exploring sequence determinants of TSSs; and it makes the analysis of upstream regulatory regions (promoters) more precise. In principle, information on a TSS is obtained by mapping the corresponding transcript onto the genome sequence. Nevertheless, it is widely known that many mRNA sequences stored in public databases lack information about their 5' ends because of the difficulty in obtaining full-length cDNAs. Thus, even after the completion of human genome sequencing, it is not easy to locate TSSs systematically. To overcome this problem, we have developed a method to construct full-length enriched cDNA libraries using a cap selection technique, the oligo-capping method, and have been systematically collecting full-length cDNA data with this method [(1); T.Ota et al., submitted]. Initial computational characterization of human TSSs has been carried out (2,3), and a database [DataBase of Transcriptional Start Sites (DBTSS)] containing the TSS information of 7889 human genes has been constructed (4). In this report, we summarize the updates to DBTSS since its first release, including its new role as a basis for comparative promoter analyses.


    NEW FEATURES
 
Compared with its initial version, the current DBTSS (version 3) has been upgraded in at least five ways. First, the number of processed one-pass human cDNA clones has increased significantly (from 217 402 to 400 225). Since one of the important findings from our TSS analysis was that the TSS position of a gene is not always fixed but often fluctuates over ~50 bp on average (3), the distribution of TSS positions should become clearer as the number of mapped cDNA clones increases. As before, we constructed a so-called RefFull sequence set (11 234 sequences) by extending the 5'-end sequences of RefSeq mRNA sequences (5) where necessary. Of these, 6042 sequences were extended, by 71.6 bp on average. At the genomic level, the average distance between the 5'-ends in the two data sets is 4396 bp because of intervening introns. Thus, our data make promoter analysis of human genes considerably easier. For more detailed statistics on DBTSS, see the Statistics section of the DBTSS web page.
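As a rough illustration of why more mapped clones sharpen the picture of TSS fluctuation, the following sketch tallies the 5'-end positions of the clones mapped to one gene. The function name, the toy coordinates and the choice of the most frequent position as the representative TSS are assumptions made for illustration only; the text does not state how DBTSS selects a representative TSS.

```python
from collections import Counter


def summarize_tss(five_prime_ends):
    """Summarize the 5'-end genomic positions of clones mapped to one gene.

    Hypothetical helper: picks the most frequent position as the
    representative TSS (ties go to the smallest coordinate).
    """
    counts = Counter(five_prime_ends)
    representative = min(counts, key=lambda pos: (-counts[pos], pos))
    spread = max(five_prime_ends) - min(five_prime_ends)
    return {
        "representative": representative,
        "spread_bp": spread,
        "counts": dict(counts),
    }


# Toy data: 5'-end coordinates of seven clones from the same gene.
clone_ends = [1000, 1003, 1003, 1010, 1003, 1042, 998]
summary = summarize_tss(clone_ends)
print(summary["representative"], summary["spread_bp"])
```

With only two or three clones, the tally would say little about the shape of the distribution; each additional clone adds one more observation to the histogram.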

Second, to facilitate promoter analysis of human genes, we mapped the positions of single nucleotide polymorphisms (SNPs) stored in a public database, dbSNP (5), onto the –1000:+200 region around the representative TSS of each human gene (a sample output is shown in Fig. 1). These SNPs are candidates for functional regulatory SNPs (rSNPs) that affect promoter activity. We also plan to add SNP data from other sources. In DBTSS, it is also possible to list the names of genes located within a specified distance of each SNP.
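The SNP mapping described above can be sketched as a strand-aware coordinate conversion followed by a window filter. This is a minimal sketch, not the DBTSS implementation: the function names and toy coordinates are hypothetical, and it assumes the common convention that the TSS is position +1 with no position 0, and that all features lie on the same chromosome.

```python
def relative_to_tss(pos, tss, strand):
    """Convert a genomic position to a TSS-relative coordinate.

    Assumed convention: the TSS itself is +1 and the base immediately
    upstream is -1 (there is no position 0).
    """
    offset = (pos - tss) if strand == "+" else (tss - pos)
    return offset + 1 if offset >= 0 else offset


def snps_in_promoter(snps, tss, strand, upstream=1000, downstream=200):
    """Return (snp_id, relative_position) for SNPs inside -upstream:+downstream."""
    hits = []
    for snp_id, pos in snps:
        rel = relative_to_tss(pos, tss, strand)
        if -upstream <= rel <= downstream:
            hits.append((snp_id, rel))
    return sorted(hits, key=lambda hit: hit[1])


def genes_near(snp_pos, gene_tss, max_distance):
    """List genes whose TSS lies within max_distance bp of a SNP."""
    return sorted(
        gene for gene, tss in gene_tss.items() if abs(tss - snp_pos) <= max_distance
    )


# Toy example: a minus-strand gene with its TSS at genomic position 5000.
toy_snps = [("snpA", 5500), ("snpB", 4900), ("snpC", 7000), ("snpD", 5000)]
promoter_snps = snps_in_promoter(toy_snps, tss=5000, strand="-")
nearby = genes_near(5500, {"geneX": 5000, "geneY": 9000}, max_distance=1000)
print(promoter_snps, nearby)
```

For a minus-strand gene, larger genomic coordinates lie upstream, which is why `snpA` at 5500 receives a negative relative position.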



Figure 1. Example of the output for a human gene, including the correspondence with a mouse gene, gene position in the chromosome, comparison with Ensembl and RefSeq data, SNP positions and graphical representations of one-pass cDNA clones.

 
The third, and probably the most important, upgrade of DBTSS is that it now supports data from multiple species. To date, we have constructed many full-length cDNA libraries of various species at the request of many researchers. In addition, large-scale collections of cDNAs determined using a related method by Yoshihide Hayashizaki’s group are also publicly available (6,7). In the current version, we added the data of 2490 clones of Plasmodium falciparum, the human malaria parasite (8), and 580 209 full-length cDNA sequences of Mus musculus (7). The number of RefFull members for mouse is 6875 (for more details, see Y.Suzuki et al., submitted). We will add data for other species as agreement is obtained; these include data for Caenorhabditis elegans, chimpanzee, macaque, Cyanidioschyzon merolae (unicellular red alga), zebrafish and sorghum.

The remaining two novel features will be explained in the next section.


    PROMOTER COMPARISON AND SEARCH OF CIS-ELEMENTS
 
The fourth novel feature of DBTSS (version 3) is that it provides users with comparative views of human and mouse promoters that are probably orthologous. The potentially orthologous gene set was obtained from the LocusLink database (5) and from our own sequence comparison. As a result, promoters of 3324 gene pairs can now be displayed. In each pair, locally similar sequence segments were detected by a local alignment program, LALIGN (9), and their correspondences are shown graphically (Fig. 2).
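To make the idea of detecting locally similar segments concrete, the sketch below finds the single best local alignment between two short sequences with a plain Smith-Waterman recurrence and a linear gap penalty. It is a simplified stand-in, not LALIGN itself, which reports multiple non-intersecting local alignments with a space-efficient algorithm; the scoring values and toy sequences are assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,      # gap in b
                H[i][j - 1] + gap,      # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best-scoring cell until the score drops to zero.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Toy "human" and "mouse" promoter fragments sharing one conserved segment.
human_seg = "TTTACGTACGTTT"
mouse_seg = "GGGACGTACGGGG"
score, aligned_h, aligned_m = smith_waterman(human_seg, mouse_seg)
print(score, aligned_h, aligned_m)
```

In a comparative view, each such aligned segment pair would correspond to one pair of matching boxes drawn on the two promoters.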




Figure 2. A comparative view of human and mouse promoters. (a) Global view with potential transcription factor binding sites. Locally similar sequence segments are shown in boxes and the corresponding boxes are represented by the same number (e.g. ‘0’). (b) More detailed view around the corresponding TSSs.

 
The fifth novel feature is a function for locating positions similar to known transcription-factor binding sites stored in the TRANSFAC database (10). More specifically, we support searches based on TRANSFAC Public (for searches using TRANSFAC Professional, which is a commercial version, users should follow its conditions of use, which are shown on our web page). To reduce the number of potentially spurious hits, users can choose among various cut-off values and target regions/strands. Moreover, hits can be restricted to regions conserved between the two species. Users can also list the genes that satisfy combinations of the above conditions: e.g. genes that harbor potential binding sites for both factors A and B in their upstream regions can be selected with arbitrary cut-off values. With this function, DBTSS can now be regarded as a platform for systematic promoter analyses.
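The search options described above (cut-off value, strand, restriction to conserved regions, and combining factors) can be sketched with a toy count matrix. This is an illustrative sketch under stated assumptions: the min-max scaled score is a simple stand-in and not the scoring scheme used by TRANSFAC-based tools, and the matrix, sequence, gene names and function names are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def relative_score(matrix, site):
    """Score a site against a count matrix, scaled to the range 0..1."""
    score = sum(column[base] for column, base in zip(matrix, site))
    lowest = sum(min(column.values()) for column in matrix)
    highest = sum(max(column.values()) for column in matrix)
    return (score - lowest) / (highest - lowest)


def scan(seq, matrix, cutoff, regions=None):
    """Return (start, strand, score) hits; start is 0-based on the given strand.

    If regions is given, a hit is kept only when its whole window lies
    inside one of the half-open (start, end) intervals, e.g. conserved blocks.
    """
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        end = start + width
        if regions is not None and not any(lo <= start and end <= hi for lo, hi in regions):
            continue
        window = seq[start:end]
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            score = relative_score(matrix, site)
            if score >= cutoff:
                hits.append((start, strand, round(score, 3)))
    return hits


def genes_with_all(factors_by_gene, required):
    """List genes whose upstream region has hits for every required factor."""
    return sorted(g for g, found in factors_by_gene.items() if set(required) <= found)


# Toy count matrix with consensus GATA (hypothetical, not a TRANSFAC entry).
toy_matrix = [
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 8, "C": 0, "G": 2, "T": 0},
]
upstream = "CCGATACCTATCGG"
all_hits = scan(upstream, toy_matrix, cutoff=0.9)
conserved_hits = scan(upstream, toy_matrix, cutoff=0.9, regions=[(0, 7)])
both = genes_with_all({"gene1": {"A", "B"}, "gene2": {"A"}}, required=["A", "B"])
print(all_hits, conserved_hits, both)
```

Lowering the cut-off admits weaker matches and therefore more spurious hits, which is why restricting hits to conserved regions is a useful additional filter.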

DBTSS is available at http://dbtss.hgc.jp/ and will continue to expand, incorporating both our in-house data and data from other sources.


    ACKNOWLEDGEMENTS
 
We thank T. Hasui, K. Abe, M. Morinaga, M. Ishizawa, M. Kawamura, T. Mizuno, A. Kanai and H. Hata for technical support; J. Mizushima-Sugano and E. Nakajima for helpful discussion; Y. Hayashizaki for permission to incorporate their mouse data into DBTSS; and E. Wingender and A. Kel for enabling TRANSFAC-based search. This study was supported by a Grant-in-Aid for Scientific Research on Priority Areas and by special coordination funds for promoting science and technology (SCF), both from the Ministry of Education, Culture, Sports, Science and Technology in Japan.


    REFERENCES
 

  1. Suzuki,Y. and Sugano,S. (2003) Construction of a full-length enriched and a 5'-end enriched cDNA library using the oligo-capping method. Methods Mol. Biol., 221, 73–91.

  2. Suzuki,Y., Tsunoda,T., Sese,J., Taira,H., Mizushima-Sugano,J., Hata,H., Ota,T., Isogai,T., Tanaka,T., Nakamura,Y. et al. (2001) Identification and characterization of the potential promoter regions of 1031 kinds of human genes. Genome Res., 11, 677–684.

  3. Suzuki,Y., Taira,H., Tsunoda,T., Mizushima-Sugano,J., Sese,J., Hata,H., Ota,T., Isogai,T., Tanaka,T., Morishita,S. et al. (2001) Diverse transcriptional initiation revealed by fine, large-scale mapping of mRNA start sites. EMBO Rep., 2, 388–393.

  4. Suzuki,Y., Yamashita,R., Nakai,K. and Sugano,S. (2002) DBTSS: DataBase of human transcriptional start sites and full-length cDNAs. Nucleic Acids Res., 30, 328–331.

  5. Wheeler,D.L., Church,D.M., Federhen,S., Lash,A.E., Madden,T.L., Pontius,J.U., Schuler,G.D., Schriml,L.M., Sequeira,E., Tatusova,T.A. and Wagner,L. (2003) Database resources of the National Center for Biotechnology. Nucleic Acids Res., 31, 28–33.

  6. Carninci,P. and Hayashizaki,Y. (1999) High-efficiency full-length cDNA cloning. Methods Enzymol., 303, 19–44.

  7. The FANTOM consortium and the RIKEN Genome Exploration Research Group Phase I & II Team (2002) Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature, 420, 563–573.

  8. Watanabe,J., Sasaki,M., Suzuki,Y. and Sugano,S. (2002) Analysis of transcriptomes of human malaria parasite Plasmodium falciparum using full-length enriched library: identification of novel genes and diverse transcription start sites of messenger RNAs. Gene, 291, 105–113.

  9. Huang,X.Q., Hardison,R.C. and Miller,W. (1990) A space-efficient algorithm for local similarities. Comput. Appl. Biosci., 16, 373–381.

  10. Matys,V., Fricke,E., Geffers,R., Gossling,E., Haubrock,M., Hehl,R., Hornischer,K., Karas,D., Kel,A.E., Kel-Margoulis,O.V. et al. (2003) TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic Acids Res., 31, 374–378.




