Nucleic Acids Research, 2004, Vol. 32, Database issue D255-D257
© 2004 Oxford University Press

Genew: the Human Gene Nomenclature Database, 2004 updates

Hester M. Wain*, Michael J. Lush, Fabrice Ducluzeau, Varsha K. Khodiyar and Sue Povey

HUGO Gene Nomenclature Committee (HGNC), Department of Biology, University College London, Wolfson House, 4 Stephenson Way, London NW1 2HE, UK

*To whom correspondence should be addressed. Tel: +44 20 7679 5027; Fax: +44 20 7387 3496; Email: nome{at}galton.ucl.ac.uk

Received September 15, 2003; Revised and Accepted September 30, 2003


    ABSTRACT
 
Genew, the Human Gene Nomenclature Database (http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/searchgenes.pl), is the only resource that provides data for all human genes that have approved symbols. It is managed by the HUGO Gene Nomenclature Committee (HGNC) as a confidential database, containing over 22 000 records, 75% of which are represented online by a publicly searchable text file. Since 2002, there have been significant improvements to the Genew search engine. Additionally, we have increased our capacity to analyse confidential sequence data, which has enabled us to manage the large numbers of gene symbol requests that we receive from the chromosome sequencing consortia.


    OVERVIEW
 
The Genew database (1) is the primary resource for approved gene symbols for all other human genetic databases. We exchange information with many databases and organizations throughout the world to update new gene symbols and encourage their use.


    IMPROVEMENTS SINCE 2002
 
New search engine
The new version of the Genew search engine was made available in 2002. It can be found at the same URL (http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/searchgenes.pl) and now provides direct links from the search results to individually curated gene records. Both quick and advanced search options are available; 93% of users opt for the quick gene search, indicating that it resolves most user queries. However, the advanced search options can be very useful in resolving more complex queries. We have significantly increased the variety of search terms, so that any term within the data file searchdata.txt can now be used. This file is available directly online (http://www.gene.ucl.ac.uk/public-files/nomen/searchdata.txt) and by FTP (http://www.gene.ucl.ac.uk/nomenclature/code/ftpaccess.html).

Each online gene record contains 23 fields, with 14 links to other relevant resources including: Ensembl (2), GENATLAS (3), GeneCards (4), GeneClinics/GeneTests (http://www.genetests.org), the international ImMunoGeneTics database® (IMGT) (5), LocusLink (6), MGD (7), OMIM (8), RefSeq (6) and Swiss-Prot (9).

Each gene record is available by querying either the approved gene symbol or the HGNC ID number, thus enabling other databases to link directly to the Genew record, even if the symbol changes. For example, the gene record for CFTR, using the approved symbol, is at URL: http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/get_data.pl?match=CFTR and using the HGNC ID number is at URL: http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/get_data.pl?hgnc_id=1884.
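The two linking schemes can be illustrated with a minimal Python sketch. The script path and the query parameters `match` and `hgnc_id` are taken from the CFTR example above; the function names are hypothetical and are not part of any Genew interface.

```python
from urllib.parse import urlencode

# Base URL of the record script, as given in the CFTR example in the text.
GET_DATA_URL = "http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/get_data.pl"


def record_url_by_symbol(symbol: str) -> str:
    """Build a record URL keyed on the approved gene symbol (hypothetical helper)."""
    return f"{GET_DATA_URL}?{urlencode({'match': symbol})}"


def record_url_by_hgnc_id(hgnc_id: int) -> str:
    """Build a record URL keyed on the HGNC ID number (hypothetical helper)."""
    return f"{GET_DATA_URL}?{urlencode({'hgnc_id': hgnc_id})}"


if __name__ == "__main__":
    print(record_url_by_symbol("CFTR"))
    print(record_url_by_hgnc_id(1884))
```

A database that stores the HGNC ID rather than the symbol would presumably keep a valid link if the approved symbol were later changed, which is the motivation given above for offering both forms.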

The new Genew search engine has received a total of 422 113 hits (since July 2002), with an average of 31 038 hits per month. Table 1 gives an indication of how many of these hits are followed by searches of the database.


Table 1. Use of the Genew search engine (July 2002–August 2003)
 
We also monitor the top 20 search terms used, as this assists us in developing both a more user-friendly search engine and a better understanding of commonly used (but possibly not approved) gene symbols. Table 2 shows the total number of searches for the top 20 search terms and their approved symbols (which are the same in all bar one case: TP53 is the approved symbol for ‘p53’).


Table 2. Top 20 search terms used in the Genew search engine
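As an illustration of this kind of monitoring, the following Python sketch tallies search terms and pairs each with an approved symbol. It is not the HGNC's own code: the function name and the case-insensitive matching are assumptions, and the alias table contains only the single p53/TP53 mapping stated in the text.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# The only alias stated in the text; any further entries would be assumptions.
ALIAS_TO_APPROVED = {"P53": "TP53"}


def top_search_terms(queries: Iterable[str], n: int = 20) -> List[Tuple[str, str, int]]:
    """Return (search term, approved symbol, number of searches) for the n commonest terms.

    Terms are upper-cased before counting, which assumes case-insensitive searching.
    """
    counts = Counter(q.strip().upper() for q in queries if q.strip())
    return [
        (term, ALIAS_TO_APPROVED.get(term, term), hits)
        for term, hits in counts.most_common(n)
    ]


if __name__ == "__main__":
    # Hypothetical query log, for illustration only.
    log = ["p53", "CFTR", "P53 ", "cftr", "p53"]
    for term, approved, hits in top_search_terms(log):
        print(term, approved, hits)
```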
 
Non-human orthologues
With increased requests for gene symbols in other species, we have added a new gene status, ‘Approved Non-Human’. This currently includes 98 entries that we have approved in order to maintain the orthologous symbol in the human gene family series. It is quite likely that most of these genes will ultimately be found in the human genome. Each ‘Approved Non-Human’ gene symbol has links to the appropriate non-human sequence accession ID where possible. The orthologous species currently include: mouse, cow, rat, African clawed toad, pig, zebrafish and dog.

LocusLink updates
In order to update correctly the LocusLink entries with approved gene symbols we have added a new field designated ‘Locus Type’. This includes designations such as:

(i) gene with no protein product;

(ii) model, supported by EST alignments;

(iii) phenotype only;

(iv) pseudogene;

(v) RNA, ribosomal.

Genew updates are exported twice a week as the text file: http://www.gene.ucl.ac.uk/public-files/nomen/ncbi2.txt, which is automatically imported into the LocusLink database.

Confidential gene records
Unnamed genes are placed into the confidential section of Genew (known previously as ‘pending’). This includes those genes that have been submitted by authors and/or journals for symbol approval prior to publication. In addition, we have further increased this resource with unnamed genes from two major public data sets: the ‘Interim’ human genes from LocusLink and the interim mouse genes from MGD which are updated once a week. There are now just over 3000 unnamed gene records awaiting approval.

Downloads/FTP
A variety of files is available online or via FTP from: http://www.gene.ucl.ac.uk/public-files/nomen/. These include chromosome-specific files with any nomenclature changes highlighted.


    GENEW UPGRADE
 
We have been working towards transferring Genew to PostgreSQL and creating a more dynamic web interface. However, the large numbers of symbol requests from chromosome sequencing consortia have altered our priorities, so in the last year we have focused our bioinformatics resources on a more comprehensive sequence database termed LBlast.

LBlast
Our LBlast database system comprises a set of Perl scripts that provide active maintenance of sequence annotation and automatic sequence importation into the LBlast database, thus reflecting sequence additions to the Genew database on an ongoing basis from three diverse sources of confidential sequence data:

(i) raw sequence data from Genew records (4608 DNA and 1660 protein sequences);

(ii) sequence accession numbers from Genew records (28 771 sequences);

(iii) raw sequence data from Editors and chromosome projects (24 110 sequences).

Each gene sequence is now tracked via a unique HGNC sequence accession number (HSeq), which is added to the confidential gene record. The LBlast system has been set up in such a way that any sequence used to search the database is immediately assigned an HSeq ID and added to user_contrib, which consists of sequences that have been searched against the database in the previous 4 weeks. Thus, the submitted sequences are added to the LBlast database before the BLAST (10) search is run, allowing duplicate submissions to be identified immediately.
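The register-then-search order described above can be sketched in Python as follows. This is a toy in-memory model under stated assumptions, not the LBlast implementation (which the text describes as Perl scripts over a database): the class name, the `HSeq<n>` identifier format and the use of exact string matching in place of a BLAST search are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

RETENTION = timedelta(weeks=4)  # user_contrib covers the previous 4 weeks


class UserContrib:
    """Toy stand-in for the user_contrib sequence set (hypothetical)."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, datetime]] = {}
        self._next_number = 1

    def submit(self, sequence: str, now: datetime) -> Tuple[str, List[str]]:
        """Register a query sequence, then report earlier identical submissions."""
        # Drop submissions that have fallen outside the 4-week window.
        self._entries = {
            hseq: (seq, when)
            for hseq, (seq, when) in self._entries.items()
            if now - when <= RETENTION
        }
        normalised = "".join(sequence.split()).upper()
        # Assign the ID and store the sequence BEFORE searching,
        # so that the search runs against a set that already includes it.
        hseq_id = f"HSeq{self._next_number}"  # assumed ID format
        self._next_number += 1
        self._entries[hseq_id] = (normalised, now)
        # Exact matching stands in for the BLAST search here.
        duplicates = [
            hseq
            for hseq, (seq, _) in self._entries.items()
            if seq == normalised and hseq != hseq_id
        ]
        return hseq_id, duplicates


if __name__ == "__main__":
    store = UserContrib()
    start = datetime(2003, 9, 1)
    print(store.submit("ATGC", start))
    print(store.submit("atg c", start + timedelta(days=1)))
```

Because every query is stored before the search runs, a second submission of the same sequence within the retention window is matched against the first one straight away.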

Sequence analysis
All sequences submitted to the HGNC are analysed initially using NCBI’s BLAST. This searches our confidential sequences, sequence data imported from LocusLink, the non-redundant DNA and protein sequences and patent sequences [from GenBank (10) and EMBL (11)]. In addition, all sequences are also analysed for the presence of domains and motifs via InterProScan (12). All InterProScan and BLAST results are stored permanently in the database.

The LBlast sequence data are managed in a PostgreSQL database (http://www.postgresql.org/), via a collection of Perl scripts (http://www.perl.com/) using BioPerl (http://bioperl.org/) with a PHP interface (http://www.php.net). This has been developed with the intention of adding the Genew interface at a later date.

Our capacity to process sequence data increased significantly in 2003 with the development and installation of our Beowulf Cluster. The cluster contains 16 Athlon MP 2000+ CPUs, 32 GB of RAM and 520 GB of disk space, and enables us to process 500 LBlast searches, or 37 InterProScans, an hour. Previously, our Sun E250 could only manage one or two LBlast searches an hour and was unable to complete InterProScans in a reasonable time. Details of the cluster construction will be available from our website http://www.gene.ucl.ac.uk/nomenclature/ by January 2004.
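The scale of this improvement can be worked out from the figures quoted above. The short Python calculation below uses only those figures; the derived values are cluster-wide averages and say nothing about the time taken by any individual search.

```python
# Figures quoted in the text.
CLUSTER_CPUS = 16
CLUSTER_LBLAST_PER_HOUR = 500
CLUSTER_INTERPROSCAN_PER_HOUR = 37
SUN_E250_LBLAST_PER_HOUR = (1, 2)  # "one or two" searches an hour

slow, fast = SUN_E250_LBLAST_PER_HOUR

# Throughput gain for LBlast searches relative to the Sun E250.
speedup_range = (CLUSTER_LBLAST_PER_HOUR / fast, CLUSTER_LBLAST_PER_HOUR / slow)

# Average LBlast searches per CPU per hour across the cluster.
lblast_per_cpu_hour = CLUSTER_LBLAST_PER_HOUR / CLUSTER_CPUS

# Average wall-clock seconds per InterProScan at the quoted throughput.
seconds_per_interproscan = 3600 / CLUSTER_INTERPROSCAN_PER_HOUR

print(f"LBlast speed-up: {speedup_range[0]:.0f}x to {speedup_range[1]:.0f}x")
print(f"LBlast searches per CPU per hour: {lblast_per_cpu_hour:.2f}")
print(f"Seconds per InterProScan (cluster average): {seconds_per_interproscan:.1f}")
```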


    IMPLEMENTATION
 
Genew is currently implemented in the Microsoft Access 97 relational database management system. The database consists of 13 tables containing over 170 fields and 22 000 gene records.

The Genew search engine, http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/searchgenes.pl, is based on a Perl front-end querying a PostgreSQL database, derived from text files exported from the off-line database.


    CITATION
 
Authors are requested to cite this article and the database in the following format: ‘Genew, HUGO Gene Nomenclature Committee (HGNC), Department of Biology, University College London, Wolfson House, 4 Stephenson Way, London NW1 2HE, UK (URL: http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/searchgenes.pl)’ [Include month and year in which you retrieved the data cited.]


    ACKNOWLEDGEMENTS
 
Many thanks to the HGNC editors Drs Elspeth Bruford, Ruth Lovering, Mathew Wright and Connie Talbot Jr whose accurate curation and attention to detail ensure the validity of the gene records. The HGNC is supported by NIH contract N01-LM-9-3533 and by the UK Medical Research Council.


    REFERENCES
 

  1. Wain,H.M., Lush,M., Ducluzeau,F. and Povey,S. (2002) Genew: The Human Nomenclature Database. Nucleic Acids Res., 30, 169–171.

  2. Clamp,M., Andrews,D., Barker,D., Bevan,P., Cameron,G., Chen,Y., Clark,L., Cox,T., Cuff,J., Curwen,V. et al. (2003) Ensembl 2002: accommodating comparative genomics. Nucleic Acids Res., 31, 38–42.

  3. Frezal,J. (1998) Genatlas database, genes and development defects. C. R. Acad. Sci. III, 321, 805–817.

  4. Safran,M., Chalifa-Caspi,V., Shmueli,O., Lapidot,M., Rosen,N., Shmoish,M., Adato,A., Peter,I. and Lancet,D. (2003) Human gene-centric databases at the Weizmann Institute of Science: GeneCards, UDB, CroW 21 and HORDE. Nucleic Acids Res., 31, 142–146.

  5. Lefranc,M.-P. (2003) IMGT, the international ImMunoGeneTics database. Nucleic Acids Res., 31, 307–310.

  6. Pruitt,K.D. and Maglott,D.R. (2001) RefSeq and LocusLink: NCBI gene-centered resources. Nucleic Acids Res., 29, 137–140.

  7. Blake,J.A., Richardson,J.E., Bult,C.J., Kadin,J.A., Eppig,J.T.; Mouse Genome Database Group (2003) MGD: the Mouse Genome Database. Nucleic Acids Res., 31, 193–195.

  8. Wheeler,D.L., Church,D.M., Federhen,S., Lash,A.E., Madden,T.L., Pontius,J.U., Schuler,G.D., Schriml,L.M., Sequeira,E., Tatusova,T.A. et al. (2003) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 31, 28–33.

  9. Boeckmann,B., Bairoch,A., Apweiler,R., Blatter,M.-C., Estreicher,A., Gasteiger,E., Martin,M.J., Michoud,K., O’Donovan,C., Phan,I. et al. (2003) The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res., 31, 365–370.

  10. Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403–410.

  11. Stoesser,G., Baker,W., van den Broek,A., Garcia-Pastor,M., Kanz,C., Kulikova,T., Leinonen,R., Lin,Q., Lombard,V., Lopez,R. et al. (2003) The EMBL Nucleotide Sequence Database: major new developments. Nucleic Acids Res., 31, 17–22.

  12. Mulder,N.J., Apweiler,R., Attwood,T.K., Bairoch,A., Bateman,A., Binns,D., Biswas,M., Bradley,P., Bork,P., Bucher,P. et al. (2003) The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res., 31, 315–318.




