
Nucleic Acids Research, 2000, Vol. 28, No. 1 126-128
© 2000 Oxford University Press

NCBI’s LocusLink and RefSeq

Donna R. Maglott, Kenneth S. Katz, Hugues Sicotte and Kim D. Pruitt*

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA

Received September 2, 1999; Revised and Accepted October 4, 1999.


    ABSTRACT
The NCBI has introduced two new web resources—LocusLink and RefSeq—that facilitate retrieval of gene-based information and provide reference sequence standards. These resources are designed to provide a non-redundant view of current knowledge about human genes, transcripts and proteins. Additional information about these resources is available on the LocusLink web site at http://www.ncbi.nlm.nih.gov/LocusLink/


    BACKGROUND
The LocusLink and RefSeq databases were initiated to address data-access problems resulting from significant increases in both sequence data and the number of web sites relating information about genes. For example, it is increasingly difficult to identify unambiguously which sequence, of the many publicly available, is an appropriate, complete representative of a given mRNA or protein. Conversely, given an mRNA or protein sequence, it can also be a challenge to determine the official name or symbol for the gene from which the sequence was derived. And once a gene symbol or name is known, identifying other web resources that include information about that gene may be very time-consuming. In its role as a web directory, LocusLink provides a single point of access to a variety of gene-specific information sources, including web resources and RefSeq. RefSeq provides a non-redundant data set of reference sequences representing transcripts and proteins of known genes. RefSeq records include links to LocusLink, thereby facilitating connections among sequence data, gene names and related biological information. The LocusLink and RefSeq resources establish reference sequences and stable database identifiers (LocusID) that can be used in variation, mutation and expression analyses.


    SCOPE
LocusLink
LocusLink offers a simple query interface to retrieve information about human genes and some non-gene loci. It supports text-based queries by using official nomenclature provided through collaboration with the Human Gene Nomenclature Committee (HGNC; http://www.gene.ucl.ac.uk/nomenclature/ ) (1), as well as cytogenetic locations, aliases and historical names for both a gene and its products. LocusLink provides direct connections to related information available from several resources at NCBI (Table 1) as well as to external web sites including the Genome Database (GDB; http://gdbwww.gdb.org/ ), the Human Gene Mutation Database (HGMD; http://www.uwcm.ac.uk/uwcm/mg/hgmd0.html ) (2), GeneCard (http://bioinfo.weizmann.ac.il/cards/ ), GeneClinics (http://www.geneclinics.org/ ), and locus- or gene family-specific web sites. Some of the links to NCBI resources listed in Table 1 are represented by icons that, when displayed, give an immediate indication that additional information is indeed available. The goal of the PubMed and GenBank/GenPept (3) links is not to be comprehensive, but to establish sufficient connections to facilitate information retrieval via NCBI’s ENTREZ (4) ‘related sequences’ or ‘related publications’ links or through BLAST (5). LocusLink also provides a unique stable identifier for each locus (LocusID).


Table 1. LocusLink connections to resources at NCBI
 
RefSeq
Although the goal of RefSeq in general is to provide reference sequences representing chromosomes, transcripts and proteins, discussion here is restricted to the subset of human mRNAs and proteins. A RefSeq record is made for an mRNA if the function of the gene product has been studied, and if the sequence of the complete coding region is known. Separate RefSeq records are made for experimentally supported alternate transcripts and their products. The sequence presented in a RefSeq record is usually derived from available GenBank records, although additional information is at times added from the literature or from communications with the research community. RefSeq records are provided in one of two states, either provisional or reviewed. Records initially released as provisional include much of the annotation from the GenBank record used as the source, but incorporate gene and protein names, PubMed links, summary text, and map and chromosome data from LocusLink when available (Table 2). Provisional records are subjected to a manual curation and review process, with the reviewed record being the end product. The reviewed record might differ from the original provisional record by including: (i) more extensive 5' and 3' untranslated regions derived from other GenBank records or the literature, (ii) additional mRNA and/or protein features, (iii) more publications and (iv) a summary text describing the gene. Table 2 lists additional annotation that may be added to provisional and reviewed RefSeq records. RefSeq records can be distinguished from GenBank records by the inclusion of a REFSEQ statement in a COMMENT field, and by the unique format of the accession number. The first three characters of the RefSeq mRNA and protein accession numbers are NM_ and NP_, respectively, followed by six numerals (e.g. NM_000280, NP_000337).
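The accession format described above lends itself to a simple programmatic check. The following is a minimal sketch in Python, not part of the NCBI resources themselves: the function name `classify_accession` is hypothetical, and the optional `.version` suffix accepted by the pattern is an assumption rather than something stated in the text.

```python
import re

# Pattern taken from the description in the text: the prefix "NM_" (mRNA) or
# "NP_" (protein) followed by exactly six numerals. The optional ".N" version
# suffix is an assumption added for convenience.
REFSEQ_ACCESSION = re.compile(r"^(NM|NP)_(\d{6})(?:\.(\d+))?$")


def classify_accession(accession):
    """Return 'mRNA', 'protein', or None if the string is not RefSeq-style."""
    match = REFSEQ_ACCESSION.match(accession.strip())
    if match is None:
        return None
    return "mRNA" if match.group(1) == "NM" else "protein"


if __name__ == "__main__":
    # The two example accessions are the ones quoted in the text.
    for acc in ("NM_000280", "NP_000337", "U12345"):
        print(acc, classify_accession(acc))
```

A check of this kind only tests the format of the accession; it does not confirm that a record with that accession exists.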


Table 2. Enhanced annotation in RefSeq nucleotide records
 

    ACCESS
RefSeq records can be retrieved by text word queries (gene or protein names or symbols, accession numbers, etc.) or by sequence homology. LocusLink (see Table 3 for URLs) and ENTREZ both support accessing RefSeq records by text. BLAST-based sequence queries must be done against the nucleotide or protein nr databases. The RefSeq records in a BLAST query result can be readily identified by the ‘ref’ prefix and the distinct accession number format described above. More query details and examples are provided in the LocusLink and RefSeq help and FAQ pages available from the LocusLink home page.
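As an illustration of picking RefSeq hits out of BLAST output by the 'ref' prefix, the sketch below scans a pipe-delimited identifier string for a `ref` tag and returns the accession that follows it. The exact layout of the identifier string (tag and value pairs separated by `|`) is an assumption about the output format, the function name `refseq_accessions` is hypothetical, and the identifier values in the example are illustrative only.

```python
def refseq_accessions(defline):
    """Return the accessions that follow a 'ref' tag in a definition line.

    Assumes identifiers appear as the first whitespace-delimited token and
    are laid out as tag|value pairs separated by '|'.
    """
    tokens = defline.strip().lstrip(">").split(None, 1)
    if not tokens:
        return []
    fields = tokens[0].split("|")
    return [
        fields[i + 1]
        for i in range(len(fields) - 1)
        if fields[i] == "ref" and fields[i + 1]
    ]


if __name__ == "__main__":
    # Illustrative definition lines; the numeric identifiers are made up.
    print(refseq_accessions(">gi|1111111|ref|NM_000280.1| example description"))
    print(refseq_accessions(">gi|2222222|gb|U12345.1| example description"))
```

Lines without a `ref` tag yield an empty list, so the same function can be used to filter a mixed result set.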


Table 3. LocusLink and RefSeq URLs
 
LocusLink and RefSeq records are also freely available on the NCBI FTP site (see Table 3). Note that RefSeq records are not in GenBank and must be downloaded separately.


    SEARCHING
Comprehensive descriptions of query strategies and navigation from LocusLink and RefSeq are provided from the LocusLink home page. Please note there are multiple sites within NCBI that include links to LocusLink and RefSeq by specific identifiers. These include Online Mendelian Inheritance in Man (OMIM; http://www.ncbi.nlm.nih.gov/omim/ ), UniGene (http://www.ncbi.nlm.nih.gov/UniGene/ ), GeneMap’99 (http://www.ncbi.nlm.nih.gov/genemap/ ) and dbSNP (http://www.ncbi.nlm.nih.gov/SNP/ ) (6).


    MAINTENANCE
LocusLink and RefSeq records are created and maintained by an ongoing process as described by Pruitt et al. (7) and on the LocusLink web site. The LocusLink web pages are currently refreshed weekly. RefSeq records may be modified at any time, either because of text changes (e.g. nomenclature) or because a provisional record is replaced with a reviewed one (the accession number is maintained, but the version number and sequence ID numbers change if the sequence data have changed).
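Because the accession number stays fixed while the version number changes with the sequence, a locally stored identifier can be compared against a current one to detect a sequence update. The sketch below assumes identifiers are written as `accession.version` (for example `NM_000280.2`); that notation and the function names are assumptions made for illustration.

```python
def split_accession(versioned):
    """Split 'ACCESSION.VERSION' into (accession, version or None)."""
    accession, _, version = versioned.partition(".")
    return accession, int(version) if version else None


def sequence_updated(stored, current):
    """True if both identifiers share an accession and the version increased."""
    stored_acc, stored_ver = split_accession(stored)
    current_acc, current_ver = split_accession(current)
    if stored_acc != current_acc:
        return False  # different records; not comparable
    if stored_ver is None or current_ver is None:
        return False  # cannot tell without both versions
    return current_ver > stored_ver


if __name__ == "__main__":
    print(sequence_updated("NM_000280.1", "NM_000280.2"))
    print(sequence_updated("NM_000280.1", "NM_000280.1"))
```

A text-only change such as a nomenclature update would not be detected this way, since under the scheme described above only sequence changes alter the version number.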


    CONTACT
Questions, comments and suggestions can be emailed to info@ncbi.nlm.nih.gov . We welcome collaborations with and contributions from the research community.


    FOOTNOTES
 
* To whom correspondence should be addressed. Tel: +1 301 435 5898; Fax: +1 301 480 9241; Email: pruitt@ncbi.nlm.nih.gov


    REFERENCES

    1 White,J.A., McAlpine,P.J., Antonarakis,S., Cann,H., Eppig,J.T., Frazer,K., Frezal,J., Lancet,D., Nahmias,J., Pearson,P., Peters,J., Scott,A., Scott,H., Spurr,N., Talbot,C.,Jr and Povey,S. (1997) Genomics, 45, 468–471.

    2 Cooper,D.N., Ball,E.V. and Krawczak,M. (1998) Nucleic Acids Res., 26, 285–287.

    3 Benson,D.A. (1999) Nucleic Acids Res., 27, 12–17. Updated article in this issue: Nucleic Acids Res. (2000), 28, 15–18.

    4 Schuler,G.D., Epstein,J.A., Ohkawa,H. and Kans,J.A. (1996) Methods Enzymol., 266, 141–162.

    5 Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389–3402.

    6 Sherry,S.T. (2000) Nucleic Acids Res., 28, 352–355.

    7 Pruitt,K.D., Katz,K.S., Sicotte,H. and Maglott,D.R. (2000) Trends Genet., in press.





