© 1996 Oxford University Press 53-56

FlyBase: the Drosophila database

The FlyBase Consortium*

FlyBase, Biological Laboratories, 16 Divinity Avenue, Cambridge, MA 02138, USA

Received October 4, 1995; Accepted October 5, 1995

ABSTRACT

FlyBase is a database of genetic and molecular data concerning Drosophila. FlyBase is maintained as a relational database (in Sybase). The scope of FlyBase includes: genes, alleles (and phenotypes), aberrations, pointers to sequence data, clones, stock lists, Drosophila workers and bibliographic references. FlyBase is also available on CD-ROM for Macintosh systems (Encyclopaedia of Drosophila).

BACKGROUND

Drosophila melanogaster is one of the most studied eukaryotic organisms. Introduced to `modern' biology in the early years of this century, research with D.melanogaster has been at the forefront of most areas of biology, from genetics to ecology, from neurobiology to evolution. Drosophila geneticists have been well served by a series of `catalogs' of mutations, the first of which was published in 1925, and by regular publication (again, dating from 1925) of bibliographies of the Drosophila literature. The last `conventional' catalog of the genes and mutations of D.melanogaster was published in 1992 (3), although data collection ceased in late 1989.

Before Lindsley and Zimm's catalog was published, discussions in the community had led to the conclusion that an `electronic' database was essential, if the rapidly increasing knowledge of Drosophila was to be available in a convenient form. As a consequence, beginning in October 1992, the National Center for Human Genome Research of the NIH has funded the FlyBase project with the objective of designing, building and releasing a database of genetic and molecular information concerning this insect. FlyBase has also received support from the Medical Research Council, London.

SCOPE

The `core' of FlyBase is data concerning the genes and mutations of Drosophila:

Gene name: gene symbol; Synonym(s) name; Synonym(s) symbol; Genetic map position; Polytene chromosome map position; Nature of gene product(s); Molecular data; Gene expression pattern data; Similar genes in other organisms; Database cross-references.

Allele(s) name: Allele(s) symbol; Allele synonym(s) name; Allele synonym(s) symbol; Origin of allele; Phenotypic information; Molecular data.

Clones (cosmids, P1s, YACs)

P-elements

Transposon constructs and their components

Chromosome aberrations

Bibliographic references

Stock lists

People

Allied data

FlyBase identifier numbers.

In their present form these data are a mixture of information in a highly structured, parseable form and of free text. All of the data are available as flat (ASCII) files, the majority being output of selected data sets from the relational database implementation of FlyBase.

The taxonomic scope of FlyBase is the family Drosophilidae. Genetic data on species other than D.melanogaster are few, although mutant catalogs for D.buzzatii (from J.S.F. Barker) and D.ananassae (from Y.N. Tobari) have been incorporated.

FlyBase and the Berkeley Drosophila Genome Project jointly produce the Encyclopaedia of Drosophila. This incorporates their data in ACeDB format and is available for both Unix and Macintosh systems. The Mac version is published as a CD-ROM (see below).

FlyBase identifier numbers

All genes, alleles, aberrations, bibliographic records and species in FlyBase have unique identifiers that allow them to be referenced both within FlyBase and externally. FlyBase identifiers are of the form FBxxnnnnnnn, where xx is a two-letter code signifying the type of identifier and nnnnnnn is a 7-digit number padded with leading zeros. Identifier codes now used are:

FBgn: gene identifier (e.g. FBgn0001234)

FBal: allele identifier

FBab: aberration identifier

FBrf: bibliographic reference identifier

FBsp: species identifier

FBmc: construct identifier

FBba: balancer

FBtp: engineered transposon
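The identifier scheme described above is regular enough to check mechanically. The following is a minimal sketch in Python, not FlyBase's own code: the function names parse_fb_id and format_fb_id are hypothetical, and the only facts taken from the text are the FBxxnnnnnnn pattern, the zero-padded 7-digit number and the eight type codes.

```python
import re

# Two-letter type codes as listed in the text.
FB_TYPES = {
    "gn": "gene",
    "al": "allele",
    "ab": "aberration",
    "rf": "bibliographic reference",
    "sp": "species",
    "mc": "construct",
    "ba": "balancer",
    "tp": "engineered transposon",
}

# "FB", a two-letter code, then exactly seven digits.
FB_ID = re.compile(r"FB([a-z]{2})(\d{7})")


def parse_fb_id(text):
    """Return (type name, number) for a well-formed identifier, else None."""
    match = FB_ID.fullmatch(text)
    if match is None or match.group(1) not in FB_TYPES:
        return None
    return FB_TYPES[match.group(1)], int(match.group(2))


def format_fb_id(code, number):
    """Build an identifier, padding the number to 7 digits with leading zeros."""
    if code not in FB_TYPES or not 0 <= number <= 9_999_999:
        raise ValueError("unknown type code or number out of range")
    return f"FB{code}{number:07d}"
```

For instance, parse_fb_id("FBgn0001234") yields the type name "gene" and the number 1234, while a string with an unlisted code or the wrong number of digits yields None.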

Genes, alleles and aberrations

In October 1995, FlyBase included information on >9000 genes, nearly 26 000 alleles and >11 000 chromosome aberrations. Except for historical data, inherited from Lindsley and Zimm (3) for example, all data are attributed to a single publication (including personal communications to FlyBase; these are archived and made accessible to users). The data are a mixture of free text and controlled syntax. FlyBase uses a standard controlled vocabulary of terms to describe, for example, mutagens and anatomical parts of Drosophila.

Nomenclature and synonyms

The genetic nomenclature of D.melanogaster is chaotic (though perhaps not in the technical sense of this word). FlyBase has written a document on nomenclatural standards for the community (flybase/nomenclature/nomenclature.txt; FlyBase, 1995). The synonymy of Drosophila gene, allele and aberration names is very extensive. Valid gene, allele and aberration names are accessible from a file of >31 000 synonyms.

Map data

All map data are stored in FlyBase in a common form, regardless of whether these data are genetic, cytogenetic or molecular. This allows FlyBase to output integrated maps in a variety of formats. Tools have been written to output information from the relational tables in a variety of forms, including graphical and tabular. For example, the CytoSearch tool on the FlyBase World Wide Web (WWW) server (see below) allows users to query the map data in a number of different ways (e.g. `output all of the genes known to map between 35B1 and 35C1 on the polytene chromosomes', `output all of the deletions that uncover 35B1', `output all of the cosmid clones on the X chromosome').
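A query such as `output all of the genes known to map between 35B1 and 35C1' depends on comparing polytene band designations in chromosome order rather than as plain strings. The sketch below illustrates one way to do this in Python; it is not the CytoSearch implementation. It assumes simple single-band designations made up of a numbered division, a lettered subdivision A-F and a band number, and the gene symbols and positions in the sample records are hypothetical placeholders, not FlyBase data.

```python
import re

# Division number, lettered subdivision (A-F), band number, e.g. "35B1".
BAND = re.compile(r"(\d{1,3})([A-F])(\d{1,2})")


def band_key(band):
    """Turn a band such as '35B10' into a sortable tuple (35, 'B', 10)."""
    match = BAND.fullmatch(band)
    if match is None:
        raise ValueError(f"not a simple band designation: {band!r}")
    return int(match.group(1)), match.group(2), int(match.group(3))


def genes_between(records, start, end):
    """Return symbols whose band lies between start and end, inclusive."""
    lo, hi = band_key(start), band_key(end)
    return [symbol for symbol, band in records if lo <= band_key(band) <= hi]


# Hypothetical (symbol, band) records, for illustration only.
records = [
    ("hypA", "35A4"),
    ("hypB", "35B1"),
    ("hypC", "35B10"),
    ("hypD", "35C1"),
    ("hypE", "35C2"),
]

hits = genes_between(records, "35B1", "35C1")
```

Comparing tuples rather than raw strings matters here: as a string, "35B10" sorts before "35B2", whereas the numeric band component keeps the bands in their chromosomal order.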

A graphical map tool that allows the display of selected classes of object (genes, aberrations, clones) from an image that represents the chromosomes can be accessed through either FlyBase server (see below) with a graphics-capable browser such as Mosaic or Netscape. Graphical maps of genes, clones and clone contigs are incorporated in the Encyclopaedia of Drosophila.

Bibliographic references

A key feature of FlyBase is a comprehensive bibliography of conventional and unconventional (e.g. films, archival material) publications on the family Drosophilidae, covering all aspects of study. This includes the complete texts of all of the published Drosophila bibliographies and information from major external resources, such as MEDLINE, BIOSIS, the Zoological Record and the Environmental Mutagen Information Center (by permission). The bibliography is updated from these and other sources. To ensure consistency there is a satellite file of all `multi-publication' sources, for example journals and edited publications, which includes full names, dates and places of publication, volume number ranges and ISBNs or ISSNs and CODENs. By far the greater part of these data have been checked against the Library of Congress and University of Cambridge online catalogs. Bibliographic records are coded as to type (e.g. journal article, thesis, book, film). As of October 1995 the number of bibliographic records was nearly 75 000. This includes >4000 theses.

Nature of gene product(s)

FlyBase uses a controlled vocabulary to summarise the (molecular) nature of a gene's product. For enzymes, the EC names and EC numbers are included; for non-enzyme proteins a `trivial' name (e.g. actin, calmodulin) is used but the data are redundant so as to ease access. At present the data are a mixture of classification by function, or, more often, inferred function (e.g. transcription factor), classification by structure (e.g. homeodomain protein) and classification by both structure and function (e.g. tRNA). These fields also store cross-references to the PROSITE database, by PROSITE name and number.

FlyBase is now working with others to construct a hierarchical classification of the functions of gene products for use in genomic databases.

Database cross-references

FlyBase extensively cross-references its objects with those in other genetic and molecular databases. FlyBase receives daily updates of new and revised records from the EMBL/DDBJ/GenBank databases and stores their primary accession numbers in the gene, allele or aberration records. FlyBase also stores cross-references (by accession no.) to both SwissProt and PIR, as well as to the Eukaryotic Promoter Database (EPD), dbSTS, dbEST, TRANSFAC, PDB, NRL_3D and GCR databases. FlyBase now includes >5300 accession no. cross-references to the EMBL/DDBJ/GenBank database, 900 to SwissProt and 1800 to PIR. FlyBase provides these external databases with flat file tables of their accession numbers linked to FlyBase accession numbers, encouraging reciprocal DBXREF links.

Molecular data

FlyBase is developing reports and query tools for exploring molecular data. These reports will include the molecular organization of genes and detailed information on transcript and protein products.

FlyBase collaborates very closely with both the Berkeley and European Drosophila Genome Projects. An integrated list of P1, cosmid and YAC clones from these projects is available and can be searched by cytological location.

A prototype report for P-element and vector constructs has recently been incorporated within the SymbolSearch tool (see below). These reports provide graphical maps of transposons and plasmids linked to sequence data and, where appropriate, to mutant alleles and publications. For each transposon and vector there are links to the components used in its construction. At present the data are only complete for 142 P-element transposons (and their progenitors and components) found in stocks at the Bloomington Drosophila Stock Center. FlyBase is now extending curation to all published transposons used by Drosophila biologists.

Similar genes in other organisms

One of the most urgent needs for those building genetic databases is a stable mechanism to cross-reference genes (and other objects) between organisms. In the absence of such a mechanism FlyBase now simply includes the gene symbol and organism of loci said, by investigators, to encode a similar (or homologous) product.

Stock lists

FlyBase provides access to the stock lists of the three major stock centers for D.melanogaster (Bloomington, Mid-America and Umea) and to that of the Drosophila species stock center at Bowling Green. It also provides access to the stock lists of individual laboratories, if these are provided to FlyBase. FlyBase works with the major Drosophila stock centers to ensure consistency of nomenclature.

People directory

FlyBase maintains a directory of names, addresses, telephone and fax numbers and e-mail addresses of people in the Drosophila community. Those with particular roles in the community (e.g. principal investigators, stock-keepers, members of the Drosophila Board) are tagged. There are now >4900 records in this directory.

Allied data

FlyBase cannot, and should not, be wholly comprehensive. We encourage others to build specialised databases. At present FlyBase offers help in linking these to FlyBase (by the use of FB identifiers, for example) and in making these available through the FlyBase servers. Several databases of allied data are now available: these include a complete list of valid species in the family Drosophilidae (Dr G. Bachli, Zurich), a catalog of polytene chromosome sites that are recognised by antibodies (Dr S. Amero, Chicago, IL), a Drosophila codon usage table and the Drosophila records of the Transcription Termination Signal Database. All Drosophila records of the Environmental Mutagen Information Center are available from FlyBase, as allied data.

FlyBase has a depository for images (flybase/allied-data/images). A project, with Dr N. Patel, to capture images of enhancer trap lines is underway. These images will be linked to other objects (e.g. stocks, alleles) in FlyBase.

Although not allied data, FlyBase makes the complete unchanged text of Lindsley and Zimm (3) available (by permission of Academic Press) and keeps a file of errors in this book that have been noticed. The text of the earlier Lindsley and Grell (2) is also available on FlyBase.

IMPLEMENTATION

FlyBase is built with a relational database management system (Sybase). The present schema has been implemented for most of the data and most files accessed via the FlyBase servers are the products of the Sybase tables. The schema is now being extended to accommodate physical maps and sequences from the major Drosophila genome projects.

FlyBase data are maintained by curators working from the literature and filling in a standard form that is parsed into the Sybase tables.

ACCESS

FlyBase provides users with a variety of modes of access: WWW, Gopher, ftp of flat files and the Encyclopaedia of Drosophila, which uses a version of ACeDB.

FlyBase is currently accessible at two sites, Harvard and Indiana Universities. The FlyBase WWW server at Harvard (see below for addresses) gives access to the search tools CytoSearch (for searching on the basis of cytological map position) and SymbolSearch (for searching by gene, aberration or transposon symbol, with full support of wild cards) as well as access to the full set of FlyBase data services available through the Indiana server. The Harvard server only supports http, and therefore requires a browser such as Mosaic, Netscape or Lynx. The server at Indiana supports multiple protocols, including http, Gopher and ftp, and is accessible by a wide range of clients. Structured flat files that are output from Sybase are available to query, copy or browse. Users of interactive clients (e.g. Gopher+, Mosaic, Netscape) can request stocks, update or add to the directory of Drosophila workers, and send e-mail to the FlyBase consortium from within FlyBase.
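Wild-card searching by symbol, as offered by SymbolSearch, can be illustrated with shell-style patterns. The following Python sketch is an assumption about how such a search might behave, not a description of the SymbolSearch code: the function name symbol_search is hypothetical, the short symbol list is illustrative only, and the match is made case-sensitive because Drosophila symbols distinguish upper and lower case.

```python
from fnmatch import fnmatchcase


def symbol_search(symbols, pattern):
    """Return the symbols matching a shell-style pattern, case-sensitively.

    '*' matches any run of characters and '?' matches a single character.
    """
    return sorted(s for s in symbols if fnmatchcase(s, pattern))


# A short illustrative list of symbols, not a FlyBase extract.
symbols = ["dpp", "Dl", "gbb", "w", "wg"]

starts_with_w = symbol_search(symbols, "w*")
three_letters_d = symbol_search(symbols, "d??")
capital_d = symbol_search(symbols, "D*")
```

Here "w*" selects both w and wg, "d??" selects only dpp, and "D*" selects only Dl, because the comparison does not fold case.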

The flat files derived from the Sybase tables are often available in several formats, as well as being indexed for queries. For example, the bibliography is available in Unix REFER format (which can be used by many bibliographic packages) as well as in text and `comma separated values' formats. The genetic data are available in readable text formats and in a format in which different fields are coded (the latter allow users to write simple code to construct their own queries on the data).
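As an illustration of the `simple code' a user might write against these files, the sketch below reads records in the Unix REFER convention, in which each field line begins with a percent sign and a one-letter tag and records are separated by blank lines. It is a minimal Python example under stated assumptions: the exact set of tags used in the FlyBase bibliography files is not given here, so the sample uses common REFER tags (%A author, %T title, %D date, %I publisher, %C city) with entries taken from this article's reference list, and the function name parse_refer is hypothetical.

```python
def parse_refer(text):
    """Parse REFER-style text into a list of {tag: [values]} dictionaries."""
    records, current, last_key = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current record.
            if current:
                records.append(current)
            current, last_key = {}, None
        elif line.startswith("%") and len(line) > 1:
            # A new field: the tag is the character after '%'.
            last_key = line[1]
            current.setdefault(last_key, []).append(line[2:].strip())
        elif last_key is not None:
            # A continuation line extends the most recent field value.
            current[last_key][-1] += " " + line.strip()
    if current:
        records.append(current)
    return records


sample = """\
%A Lindsley, D.L.
%A Zimm, G.
%D 1992
%T The Genome of Drosophila melanogaster
%I Academic Press
%C San Diego, CA

%A Lindsley, D.L.
%A Grell, E.H.
%D 1968
%T Genetic variations of Drosophila melanogaster
%I Carnegie Institution
%C Washington, DC
"""

records = parse_refer(sample)
```

Repeated tags such as %A are collected into a list, so a record with several authors keeps them in their original order.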

FlyBase publishes a subset of the data in printed form as special issues of the Drosophila Information Service. Two such issues were published in June 1994: DIS 73 includes data on gene loci, gene function and gene and allele synonyms; DIS 74 is a bibliography of the Drosophila literature for the period 1982-1993.

FlyBase and the Berkeley Drosophila Genome Project jointly publish The Encyclopaedia of the Drosophila Genome (version 2, October 1995). This presents a merge of the information in FlyBase with the data of the Berkeley project viewable via an ACeDB client. This collaborative project has involved the customisation of ACeDB for Drosophila (by Suzanna Lewis in Berkeley), a port of ACeDB to Mac platforms (by Cyrus Harmon in Berkeley) and an interface between Sybase and ACeDB (by Eddy Welbourne in Cambridge). The Encyclopaedia is available from FlyBase as a CD-ROM (for Macs) or by ftp from the Indiana FlyBase server for Unix and Mac platforms.

Interaction with the user community is vital for the success of FlyBase. We encourage the submission of new data, the correction of errors and ideas for making this database of even greater use to the community.

DOCUMENTATION

A complete FlyBase Reference Manual is available from FlyBase servers as either text or Postscript files (flybase/docs/Reference-manual.txt; flybase/docs/Reference-manual.ps). A shorter User Manual is also available (flybase/docs/User-manual.txt and .ps) as is a brief introduction `About FlyBase' (flybase/About-flybase).

News about changes to FlyBase is posted to the bionet.drosophila news group.

REFERENCING FLYBASE

We suggest that FlyBase be referenced as follows:

FlyBase (1995). FlyBase - The Drosophila Database. Available from the flybase.bio.indiana.edu network server and Gopher site and at the URL http://morgan.harvard.edu/. Nucleic Acids Res., 24, 53-56.

We suggest that the abbreviation FB be used for FlyBase, regardless of the particular FlyBase product.

ADDRESSES

The Harvard FlyBase server has the URL

http://morgan.harvard.edu/.

The Indiana FlyBase server has the URLs http://flybase.bio.indiana.edu:82/ and gopher://flybase.bio.indiana.edu:72/ for use with WWW browsers. The gopher server can be addressed from gopher clients at flybase.bio.indiana.edu. These are mirrored at the European Bioinformatics Institute (http://www.ebi.ac.uk/flybase/).

FTP to flybase.bio.indiana.edu (129.79.225.25) with the username anonymous and your e-mail address as password. FlyBase is in the directory /flybase.

A CD-ROM of the Encyclopaedia of Drosophila (for Macs) can be purchased at nominal cost from Ms D. Palmer, Biological Laboratories, 16 Divinity Avenue, Harvard University, Cambridge, MA 02138, USA (FAX +1 617 495 9300).

The Encyclopaedia of Drosophila is available for Unix systems (Sun, SGI and DEC Alpha) and for Macs by ftp from flybase.bio.indiana.edu (login with username eofd and password FlyBase).

Questions about the Indiana FlyBase server may be addressed to flybase@bio.indiana.edu.

Requests for help and questions about FlyBase should be addressed to flybase-help@morgan.harvard.edu. Reports of errors in FlyBase, or data updates, should be addressed to flybase-updates@morgan.harvard.edu. Mail may be addressed to FlyBase, Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA.

ACKNOWLEDGEMENTS

FlyBase is supported by a grant from the National Institutes of Health (National Center for Human Genome Research). It has also been supported by a grant from the Medical Research Council, London. John Merriam (UCLA) was a member of the consortium until July 1994. We thank him for his invaluable contributions.

REFERENCES

1 FlyBase (1995) Drosophila melanogaster. In Stewart, A. (ed.), Genetic nomenclature guide. Trends Genet., (suppl.), 26-29.

2 Lindsley, D.L. and Grell, E.H. (1968) Genetic variations of Drosophila melanogaster. Carnegie Institution, Washington, DC.

3 Lindsley, D.L. and Zimm, G. (1992) The Genome of Drosophila melanogaster. Academic Press, San Diego, CA.


*The current members of the FlyBase consortium are: W. M. Gelbart, W. P. Rindone, J. Chillemi, S. Russo, M. Crosby and B. Matthews, Biological Laboratories, Harvard University, Cambridge, MA, USA; M. Ashburner, R. A. Drysdale, A. de Grey and E. J. Whitfield, Department of Genetics, Cambridge University, Cambridge, UK; T. Kaufman, K. Matthews and D. Gilbert, Department of Biology, Indiana University, Bloomington, IN, USA; and C. Tolstoshev, NCBI, Bethesda, MD, USA.