Nucleic Acids Research
FlyBase: a Drosophila database
Background
Scope
FlyBase identifier numbers
Genes, alleles and aberrations
Nomenclature and synonyms
Map data
Bibliographic references
Nature of gene product(s)
Molecular data
Database cross-references
Similar genes in other organisms
Stock lists
People directory
Allied data
Future plans
Implementation
Access
Documentation
Addresses
Referencing FlyBase
Acknowledgements
References
FlyBase: a Drosophila database
ABSTRACT
BACKGROUND
Drosophila melanogaster is one of the most studied eukaryotic organisms. Introduced to modern biology in the early years of this century, research with D.melanogaster has been at the forefront of most areas of biology, from genetics to ecology, from neurobiology to evolution. Drosophila geneticists have been well served by a series of catalogs of mutations, the first of which was published in 1925, and by regular publication (again, dating from 1925) of bibliographies of the Drosophila literature. The last conventional catalog of the genes and mutations of D.melanogaster was published in 1992 (1).
Since October 1992, the National Center for Human Genome Research of the NIH (now the National Human Genome Research Institute) has funded the FlyBase project with the objective of providing a database of genetic and molecular information concerning this insect. FlyBase also receives support from the Medical Research Council, London. From January 1, 1998, the FlyBase Consortium will include members of both the Berkeley and European Drosophila Genome Projects (see below).
SCOPE
The core of FlyBase is data concerning the genes and mutations of Drosophila:
- Gene name; gene symbol; synonyms; genetic map position; polytene chromosome map position; nature of gene product(s); molecular data; gene expression pattern data; similar genes in other organisms; database cross-references.
- Allele(s) name; allele(s) symbol; allele(s) synonyms; origin of allele; phenotypic information; molecular data.
- Chromosome aberrations.
- Clones (cosmids, P1s, YACs).
- Molecular data, including molecular maps and transcripts.
- Transposons, transgenes and their genomic insertions.
- Bibliographic references.
- Stock lists.
- People.
- Allied data sets.
- FlyBase data sets
In their present form these data combine information in a highly structured, parseable form and free text. All of the data are available as flat (ASCII) files, the majority being the output of selected data sets from the relational database implementation of FlyBase.
The taxonomic scope of FlyBase is the family Drosophilidae.
FlyBase identifier numbers
All data classes listed below have unique identifiers in FlyBase. These allow them to be referenced both within FlyBase and externally. FlyBase identifiers are of the form: FBxxnnnnnnn, where xx is a two-letter code signifying the type of identifier, and nnnnnnn is a 7-digit number padded with leading zeros. Identifier codes now used are:
| FBgn | gene identifier (eg FBgn0001234) |
| FBal | allele identifier |
| FBab | aberration identifier |
| FBrf | bibliographic reference identifier |
| FBsp | species identifier |
| FBmc | construct identifier |
| FBba | balancer identifier |
| FBtp | engineered transposon |
| FBti | transposon insertion identifier |
| FBtr | transcript identifier |
| FBpr | protein identifier |
| FBms | molecular component identifier |
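The identifier layout described above lends itself to simple mechanical checks. The following is a minimal sketch in Python, not FlyBase software: the function names `parse_fb_id` and `format_fb_id` are hypothetical, and only a subset of the class codes from the table is included for brevity.

```python
import re

# "FB", a two-letter class code, then exactly seven digits.
FB_ID = re.compile(r"FB([a-z]{2})(\d{7})")

# A subset of the class codes listed in the table (illustrative only).
CLASS_CODES = {
    "gn": "gene",
    "al": "allele",
    "ab": "aberration",
    "rf": "bibliographic reference",
    "ti": "transposon insertion",
    "tr": "transcript",
    "pr": "protein",
}


def parse_fb_id(identifier):
    """Return (class name, integer number) for a well-formed identifier."""
    match = FB_ID.fullmatch(identifier)
    if match is None or match.group(1) not in CLASS_CODES:
        raise ValueError(f"not a recognised FlyBase identifier: {identifier!r}")
    return CLASS_CODES[match.group(1)], int(match.group(2))


def format_fb_id(code, number):
    """Build an identifier, padding the number to seven digits with zeros."""
    if code not in CLASS_CODES:
        raise ValueError(f"unknown class code: {code!r}")
    return f"FB{code}{number:07d}"
```

With these definitions, `parse_fb_id("FBgn0001234")` yields the class name and the number 1234, and `format_fb_id("gn", 1234)` reproduces the original string.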
Genes, alleles and aberrations
As of September 1, 1997, FlyBase included information on 15 401 genes (9704 D.melanogaster), 50 624 alleles (39 622 D.melanogaster) and 15 413 chromosome aberrations (12 809 D.melanogaster). Except for some historical data, inherited from Lindsley and Zimm (1) for example, all data are attributed to a single publication (including personal communications to FlyBase; these are archived and made accessible to users). Search tools allow data on genes, their alleles and aberrations to be retrieved by a simple query. The data are a mixture of free text and controlled syntax. FlyBase uses a standard controlled vocabulary of terms to describe, for example, mutagens and anatomical parts of Drosophila.
Nomenclature and synonyms
The genetic nomenclature of D.melanogaster is chaotic (though perhaps not in the technical sense of this word). FlyBase has written and maintains a document on nomenclatural standards for the community. The synonymy of Drosophila gene, allele and aberration names is very extensive. FlyBase attempts to record all synonyms (40 451 as of September 1, 1997) and search tools are designed to allow the recovery of records by synonym.
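Recovery of records by synonym can be pictured as a lookup table from every name to the records that carry it. The sketch below is an illustration under stated assumptions, not the FlyBase implementation: the record layout, the function names and the two sample records are all invented. Names are compared case-sensitively, because case is significant in Drosophila gene symbols.

```python
from collections import defaultdict


def build_synonym_index(records):
    """Map each symbol and synonym to the set of record identifiers using it."""
    index = defaultdict(set)
    for record in records:
        for name in (record["symbol"], *record["synonyms"]):
            index[name].add(record["id"])
    return index


def lookup(index, name):
    """Return the identifiers filed under a name, sorted; empty if unknown."""
    return sorted(index.get(name, ()))


# Invented records for illustration only; these are not real FlyBase data.
RECORDS = [
    {"id": "FBgn0000001", "symbol": "abc", "synonyms": ["l(1)A1", "Xyz"]},
    {"id": "FBgn0000002", "symbol": "Abc", "synonyms": ["Xyz"]},
]
```

A synonym shared by two genes (here the invented name `Xyz`) returns both identifiers, which is the situation a curator or user would then need to resolve by hand.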
Map data
All map data are stored in FlyBase in a common form, regardless of whether these data are genetic, cytogenetic or molecular. This allows FlyBase to output integrated maps in a variety of formats.
A major project on the analysis of map data is now complete and this allows the automatic generation of genetic and cytogenetic maps and the identification of data conflicts.
Tools have been written to output map information in a variety of forms, including tabular and graphical. Querying the maps by position is an important feature for users of FlyBase. The CytoSearch tool allows several forms of query, e.g. `output all of the genes known to map between 35B1 and 35C1 on the polytene chromosomes', `output all of the deletions that uncover 35B1', `output all of the cosmid clones on the X chromosome'. Maps can also be viewed using a graphical tool that allows the display of selected classes of object (genes, aberrations, clones, transposon insertions) from images that represent the chromosomes.
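A position query of the first kind quoted above can be sketched as follows. This is not the CytoSearch code; it assumes that each object carries a single polytene band written as division number, subdivision letter (A-F) and band number, whereas real map positions are often ranges or uncertain. The names `cyto_key` and `genes_between` and the gene labels in the usage note are hypothetical.

```python
import re

# Division (digits), lettered subdivision (A-F), band number.
CYTO = re.compile(r"(\d{1,3})([A-F])(\d{1,2})")


def cyto_key(band):
    """Turn a band such as '35B10' into a tuple that sorts in map order."""
    match = CYTO.fullmatch(band)
    if match is None:
        raise ValueError(f"unrecognised band: {band!r}")
    return int(match.group(1)), match.group(2), int(match.group(3))


def genes_between(genes, start, end):
    """Return symbols whose band lies between start and end, inclusive."""
    low, high = cyto_key(start), cyto_key(end)
    return [symbol for symbol, band in genes if low <= cyto_key(band) <= high]
```

For example, given the invented pairs `("geneA", "35A4")`, `("geneB", "35B1")`, `("geneC", "35B10")` and `("geneD", "35C1")`, the call `genes_between(genes, "35B1", "35C1")` keeps geneB, geneC and geneD. Comparing parsed tuples rather than raw strings matters here: as plain text, `35B10` would sort before `35B2`.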
For a growing number of genes graphical maps are available. These display gene structure, sites of transposon insertion, aberration breakpoints, limits of transformation rescue fragments, etc.
Bibliographic references
A key feature of FlyBase is a comprehensive bibliography of conventional and unconventional publications (e.g., films, archival material and even newspaper articles) on the family Drosophilidae, covering all aspects of its study. This bibliography includes the complete texts of all of the published Drosophila bibliographies, and information from major external resources, such as MEDLINE, BIOSIS, the Zoological Record and the Environmental Mutagen Information Center (by permission). The bibliography is updated from these and other sources. To ensure consistency there is a satellite file of all `multi-publication' sources, e.g. journals and edited publications, which includes full names, dates and places of publication, volume number ranges, and ISBNs or ISSNs and CODENS. By far the greater part of these data have been checked on the Library of Congress and British Library online catalogs. Bibliographic records are coded as to type (e.g., journal article, abstract, review, thesis, book, film). As of September 1, 1997, the number of bibliographic records was 88 206, including 4491 theses and 19 764 abstracts.
FlyBase maintains a collection of offprints of publications on Drosophila, housed in Cambridge, UK. This collection (>32 000 items) is cross-referenced with the bibliography, and copies of obscure publications can be supplied on request.
Nature of gene product(s)
FlyBase classifies the nature of a gene's product in two ways: `structure' and `function'. For `structure' FlyBase relies on cross-references to the PROSITE database. If no such cross-reference exists then FlyBase uses a vocabulary modelled on that of PROSITE to give an indication of the structural feature(s) of a gene's product. For `function' FlyBase uses the EC name and EC number for those products that are enzymes (and are included in the ENZYME database). For products that are not enzymes (or are not included in the ENZYME database) FlyBase uses a controlled vocabulary to summarise the (molecular) nature of a gene's product.
FlyBase has developed, with others, a hierarchical classification of biological processes to be used to classify gene functions. It is hoped that this will soon be implemented.
To describe the location of gene products and mutant phenotypes FlyBase has developed a very extensive hierarchical controlled vocabulary of the gross and sub-cellular anatomy of Drosophila.
Molecular data
FlyBase curates information on the molecular organization of genes and detailed information on transcripts and protein products and their expression. The expression pattern curation uses a controlled vocabulary for the description of anatomy and life stages. Search tools for expression patterns are under development. These data can be accessed via FlyBase gene reports.
FlyBase is developing tools for the curation of sequences. These will allow the capture of sequence-related information, both from the primary sequence records and from the literature. For those regions of the genome sequenced by the Berkeley or European Drosophila Genome Projects, their sequences will form the backbone of a virtual sequence of D.melanogaster.
FlyBase collaborates very closely with both the Berkeley and European Drosophila Genome Projects. An integrated list of P1, cosmid and YAC clones from these projects is available and can be searched by cytological location (see Future Plans, below).
FlyBase curates the structure of artificial constructs (including plasmids, vectors and constructs used for transformation). These data are reported via graphical maps of transposons and plasmids linked to sequence data and, where appropriate, to mutant alleles and publications. For each transposon and vector there are links to the components used in its construction.
Database cross-references
FlyBase extensively cross-references its objects with those in other genetic and molecular databases. FlyBase receives updates of new and revised records from the DDBJ/EMBL/GenBank databases and stores their primary accession numbers and Protein Identifier Numbers (PIDs) in the gene, allele or aberration records. FlyBase also stores cross-references (by accession number) to SwissProt, TREMBL and PIR, as well as to the Eukaryotic Promoter Database (EPD), dbSTS, dbEST, TRANSFAC, PDB, NRL_3D and GCR databases. FlyBase now includes >7050 accession number cross-references to the DDBJ/EMBL/GenBank database, 2576 to SwissProt and TREMBL and 1941 to PIR. FlyBase also cross-links to other genetic databases (see below). FlyBase provides these external databases with flat file tables of their accession numbers linked to FlyBase accession numbers, encouraging reciprocal DBXREF links.
Similar genes in other organisms
One of the most urgent needs for those building genetic databases is a stable mechanism to cross-reference genes (and other objects) between organisms. In the absence of such a mechanism FlyBase now simply includes the gene symbol and organism of loci said, by investigators, to encode a similar (or homologous) product. These cross-references (3077 as of September 1, 1997) include the gene symbol approved by the appropriate community (e.g., HUGO) and, where possible, the gene's accession number in the appropriate database (OMIM, GDB, MGD, ECOGENE, Saccharomyces Genome Database). Some of these links (e.g., with GDB) are now reciprocal. A major joint project with the Mouse Genome Database has, in 1997, greatly improved the links between FlyBase and MGD.
Stock lists
FlyBase provides access to the stock lists of the two major stock centers for D.melanogaster (Bloomington and Umea) and to that of the Drosophila species stock center at Bowling Green. It also provides access to the stock lists of individual laboratories, if these are provided to FlyBase. FlyBase works with the major Drosophila stock centers to ensure consistency of nomenclature.
People directory
FlyBase maintains a directory of names, addresses, telephone and fax numbers, email addresses and URLs of people in the Drosophila community. Those with particular roles in the community (e.g., principal investigators, stock-keepers, members of the Drosophila Board) are tagged. There are now >5400 records in this directory.
Allied data
FlyBase cannot, and should not, be wholly comprehensive. We encourage others to build specialised databases. At present FlyBase offers help in linking these to FlyBase (by the use of FB identifiers, for example) and in making these available through the FlyBase servers. Several databases of allied data are now available through FlyBase: these include a complete list of valid species in the family Drosophilidae (Dr G. Baechli, Zurich), a Drosophila codon usage table and the Drosophila records of the Transcription Termination Signal Database. All Drosophila records of the Environmental Mutagen Information Center are also available.
FlyBase has a depository for images (flybase/allied-data/images).
Although not allied data, FlyBase makes the complete unchanged text of Lindsley and Zimm (1) available (by permission of Academic Press) and keeps a file of errors in this book that have been noticed. The text of the earlier Lindsley and Grell (2) is also available on FlyBase. There is also a file of errors noticed in Ashburner's Drosophila. A Laboratory Handbook and Manual (Cold Spring Harbor, 1989).
The Interactive Fly (developed and maintained by Tom and Judy Brody) is an Allied database of cellular and developmental pathways. FlyBase has developed a hierarchical browser for the Interactive Fly, allowing access to FlyBase genes based on their cellular/developmental relationships.
FUTURE PLANS
From January 1, 1998, a new, enlarged, FlyBase Consortium will be formed. This will include members from the informatics teams of both the Berkeley and European Drosophila Genome Projects. FlyBase will become the single public view of data from these projects. In particular, the enlarged FlyBase Consortium will provide, through a newly designed database, graphical views of annotated genomic sequence data. These will provide a close integration of genetic and sequence data.
IMPLEMENTATION
FlyBase is built with a relational database management system (Sybase). The present schema has been implemented for most of the data and most files accessed via the FlyBase servers are the products of the Sybase tables.
FlyBase data are maintained by curators working from the literature and filling in standard forms that are parsed into the Sybase tables.
ACCESS
FlyBase provides users with a variety of modes of access: http, gopher, e-mail, ftp of flat files.
The primary FlyBase server has the following addresses:
| http://flybase.bio.indiana.edu/ | http access |
| flybase.bio.indiana.edu 72 | gopher access |
| flybase.bio.indiana.edu (in /flybase) | ftp access |
| flybase-gopher{at}indiana.edu | e-mail access |
Mirror sites are available in Europe, Japan, Australia and the USA. The major mirror sites are now:
| http://www.embl-ebi.ac.uk/flybase/ | http access |
| gopher.embl-ebi.ac.uk 7071 | gopher access |
| ftp.embl-ebi.ac.uk (in /pub/databases/flybase) | ftp access |
| http://astorg.u-strasbg.fr:7081/ | http access |
| astorg.u-strasbg.fr:7071/1 | gopher access |
| http://www.angis.su.oz.au:7081/ | http access |
| http://shigen.lab.nig.ac.jp:7081/ | http access |
| shigen.lab.nig.ac.jp 7071 | gopher access |
| http://cbbridges.harvard.edu:7081/ | http access |
| cbbridges.harvard.edu 7071 | gopher access |
The flat files derived from the Sybase tables are often available in several formats, as well as being indexed for SRS queries. For example, the bibliography is available in Unix REFER format (which can be used by many bibliographic packages) as well as in text and `comma-separated-values' formats. The genetic data are available in readable text formats and in a format in which different fields are coded (the latter allow users to write simple code to construct their own queries on the data).
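As an illustration of the kind of simple code a user might write against these files, the sketch below reads records in Unix REFER format, in which each field line begins with `%` and a one-letter key (for example `%A` author, `%T` title, `%J` journal, `%D` date, `%V` volume, `%P` pages) and records are separated by blank lines. Which fields the FlyBase bibliography file actually uses is not stated here, so the sample record is an assumption, built from the citation suggested for FlyBase itself; the function name `parse_refer` is hypothetical.

```python
def parse_refer(text):
    """Parse REFER-format text into a list of {key: [values]} dictionaries."""
    records, current, last_key = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current record.
            if current:
                records.append(current)
            current, last_key = {}, None
        elif line.startswith("%") and len(line) > 1:
            # A new field; keys such as %A may repeat, so keep a list.
            last_key = line[1]
            current.setdefault(last_key, []).append(line[2:].strip())
        elif last_key is not None:
            # A continuation line extends the most recent field.
            current[last_key][-1] += " " + line.strip()
    if current:
        records.append(current)
    return records


# Sample record assembled from the suggested FlyBase citation.
SAMPLE = """%A FlyBase
%D 1998
%T FlyBase: the Drosophila Database
%J Nucleic Acids Res.
%V 26
%P 85-88
"""
```

Once parsed, a query is an ordinary filter, for instance selecting every record whose `D` field contains a given year.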
FlyBase publishes a subset of the data in printed form as special issues of Drosophila Information Service (DIS). Two such issues were published in April 1997. DIS 78 includes gene-order maps, and maps of transposons and vectors; DIS 79 includes a bibliography of publications on Drosophila (1994-1996 and supplement for 1982-1993), Drosophila nomenclatural guidelines and the controlled-vocabulary of anatomical terms for Drosophila.
Interaction with the user community is vital for the success of FlyBase. We encourage the submission of new data, the correction of errors, and ideas for making this database of even greater use to the community.
DOCUMENTATION
A complete FlyBase Reference Manual is available from FlyBase servers in a variety of formats (html, rtf, Postscript and text). A brief introduction, `Getting started with FlyBase', is also available.
Major database updates and the release of new tools are announced through postings to the bionet.drosophila newsgroup. FlyBase users are encouraged to use this newsgroup to track changes to FlyBase.
ADDRESSES
Requests for help and questions about FlyBase should be addressed to flybase-help{at}morgan.harvard.edu. Reports of errors in FlyBase, or data updates, should be addressed to flybase-updates{at}morgan.harvard.edu.
Mail may be addressed to FlyBase, Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA.
REFERENCING FLYBASE
We suggest that FlyBase be referenced as follows:
FlyBase (1998) FlyBase: the Drosophila Database. Nucleic Acids Res. 26, 85-88. Available from http://flybase.bio.indiana.edu/
We suggest that the abbreviation FB be used for FlyBase, regardless of the particular FlyBase product.
ACKNOWLEDGEMENTS
FlyBase is supported by grants from the National Institutes of Health (National Human Genome Research Institute) and the Medical Research Council, London. The Berkeley Drosophila Genome Project is supported by a grant from the National Institutes of Health (National Human Genome Research Institute) to G.M. Rubin; the European Drosophila Genome Project is supported by a contract from the European Union (coordinated by D.M. Glover).
REFERENCES
Last modification: 16 Dec 1997
Copyright© Oxford University Press, 1998.