| Nucleic Acids Research | Pages |
LABNOTE, a laboratory notebook system designed for academic genomics groups
Introduction
Software And Database
Database management system
Implementation
Description of database features
Examples of data access procedures
Discussion
Availability Of Software
Acknowledgements
References
ABSTRACT
INTRODUCTION
Many public databases have been established to store and make available all kinds of genomic data, from maps to sequences through catalogues of mutants and protein motifs (1). Recent efforts have been aimed, in particular, at making gene expression data publicly available and at the same time providing users with data analysis tools (2). In addition, large laboratories, such as Genome Centers involved in intensive genome mapping or sequencing, have set up their own database systems to store their results and prepare the data for distribution to the scientific community. Being geared to a particular operation, such systems are rarely made available or even published. One of them, the Genome Notebook, developed primarily to handle results from the human chromosome 11 mapping project, has however been described in some detail (3).
Increasingly, conventional laboratories (i.e. groups of relatively small size operating in an academic environment) are interfacing with the genome project and making use of its results (e.g. sequence or mapping data), but also of resources such as the IMAGE cDNA clone set (4), and of semi-automated procedures that boost throughput by one or two orders of magnitude. This trend results in a large increase in the number of objects and in the amount of information, requiring efficient archiving and easy retrieval of experiments in progress, intermediate results and final data. The traditional approach, i.e. manual notebooks supplemented by a number of computer files in spreadsheet software, can no longer cope with this data flow, and a proper laboratory database becomes necessary.
A number of projects developed in our Institute are centred around genes expressed in the mouse thymus. We use organised cDNA libraries, measure expression levels by hybridisation of DNA arrays with complex probes, and obtain additional information (tag sequence, genome mapping, etc.) for sets of clones selected according to their expression pattern (5-8). Thus the information we wish to store in a laboratory database is largely organised around a list of clones (expression data, sequences, results from Southerns and northerns, etc.) but also includes the description of libraries, the make-up of specific arrays, as well as protocols or publication references. Other ongoing research programmes use sets of a few hundred IMAGE cDNA clones as reagents for expression profiling in various situations; again, good book-keeping is essential to keep track of clone choice, procurement, verification and of expression data. Ready-made membranes provided by several suppliers (http://www.clontech.com/clontech/Catalog/Hybridization/Atlas.html , http://www.genomesystems.com/GDA/ and http://www.resgen.com/ ) as well as by resource centres are also used in some projects and generate a need for data archiving.
To be really useful in the context of a biological laboratory, a notebook system must be extremely user-friendly: it should be used daily by each member of the group, and the interface must be designed with this in mind. It should run well on affordable machines that the prospective users are familiar with, i.e. in most cases on PC or Macintosh microcomputers. The system must be extremely flexible and allow additions and changes to be made without loss of previously entered data, to accommodate easily new experimental approaches or new ways of analysing existing information; data security and access privileges should also be well organised.
We have used the 4th Dimension (ACI) relational database management system (http://www.aci-4D.com/ ) to develop a laboratory database, LABNOTE, aiming to fulfil this need. 4th Dimension (4D) has been used previously for biological databases (9,10), for a large number of (unpublished) medical databases, and for at least one genome mapping database (3). Although some aspects of our implementation are tailored to the specific needs of our project, LABNOTE has proven readily adaptable to other laboratories and we believe it has general applicability in groups performing various genomic and expression studies.
SOFTWARE AND DATABASE
Database management system
LABNOTE was developed in the ACI 4th Dimension Relational DataBase Management System (RDBMS) (http://www.aci-4D.com/ ). Dr G. A. Evans provided us with a version of the Genome Notebook (3) constructed in the same system; although it is geared to a different aim (data handling for a large-scale genome mapping project), it was very helpful in defining data architecture and important relationships. 4th Dimension (4D) is an RDBMS whose relational technology allows modular application development and good control of the appearance of the corresponding views. Its graphic module allows the database structure to be defined by drawing entities and links. The 4D RDBMS is platform-independent: applications developed in e.g. Windows 95, Windows NT, Mac OS or Power PC can be deployed, without changes, on all the other platforms. The request language is specific to 4th Dimension. Referential integrity is automatically implemented (http://www.aci-4D.com/ ).
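To illustrate what automatic referential integrity means in practice, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in and is not 4D code. The table names Library and Clones are those of LABNOTE; the column names, the library name "MTA" and the second clone name are assumptions made for the example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a Library -> Clones link.

    SQLite stands in for 4D here; column names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE Library (name TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE Clones ("
        " name TEXT PRIMARY KEY,"
        " library TEXT NOT NULL REFERENCES Library(name))"
    )
    return conn


def add_clone(conn, clone_name, library_name):
    """Insert a clone; return False if its library does not exist."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Clones (name, library) VALUES (?, ?)",
                (clone_name, library_name),
            )
        return True
    except sqlite3.IntegrityError:
        # The database itself refuses a clone that points to no library.
        return False


conn = build_demo_db()
conn.execute("INSERT INTO Library (name) VALUES ('MTA')")  # hypothetical library
ok = add_clone(conn, "MTA.F02.091", "MTA")       # accepted
orphan = add_clone(conn, "XYZ.A01.001", "XYZ")   # rejected: unknown library
```

The point of the sketch is that the check is made by the database engine and not by application code, so no data entry view can create an orphan record.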
For the complete development of LABNOTE, we used various procedural packs (e.g. ACI-Pack v1.9.2c, Button Package v3.0.2, File Pack v3.0.5) and a number of development tools (e.g. 4D Insider, 4D Transporter, External Mover). LABNOTE was originally developed in ACI 4th Dimension v5.0.2; it presently uses 4D Server v1.5.4, the multi-user version of 4th Dimension. The unified client/server architecture optimises database performance and provides a transparent interface in a heterogeneous hardware environment (PC or Macintosh clients). 4D Server sends each client the requested data in the format adapted to its deployment platform. 4D makes available a number of functionalities that are very useful in an experimental approach such as ours: Format List (allowing work on sets of clones), and Enumerations and Pull-down menus (allowing guided data entry with choice limited to a subset of a previously defined list). In addition, various Import-Export formats are available, such as 4D Format, to save sub-selections of clones, hybridisation results, etc., or Text Format, for the creation of files compatible with a robot used for physical rearraying of selected clones.
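A text export of a clone sub-selection, of the kind used to drive a rearraying robot, might look like the following minimal Python sketch. The robot's actual file format is not described here, so the tab-separated layout, the field names (clone, plate, well) and the plate and well values are all assumptions for illustration.

```python
import csv
import io


def export_selection(clones, selected_names, stream):
    """Write the selected clones as tab-separated text, one clone per line.

    Returns the number of clones written. The column layout is hypothetical.
    """
    wanted = set(selected_names)
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    writer.writerow(["clone", "plate", "well"])
    count = 0
    for clone in clones:
        if clone["name"] in wanted:
            writer.writerow([clone["name"], clone["plate"], clone["well"]])
            count += 1
    return count


# Invented records: the plate and well values are placeholders only.
clones = [
    {"name": "MTA.F02.091", "plate": 1, "well": "A01"},
    {"name": "MTA.A01.001", "plate": 1, "well": "A02"},
]

buffer = io.StringIO()
written = export_selection(clones, ["MTA.F02.091"], buffer)
text = buffer.getvalue()
```

Writing to a stream rather than to a fixed path keeps the sketch independent of where the robot software expects its input file.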
Implementation
A general overview indicating the kind of data stored, the links used and the structural organisation of LABNOTE is given in Figure 1.
Figure 1. General structure of the LABNOTE database. The four main tables around which the database is organised are Library, Clones, Experiments and Sequences, each with a number of fields and a few small subtables (thin black links). Active links between the files are indicated in blue. Experiments are an open and evolving category; they are defined by type and IdExp and are accessed ad hoc without automatic links (see the three types of experiments displayed, i.e. PCR, Hybrid HD and BlotFilter). This makes it possible to add further kinds of experiments without modifying the database structure.
We primarily use LABNOTE via its main menu toolbar, which is divided into three submenus: Tools, Experiments and Results (Figure 2).
Figure 2. Submenus, directly accessible items and links between them. The Tools menu leads to all information concerning cDNA libraries (Library), High Density filters (HD Filters) prepared from these, experimental protocols (Protocol) used for all experiments stored in LABNOTE, and some bibliographic references (Biblio Reference). The Experiments menu leads to results from High Density membrane hybridisations (HD hybridisations), northern blots (Northern Blots) and Polymerase Chain Reaction (PCR). For these experiments, some experimental conditions, an author name, a creation date, results and free comments are stored. Results menu: in LABNOTE all information concerning any clone (Clones) and any sequence (Sequences) is stored as results. Data entry for these results is restricted, and modifications can be made only with specific authorisation.
Description of database features
The two tables harboring the most significant data (in terms of quantity and of relevance to the principle of analysis and synthesis of the results) are Hybrid HD and Clones. Hybrid HD contains in particular: the name of the filter hybridised (FilterName), the name of the probe used (Probe), the name of the image filter (FileImage), the name of the author of the experiment (Author) and the result of the quantification (Quantification): more than 20 000 currently.
Examples of data access procedures
To highlight the main features of this database, we present here some examples of its use:
Figure 3. Example of information accessible from a clone name. Top: Clone view, accessed e.g. by giving the clone name (MTA.F02.091) in the Clone List accessible from the Results submenu, and a few of the views that can be accessed from there: hybridisation experiment (middle left), northern image (bottom left; quantified data is also stored, as well as the makeup of the northern blot), interpreted sequence summary (middle right), actual sequence and comparison results (bottom right). More information is accessible from the Clone view, e.g. rearrayed sets (Plate), PCR data, library of origin, etc.
One sequence and its functionalities (not shown). After (partial or complete) sequencing of a clone, its sequence is stored in LABNOTE; vector sequences can be removed, and the sequence can be used for comparison using Internet tools. Comparison results can be directly stored in LABNOTE.
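One such procedure is the extraction of hybridisation data for a defined set of clones, with a blank wherever a clone has not been hybridised with a given probe (Figure 4). The logic can be sketched in Python as below. This is an illustration of the idea and not LABNOTE's 4D code; the record layout (clone, probe, value), the probe names and the second clone name are assumptions, and CSV is used as a spreadsheet-readable stand-in for the EXCEL table.

```python
import csv
import io


def probes_seen(records, clone_names):
    """Return the sorted probes hybridised to at least one listed clone."""
    wanted = set(clone_names)
    return sorted({probe for clone, probe, _ in records if clone in wanted})


def extract_table(records, clone_names, probes):
    """Build a clone-by-probe table; '' marks absence of data."""
    values = {(clone, probe): value for clone, probe, value in records}
    rows = [["clone"] + list(probes)]
    for clone in clone_names:
        rows.append([clone] + [values.get((clone, p), "") for p in probes])
    return rows


def to_csv(rows):
    """Serialise the table as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Invented hybridisation records: (clone, probe, normalised value).
records = [
    ("MTA.F02.091", "thymus", 1.0),
    ("MTA.F02.091", "brain", 0.2),
    ("MTA.A01.001", "thymus", 3.5),
]
clone_list = ["MTA.F02.091", "MTA.A01.001"]

available = probes_seen(records, clone_list)            # step 1: list probes
table = extract_table(records, clone_list, ["thymus", "brain"])  # step 2: user's choice
sheet = to_csv(table)
```

Keeping the probe listing and the table construction as two separate steps mirrors the two user actions: first see which probes exist for the clone set, then choose the probes of interest.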
Figure 4. Extraction of multiple hybridisation data for a defined set of clones. A list of clones is imported into LABNOTE or defined within the Clone List using the sorting tools available (tool bar). The Prob. button calls up a list of probes that have seen any of these clones. The user indicates which probes are of interest, and an EXCEL table giving the data is generated (a value of 1 corresponds to an mRNA abundance of 0.1% after correction and normalisation; 7). Blanks correspond to absence of data, i.e. clones that have not been hybridised with this particular probe.
DISCUSSION
The definition of the basic database structure for LABNOTE was a long, interactive and iterative process, beginning from the start of the experimental project. At the outset, the members of the group had little idea of what a laboratory database could achieve and how it should be organised; the example of the Genome Notebook (3) was very useful to give a feeling for what was possible. After a number of meetings and discussions with the developer (who was not familiar with our experimental approaches), a first working database was constructed and tested; successive versions were produced until a reasonably adapted (but still evolving) system was in operation.
LABNOTE has proven absolutely essential to our research. The attachment to any clone of virtually all the information ever obtained is extremely useful, has avoided many unnecessary experiments and has helped tremendously to achieve close collaboration between several groups that are interested in different aspects of thymus function but use the same technology and the same libraries to find new genes relevant to their interests. The flexibility of the system has been amply demonstrated as new features demanded by the users were incorporated into successive versions without loss of previously entered data. Users can provide comments on their experiments or analyses, and data is available for reanalysis at any subsequent time in the light of newly acquired information.
In our operation, expression data is first acquired as text files generated by image analysis of phosphor plate data. These results are then imported into EXCEL spreadsheets, where a number of correction and normalisation procedures are carried out by macro commands. The verified expression data is then transferred to the database, from which it can later be exported for specific subsets of clones as indicated in the examples.
The system as described in this paper handles many different kinds of information, and can be used with little adaptation in different projects; variations on this theme can be produced fairly easily thanks to the powerful tools provided by the 4th Dimension system, although this requires additional software and some programming expertise. Hard disk requirements are modest: the whole database structure occupies 1.5 Mb, and all our present data (including more than 43 000 clones and 10 000 tag sequences), 43 Mb. The new version 6 of 4D provides Web support; this will allow any Web browser to interact with the database. In conclusion, we believe that this system represents a very practical approach to better laboratory management in an academic environment.
AVAILABILITY OF SOFTWARE
The current version of LABNOTE is freely available to academic users; the only necessary commercial software is the relatively inexpensive 4D runtime package. It can be provided in single-user version for PC or Macintosh machines; a Windows NT version is also available. Adaptation to the specific needs of a given laboratory may require programming additions or changes that we cannot undertake to perform. Please contact jordan{at}ciml.univ-mrs.fr for details. The Resource Centre of the British Human Genome Mapping Programme will shortly make a version of this system available to its users.
ACKNOWLEDGEMENTS
We thank Dr G. A. Evans for communication of the (then unpublished) Genome Notebook database (3). This work was supported by institutional grants from INSERM and CNRS to our Institute, as well as by specific grants to the TAGC group from the French Muscular Dystrophy Foundation (AFM) and GREG (Groupement de Recherches et d'Etudes sur les Génomes).
REFERENCES
This page is run by Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, as part of the OUP Journals
Comments and feedback: www-admin{at}oup.co.uk
Last modification: 23 Dec 1998
Copyright©Oxford University Press, 1998.