| Nucleic Acids Research | Pages |
The Radiation Hybrid Database
Introduction
The Radiation Hybrid Database
Information technology
The data
Data submission
Data access
Data query/retrieval
World Wide Web
CORBA
Future Developments
How to Contact RHdb at the European Bioinformatics Institute
Acknowledgement
References
ABSTRACT
INTRODUCTION
The radiation hybrid mapping technique (1,2) orders markers along a chromosome and gives estimates of the physical distances between them. Radiation hybrids are produced by fusing irradiated donor cells with recipient rodent cells. The resulting hybrid cell lines are grouped into panels of clones, each clone containing a different set of chromosome fragments produced by radiation-induced breakage. The clones are screened by PCR amplification (producing 'scoring data') to establish the presence or absence of a given marker. The pattern of presence and absence of a marker across the clones of a panel is its retention pattern, or score vector; nearby loci tend to show similar score vectors because breakage between them is rare. From these results the proximity of markers can be calculated using a statistical model. The support for a marker position is expressed as a likelihood or LOD score (3,4).
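The comparison of score vectors described above can be sketched in a few lines of Python. This is a minimal illustration, not the statistical model of references (3,4): it assumes a simplified two-point model in which both markers share one retention probability `r` and are separated by a break with probability `theta`, and it finds `theta` by a plain grid search. The function names, the '0'/'1' encoding of scores and the example vectors are all assumptions made for this sketch.

```python
import math


def joint_counts(a: str, b: str) -> dict:
    """Count clones by joint retention pattern of two markers.

    '1' means the marker was retained in the clone, '0' that it was absent.
    Clones where either score is any other symbol are skipped.
    """
    counts = {"11": 0, "10": 0, "01": 0, "00": 0}
    for x, y in zip(a, b):
        if x in "01" and y in "01":
            counts[x + y] += 1
    return counts


def concordance(a: str, b: str) -> float:
    """Fraction of scored clones in which both markers agree."""
    c = joint_counts(a, b)
    n = sum(c.values())
    return (c["11"] + c["00"]) / n if n else float("nan")


def log10_likelihood(c: dict, theta: float, r: float) -> float:
    """log10 likelihood of the counts under the simplified model."""
    p = {
        "11": (1 - theta) * r + theta * r * r,
        "10": theta * r * (1 - r),
        "01": theta * r * (1 - r),
        "00": (1 - theta) * (1 - r) + theta * (1 - r) ** 2,
    }
    return sum(n * math.log10(p[k]) for k, n in c.items() if n)


def two_point_lod(a: str, b: str, steps: int = 1000):
    """Return (theta_hat, LOD), where LOD compares theta_hat with theta = 1."""
    c = joint_counts(a, b)
    n = sum(c.values())
    if n == 0:
        raise ValueError("no clone was scored for both markers")
    # Pooled retention frequency over both markers (a simplifying assumption).
    r = (2 * c["11"] + c["10"] + c["01"]) / (2 * n)
    if not 0 < r < 1:
        raise ValueError("retention frequency must lie strictly between 0 and 1")
    grid = [i / steps for i in range(1, steps + 1)]  # theta in (0, 1]
    theta_hat = max(grid, key=lambda t: log10_likelihood(c, t, r))
    lod = log10_likelihood(c, theta_hat, r) - log10_likelihood(c, 1.0, r)
    return theta_hat, lod


if __name__ == "__main__":
    marker_a = "0010110100"  # invented score vectors for a 10-clone panel
    marker_b = "0010110110"
    print(concordance(marker_a, marker_b))
    print(two_point_lod(marker_a, marker_b))
```

Two markers with identical score vectors give the smallest `theta` on the grid and a high LOD, while unrelated vectors give a LOD near zero. Real panels contain many more clones, which is what gives the statistic its power.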
Radiation hybrid methods can be used to map non-polymorphic markers such as sequence tagged sites (STSs). Expressed sequence tags (ESTs) are particularly attractive in this respect and are frequently used.
THE RADIATION HYBRID DATABASE
RHdb is a repository of raw data relevant to radiation hybrid mapping. It was set up in 1995 to support a group of European and US genome mapping laboratories. RHdb stores data on panels, experimental conditions, STSs and the experimental results of assays. Maps are stored as well, together with author and bibliographic information. Extensive cross-referencing to other databases is another important aspect of RHdb. The database is species independent, but currently contains only human data.
The Radiation Hybrid database can be accessed on the World Wide Web at http://www.ebi.ac.uk/RHdb/ (Fig. 1). This page provides information about the database and access to reports and query tools. The first release (July 1995) contained 1115 assay entries; release 6 (August 1996) contained 28516 entries, while the most recent release (release 9, August 1997) contains 62953 entries, showing that growth is substantial.
Figure
Information technology
RHdb is stored and maintained in the relational database management system (RDBMS) ORACLE. Using RHdb as a test-bed, we are also evaluating other information technologies, such as object-relational (Illustra, Informix) and object-oriented (EyeDB) databases and, more recently, CORBA (see below). A correct model of the data at hand is important: it should be rich, yet simple, because such a model makes the data easier to understand, query and maintain. We have used the object-oriented design tool Rational Rose to model our data, and the resulting model has been implemented on a variety of platforms. The model is described at http://www.ebi.ac.uk/RHdb/SCHEMA/RHdb_object.html
The data
The primary type of data is the assay entry: the scoring results of the PCR amplification of a particular STS (with given primers) on a particular panel by a particular laboratory. Each assay entry is given an accession number of the form RHn, which is a permanent unique identifier.
There are three other primary types of entries in the database: (i) panels, with information about the authors, distributors and clones; the panel name serves as the 'accession number'; (ii) maps, with the order, position and LOD scores of assays; they have accession numbers of the form Cmn; (iii) PCR experimental conditions in free text format, with accession numbers of the form EIn.
An RH entry also includes other types of information: (i) author/laboratory identification (of the form Cin); (ii) bibliographic information (of the form PIn), if available; (iii) a 'flag' describing the type of STS used (EST, genetic marker, CpG island, etc.). This classification of STSs makes it possible to query subsets of the data.
Cross-references to other databases are systematically added when they are given by the submitter, or when they can be inferred. There are now >170 000 cross-references to the following databases: ATCC; CGM-WUSM; CHLC; GDB; Généthon; Genexpress; IMAGE; KDRI; NHGRI; PAGE; Rhalloc; RHdb; SALK; SangerSTS; TIGR; UCHSC; UT; WICGR; WTCHG; dbEST; dbSTS. Details and statistics concerning the data with regard to marker types, chromosomes and database cross-references can be found at http://www.ebi.ac.uk/RHdb/STATS/rhdb_stat.html
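To make the structure of an assay entry concrete, the following Python sketch models one as a small record. It is an illustration only and not the RHdb schema: the field names are invented, the accession check assumes that the n in RHn is a decimal number, and the example entries are made up.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    """Hypothetical in-memory view of one RH assay entry."""

    accession: str   # assumed to look like 'RH' followed by digits
    sts_name: str    # name of the STS that was amplified
    panel: str       # panel name (serves as the panel's accession number)
    laboratory: str  # submitting laboratory
    scores: str      # score vector, one symbol per clone of the panel
    sts_type: str = ""  # the 'flag': e.g. EST, genetic marker, CpG island
    cross_refs: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not re.fullmatch(r"RH\d+", self.accession):
            raise ValueError(f"not an RH accession number: {self.accession!r}")


def select_by_type(assays: List[Assay], sts_type: str) -> List[Assay]:
    """Return the subset of assays whose STS carries the given flag."""
    return [a for a in assays if a.sts_type == sts_type]


if __name__ == "__main__":
    # Invented example entries; names and scores are placeholders.
    entries = [
        Assay("RH1", "sts-a", "PANEL_A", "lab-1", "0010110100", "EST"),
        Assay("RH2", "sts-b", "PANEL_A", "lab-1", "0110010100", "genetic marker"),
        Assay("RH3", "sts-c", "PANEL_A", "lab-2", "1000110101", "EST"),
    ]
    print([a.accession for a in select_by_type(entries, "EST")])
```

The flag-based selection is the kind of subset query that the STS classification is meant to support.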
Data submission
RHdb is intended primarily as a public repository of data relevant to the (re)construction of maps. Since experimental results often come in large batches, the submission process is largely automatic and able to handle large quantities of data. Flat file submissions use a tagged-field format; a full description of the formats can be found at http://www.ebi.ac.uk/RHdb/rh_formats.html . A syntax verification program for this format is provided on the EBI anonymous FTP server at ftp://ftp.ebi.ac.uk/databases/RHdb/softs/rh_submit.tar.gz . Entries should be submitted by e-mail to rhdb@ebi.ac.uk.
After syntax checking, the data are subjected to additional tests: (i) the number of scores is checked to be equal to the size of the panel; (ii) species information is verified against that in the EMBL/GenBank/DDBJ sequence database (5,6), using the mandatory cross-references to this database; (iii) similarly, the primers are checked against the actual sequence; (iv) cross-references between RHdb entries are added.
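Checks (i) and (iii) above lend themselves to a short sketch. The Python below is one plausible way to perform them and is not the RHdb submission software: the function names are invented, and the primer check assumes that the forward primer should occur in the cross-referenced sequence and that the reverse primer, written 5' to 3', should occur as its reverse complement further downstream.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def score_count_ok(scores: str, panel_size: int) -> bool:
    """Check (i): one score per clone of the panel."""
    return len(scores) == panel_size


def primers_match(sequence: str, forward: str, reverse: str) -> bool:
    """Check (iii), under the assumptions stated in the text above."""
    seq = sequence.upper()
    fwd = forward.upper()
    start = seq.find(fwd)
    if start < 0:
        return False
    # Look for the reverse primer site only downstream of the forward primer.
    site = reverse_complement(reverse.upper())
    return seq.find(site, start + len(fwd)) >= 0


if __name__ == "__main__":
    # Invented sequence and primers, for illustration only.
    sequence = "GGATCCAAGCTTACGTACGTTTGACCTGAATTC"
    print(score_count_ok("0010110100", 10))
    print(primers_match(sequence, "GGATCCAAG", "GAATTCAG"))
```

A production check would also have to tolerate mismatches and ambiguity codes, which this sketch deliberately ignores.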
Data access
All data types (RH assays, experimental conditions, panels and maps) can be retrieved as ASCII files in a tagged-field format described at http://www.ebi.ac.uk/RHdb/rh_formats.html . The files are available on the EBI anonymous FTP server at ftp://ftp.ebi.ac.uk/databases/RHdb with the filenames. Incremental updates are made available in the same directory.
Figure
Data query/retrieval
The flat file data can be queried through Etzold's SRS (8) at the URL: http://www.ebi.ac.uk/srs/srsc/ . In addition, a Java applet accessing a prototype CORBA server for database queries has been implemented. It can be found at http://sunny.ebi.ac.uk/EBI/RHdb/ProtoClient/
Figure
World Wide Web
Reports are automatically generated every night. They can be accessed through the EBI WWW server at http://www.ebi.ac.uk/RHdb/rh_reports.html
CORBA
Biological databases are traditionally distributed as flat files. Flat files have a number of deficiencies: (i) parsing can be non-trivial; (ii) formats are often ad hoc; (iii) the data are not 'live'; (iv) data can be redundant; (v) the data model is often poor, and does not have 'behaviour'; (vi) access and querying are often difficult. In view of this, and with an eye to the ever-increasing amount and complexity of biological data, the EBI has adopted CORBA as its future medium for serving data.
CORBA is the Common Object Request Broker Architecture. It was developed by a large consortium of software companies as a standard for distributed object-oriented computing (8). The central elements of a CORBA service are the server, which implements the functionality, and the ORB (Object Request Broker), which provides the communication. Together, they serve data and execute operations (e.g. retrieving data from a database) in response to requests made by client programs. The facilities offered by a CORBA server are published in its specification, which is written in IDL (Interface Definition Language). The server implements the objects, which client programs can use in any way desired.
A fragment of the IDL specification implemented to serve RHdb is shown in Figure 2. For example, the query method getMapList( ) is an operation of the class (interface) RHMap. Its parameters are input parameters only (IDL: in), and are used to specify queries. The method's return type is a list (IDL: sequence) of Maps, defined earlier in the fragment. In addition, getMapList( ) can raise run-time exceptions of the type RHException (not shown).
A client Java applet that uses this server is shown in Figure 3. It can be found at http://sunny.ebi.ac.uk/EBI/RHdb/ProtoClient/ . Detailed information on the server is available at http://www.ebi.ac.uk/RHdb/CORBA/Proto/
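The shape of the getMapList( ) operation can be mimicked locally in Python to show how the pieces fit together. This is an analogy and not the RHdb IDL or a CORBA client: only the names RHMap, Map, getMapList and RHException come from the text, while the query parameters (`chromosome`, `panel`), the fields of `Map` and the in-memory storage are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


class RHException(Exception):
    """Run-time error raised by the service (name taken from the text)."""


@dataclass(frozen=True)
class Map:
    """Hypothetical map record; the real Map type is defined in the IDL."""

    accession: str
    chromosome: str
    panel: str


class RHMap:
    """Local stand-in for the RHMap interface, backed by a Python list."""

    def __init__(self, maps: Iterable[Map]) -> None:
        self._maps = list(maps)

    def get_map_list(
        self, chromosome: Optional[str] = None, panel: Optional[str] = None
    ) -> List[Map]:
        """Input-only query parameters in, a sequence of Maps out."""
        for name, value in (("chromosome", chromosome), ("panel", panel)):
            if value is not None and not value.strip():
                raise RHException(f"{name} must not be blank")
        return [
            m
            for m in self._maps
            if (chromosome is None or m.chromosome == chromosome)
            and (panel is None or m.panel == panel)
        ]


if __name__ == "__main__":
    # Placeholder records; accession strings and panel names are invented.
    service = RHMap(
        [
            Map("map-1", "1", "PANEL_A"),
            Map("map-2", "1", "PANEL_B"),
            Map("map-3", "X", "PANEL_A"),
        ]
    )
    try:
        for m in service.get_map_list(chromosome="1"):
            print(m.accession, m.panel)
    except RHException as err:
        print("query failed:", err)
```

In a real CORBA setting the client would call the same operation on a remote object reference obtained through the ORB, and the exception would be marshalled across the network rather than raised locally.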

FUTURE DEVELOPMENTS
CORBA is starting to deliver machine-independent, language-independent, Internet-accessible objects. Although a number of database-specific issues have to be resolved, the prospects for this medium are very good.
An exciting application is the transparent integration of heterogeneous, distributed databases. By using CORBA's class inheritance, the aspects of the data that are common to the databases can be abstracted into a superclass, which must be implemented by all databases. The database-specific details can be implemented in subclasses specific to the particular database (typically, the local database).
We are currently investigating this approach together with Infobiogen in France (10). The end goal is transparent database access to chromosome maps, be they radiation hybrid maps, genetic, cytogenetic or physical maps.
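The superclass and subclass arrangement described above might look as follows in Python. This is a sketch of the design idea only: the class names, attributes and marker names are invented, and it says nothing about how the interfaces under investigation are actually defined. The units used, centiRays (cR) for radiation hybrid maps and centiMorgans (cM) for genetic maps, are the conventional ones.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ChromosomeMap(ABC):
    """Aspects common to all map types; every database would implement this."""

    def __init__(self, chromosome: str, positions: Dict[str, float]) -> None:
        self.chromosome = chromosome
        self.positions = dict(positions)  # marker name -> position

    @abstractmethod
    def unit(self) -> str:
        """Unit in which positions are expressed."""

    def ordered_markers(self) -> List[str]:
        """Marker names sorted by position along the chromosome."""
        return sorted(self.positions, key=self.positions.get)


class RadiationHybridMap(ChromosomeMap):
    """Adds the details specific to radiation hybrid maps."""

    def __init__(self, chromosome, positions, panel, lod_scores) -> None:
        super().__init__(chromosome, positions)
        self.panel = panel
        self.lod_scores = dict(lod_scores)

    def unit(self) -> str:
        return "cR"


class GeneticMap(ChromosomeMap):
    """A genetic map needs nothing beyond the common interface here."""

    def unit(self) -> str:
        return "cM"


def describe(m: ChromosomeMap) -> str:
    """Client code that works with any map type through the superclass."""
    return f"chr{m.chromosome} [{m.unit()}]: " + " - ".join(m.ordered_markers())


if __name__ == "__main__":
    # Invented markers and positions, for illustration only.
    maps = [
        RadiationHybridMap("1", {"M2": 30.0, "M1": 10.0}, "PANEL_A", {"M1": 3.2}),
        GeneticMap("1", {"M1": 1.5, "M2": 4.0}),
    ]
    for m in maps:
        print(describe(m))
```

The client function never inspects the concrete class, which is what would allow maps from different databases to be handled transparently.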
HOW TO CONTACT RHdb AT THE EUROPEAN BIOINFORMATICS INSTITUTE
| Internet: | home page: http://www.ebi.ac.uk/RHdb |
| FTP server: | ftp://ftp.ebi.ac.uk/pub/databases/RHdb |
| e-mail: | rhdb@ebi.ac.uk (enquiries and submissions) |
| Postal address: | RHdb, EMBL Outstation - the EBI, The Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK |
| Tel: | +44 (1223) 494401 |
| Fax: | +44 (1223) 494468 |
ACKNOWLEDGEMENT
We would like to thank P. Deloukas (Sanger Centre) for his continuing support.
REFERENCES
Last modification: 17 Dec 1997
Copyright© Oxford University Press, 1998.
