ABSTRACT
We have created databases and software applications for the analysis of DNA
mutations in the human p53 gene, the human hprt gene and the rodent transgenic
lacZ locus. The databases themselves are stand-alone dBase files and the software for analysis of the databases runs on
IBM-compatible computers. The software created for these databases permits
filtering, ordering, report generation and display of information in the
database. In addition, a significant number of routines have been developed for
the analysis of single base substitutions. One method of obtaining the
databases and software is via the World Wide Web (WWW). Open home page
http://sunsite.unc.edu/dnam/mainpage.html with a WWW browser. Alternatively,
the databases and programs are available via public ftp from
anonymous@sunsite.unc.edu. There is no password required to enter the system.
The databases and software are found in subdirectory pub/academic/biology/dna-mutations. Two other programs are available at the WWW site, a program for
comparison of mutational spectra and a program for entry of mutational data
into a relational database.
We have created databases and software applications for the analysis of DNA
mutations at several loci. This very brief manuscript describes databases and
software for analysis of mutations in the human p53 gene, the human hprt gene
and the transgenic lacZ locus.
Mutations in one of the loci, the p53 gene, are found with high frequency in a
wide variety of human cancers. It is estimated that perhaps 50% of all human
cancers contain a mutation in the p53 tumor suppressor gene (1,2).
A gene of considerable interest to genetic toxicologists is the
hypoxanthine-guanine phosphoribosyltransferase (hprt) gene, which codes for an
enzyme that functions in the purine salvage pathway. Cells bearing a mutation
in the hprt gene can be selected and cloned from tissue culture experiments and
from T cells isolated from rodents (3), primates (4) and humans (5,6). Thus
somatic mutations arising in vivo in humans can be studied.
The development of transgenic rodents for the study of mutation is relatively
new. These systems typically employ a transgenic λ phage shuttle vector and use
the lacI (7) or lacZ (8) genes as mutational targets. These systems permit the
analysis of mutations generated in vivo in a variety of tissues.
In order to facilitate the analysis of mutations we have developed databases
containing DNA sequence information about the hprt, p53 and lacZ genes and a
software package that performs statistical analysis of the information in each
database.
Each database is in the dBase format and is present as a stand-alone file. Information common to all databases includes: (i) base
position; (ii) the nature of the mutation; (iii) amino acid position; (iv) wild-type and mutant amino acid sequence; (v) the local sequence around a
mutation; (vi) literature citations.
Information specific to the p53 database includes: (i) cancer type; (ii) cell
origin (tumor, cell line, etc.); (iii) loss of heterozygosity. Data particular
to the hprt database includes: (i) mutagen; (ii) dose; (iii) background and
induced mutation frequencies; (iv) whether the mutant was generated in vivo or
in vitro; (v) mRNA splicing information for mutants affecting splicing; (vi)
cell type. Data contained in the lacZ transgenic database includes: (i) dose;
(ii) time from last treatment to animal sacrifice; (iii) supplier, species,
strain, sex and age of animal; (iv) the organs selected for mutation analysis;
(v) the mutant fraction in each organ; (vi) total p.f.u. analyzed; (vii) plaque
color.
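As an illustration only, the fields common to all three databases, together with the filtering and ordering that the software offers, might be modeled as follows. This is a minimal sketch: the class name, field names and helper function are hypothetical and are not taken from the dBase files or from the distributed programs.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class MutationRecord:
    """Fields common to all databases; every name here is an assumption."""
    base_position: int                    # (i) base position
    mutation: str                         # (ii) nature of the mutation
    amino_acid_position: Optional[int]    # (iii) amino acid position
    wild_type_amino_acid: Optional[str]   # (iv) wild-type amino acid
    mutant_amino_acid: Optional[str]      # (iv) mutant amino acid
    local_sequence: str                   # (v) local sequence around the mutation
    citations: List[str] = field(default_factory=list)  # (vi) literature citations


def filter_and_order(
    records: Iterable[MutationRecord],
    keep: Callable[[MutationRecord], bool],
    order_by: Callable[[MutationRecord], object],
) -> List[MutationRecord]:
    """Return the records accepted by `keep`, sorted by `order_by`."""
    return sorted((r for r in records if keep(r)), key=order_by)
```

A call such as `filter_and_order(records, lambda r: r.amino_acid_position is not None, lambda r: r.base_position)` would then list coding-region entries in order of base position.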
The hprt and p53 databases and software have been described previously (9,10).
A manuscript regarding the lacZ database and software has been submitted to
Environmental and Molecular Mutagenesis.
A separate software package exists for each database. The software runs on
IBM-compatible PCs only. The lacZ and hprt programs run under Microsoft
Windows, while the p53 program is an MS-DOS application. All software packages
permit filtering, ordering, report generation and display of information in the
database.
A significant number of routines have been developed for the analysis of single
base substitutions, including programs to: (i) determine if two mutational
spectra are different; (ii) display mutable amino acids in the protein; (iii)
determine if mutations show a DNA strand bias; (iv) determine the frequency of
transitions and transversions; (v) display the number and kind of mutations
observed at each base in the coding region; (vi) perform nearest neighbor
analysis. For genes with exons a routine will display the number of mutations
and mutable sites in each exon. Graphics displays are available for mutated
amino acids and for mutational spectra representation.
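To make one of these routines concrete, the frequency of transitions and transversions among single base substitutions can be computed as sketched below. This is not the code shipped with the databases; the function names are hypothetical, and the sketch relies only on the standard definitions (a transition stays within purines or within pyrimidines, a transversion crosses between them).

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(wild_type: str, mutant: str) -> str:
    """Label a single base substitution as a transition or a transversion."""
    wt, mt = wild_type.upper(), mutant.upper()
    valid = PURINES | PYRIMIDINES
    if wt not in valid or mt not in valid or wt == mt:
        raise ValueError(f"not a single base substitution: {wild_type!r} -> {mutant!r}")
    # Same chemical class on both sides means a transition.
    same_class = (wt in PURINES) == (mt in PURINES)
    return "transition" if same_class else "transversion"


def substitution_frequencies(
    substitutions: Iterable[Tuple[str, str]],
) -> Dict[str, float]:
    """Return the fraction of transitions and transversions in a spectrum."""
    counts = Counter(classify_substitution(wt, mt) for wt, mt in substitutions)
    total = sum(counts.values())
    if total == 0:
        return {"transition": 0.0, "transversion": 0.0}
    return {kind: counts[kind] / total for kind in ("transition", "transversion")}
```

Each substitution is passed as a (wild-type base, mutant base) pair, so a spectrum read from any of the databases could be summarized with a single call to `substitution_frequencies`.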
The databases and software for the p53 and lacZ genes are freely available. The
hprt database and software are available on a subscription basis; however, a
version of the database and software is available for evaluation.
One method of obtaining the databases and software is via the World Wide Web
(WWW). Open home page http://sunsite.unc.edu/dnam/mainpage.html with a Web browser.
Alternatively, the databases and programs are available via public ftp from
anonymous@sunsite.unc.edu. There is no password required to enter the system.
The databases and software are found in the subdirectory
pub/academic/biology/dna-mutations.
The ftp server is very popular and users may not be able to connect to the
system during peak hours.
Information about all databases and instructions for downloading are present
when using either WWW or ftp access. All files must be transferred as binary
files.
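For readers who prefer to script the transfer, an anonymous ftp download in binary mode could look like the sketch below, which uses Python's standard ftplib module. The host and directory are those given above; the function names are hypothetical, and any file name passed in is a placeholder, since no file names are listed here. Whether the server still answers at this address is not guaranteed.

```python
from ftplib import FTP

HOST = "sunsite.unc.edu"
DIRECTORY = "pub/academic/biology/dna-mutations"


def remote_path(filename: str, directory: str = DIRECTORY) -> str:
    """Join the ftp directory and a file name into one server path."""
    return f"{directory.rstrip('/')}/{filename}"


def download_binary(filename: str, local_path: str,
                    host: str = HOST, directory: str = DIRECTORY) -> None:
    """Fetch one file over anonymous ftp, transferring it as binary."""
    with FTP(host) as ftp:
        ftp.login()  # no arguments: anonymous login, no password needed
        with open(local_path, "wb") as out:
            # retrbinary switches to binary (image) mode before the transfer.
            ftp.retrbinary(f"RETR {remote_path(filename, directory)}", out.write)
```

Opening the local file in "wb" mode and using `retrbinary` rather than `retrlines` is what keeps the dBase files and executables byte-for-byte intact.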
Two other programs are available at the site: a program for comparison of
mutational spectra (11) and a program for entry of mutational data into a
relational database (12). The mutational spectra program is a stand-alone DOS
executable and the relational database program requires Microsoft Access 2.0 to
run and modify the program.
The present article is an extension of the work presented in the previous
Nucleic Acids Research database issue (13,14).
REFERENCES