Nucleic Acids Research, 2003, Vol. 31, No. 13 3510-3517
© 2003 Oxford University Press
Theatre: a software tool for detailed comparative analysis and visualization of genomic sequence
Yvonne J. K. Edwards*, Tim J. Carver, Tanya Vavouri, Martin Frith, Martin J. Bishop and Greg Elgar
Comparative Genomics Group, Research Division, MRC UK Human Genome Mapping Project Resource Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SB, UK
*To whom correspondence should be addressed. Tel: +44 1223494531; Fax: +44 1223494512; Email: yjedward{at}hgmp.mrc.ac.uk
Received November 21, 2002; Revised January 17, 2003; Accepted January 27, 2003
ABSTRACT
Theatre is a web-based computing system designed for the comparative
analysis of genomic sequences, especially with respect to motifs
likely to be involved in the regulation of gene expression.
Theatre is an interface to commonly used sequence analysis tools
and biological sequence databases to determine or predict the
positions of coding regions, repetitive sequences and transcription
factor binding sites in families of DNA sequences. The information
is displayed in a manner that can be easily understood and can
reveal patterns that might not otherwise have been noticed.
In addition to web-based output, Theatre can produce publication
quality colour hardcopies showing predicted features in aligned
genomic sequences. A case study using the p53 promoter region
of four mammalian species and two fish species is described.
Unlike the mammalian sequences, the promoter regions in fish
have not previously been predicted or characterized; we report
the differences between the p53 promoter regions of the four mammals
and those predicted for the two fish species. Theatre can be accessed
at http://www.hgmp.mrc.ac.uk/Registered/Webapp/theatre/.
INTRODUCTION
Databases comprising information on transcriptional regulation,
such as TRANSFAC (1), the Eukaryotic Promoter Database (2),
COMPEL (3) and TRRD (4), are valuable in classifying transcription
factors and their DNA binding sites. Software developed to predict
protein binding sites in sequences, such as MatInspector (5) and
Tfscan from the EMBOSS package (6), relies on databases such as
TRANSFAC for protein binding matrices and strings to search
for motifs in new sequences. It is helpful to identify putative
regulatory elements in the context of a defined structure or
a promoter region of a gene whose expression is being affected
(1–4). Comparative analysis of orthologous promoters and
other types of non-coding regions is proving to be a reliable
guide to finding conserved features to be tested experimentally
for function (7–11). Identifying conserved non-coding
sequences in genomes is an extremely useful starting point towards
studying the control of expression of orthologous and paralogous
genes. Comparative analysis of orthologous genomic systems can
help to identify motifs or other features that may affect expression,
development and differentiation shared by organisms. These are
a few of the considerations that led to the development of Theatre.
Theatre has been designed to compare and display features in equivalent genome sequences. The features considered are the coding and non-coding regions, repetitive sequences, transcription factor binding sites, intron and exon sizes and nucleotide biases. The first section of this article describes the Theatre analysis tool to compare features in genomic sequences. The second part describes an application using the p53 promoter.
BIOLOGICAL BACKGROUND RELEVANT TO THEATRE
Theatre has been designed to compare equivalent gene structures
and analyse likely genomic regions responsible for regulation
of gene expression. Theatre integrates commonly used tools to
characterize or predict the position and orientation of the
protein coding regions, repetitive DNA sequences, and transcription
factor binding sites. This section summarizes the sequence analysis
tools and databases used by Theatre.
Transcription factor binding sites and other regulatory regions
Known transcription factor binding motifs can be searched for in DNA sequences using MatInspector (5) or Tfscan [a component of EMBOSS (6)]. MatInspector version 2.0 performs matrix searches in the nucleotide sequences and can analyse 246 protein binding sites. Tfscan uses string searches against databases of compiled transcription factor binding sites. Cpgplot, also part of EMBOSS, identifies CpG islands. CpG islands are patches of non-methylated DNA coinciding with most gene promoters in the genomes of vertebrates; methylation of these motifs correlates with repression of transcription (12). CpG islands are defined as regions longer than 200 bases with a moving average of %(G+C) in excess of 50% and a moving average of observed/expected CpG dinucleotide content >0.6.
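The CpG island criteria above can be illustrated with a minimal Python sketch. It is not Cpgplot's implementation: the 100-base window, the one-base step, the way passing windows are merged and the function names are all assumptions made for illustration. Only the three thresholds (length >200 bases, %(G+C) >50% and observed/expected CpG >0.6) come from the text.

```python
def window_stats(window):
    """Return (%G+C, observed/expected CpG) for one window of DNA."""
    window = window.upper()
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    cpg = window.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_percent = 100.0 * (c + g) / n
    expected = c * g / n  # expected CpG count if C and G were independent
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_percent, obs_exp


def cpg_islands(seq, window=100, min_length=200, min_gc=50.0, min_oe=0.6):
    """Return half-open (start, end) intervals of putative CpG islands.

    A window passes when both thresholds are exceeded. Runs of consecutive
    passing windows are merged, and a merged region is reported only if it
    is longer than min_length bases. The window size is an assumption.
    """
    islands = []
    run_start = None
    last = len(seq) - window  # start of the final full window
    for i in range(last + 1):
        gc, oe = window_stats(seq[i:i + window])
        passes = gc > min_gc and oe > min_oe
        if passes and run_start is None:
            run_start = i
        elif not passes and run_start is not None:
            end = i - 1 + window  # end of the last passing window
            if end - run_start > min_length:
                islands.append((run_start, end))
            run_start = None
    if run_start is not None:
        end = last + window  # the run extends to the end of the sequence
        if end - run_start > min_length:
            islands.append((run_start, end))
    return islands
```

Calling `cpg_islands(sequence)` on a promoter sequence returns an empty list when no region satisfies all three criteria, which corresponds to the "no CpG islands predicted" outcome reported for the mammalian promoters in the case study.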
Protein coding regions
DNA sequence BLAST searches (13) against the SPTR (SWISS-PROT and TrEMBL) databases (14) and the GeneMark ORF predictions (15) can be used to locate potential protein coding regions. These assignments, when available, are useful for determining the positions of protein coding regions.
Repetitive DNA sequences
RepeatMasker performs a fast sequence search against databases of repeat sequences such as SINEs, LINEs, LTRs and other retrotransposable elements commonly found in genomic sequences (A.F.A. Smit and P. Green, unpublished material; http://ftp.genome.washington.edu/RM/RepeatMasker.html). Repetitive elements in the sequences are masked using RepeatMasker. The masked sequences are then searched against the SPTR database using BLAST. Any repeats identified are highlighted.
DESCRIPTION OF THEATRE AND THE WEB INTERFACE
There are three stages to running Theatre: creating a multiple
sequence alignment, populating hierarchical directory structures
of genomic features predicted for each sequence and generating
a graphical display that superimposes features on the sequence
alignment. The first stage (PreMSD) generates a multiple DNA
sequence alignment using ClustalW (16). The second stage consists
of a Theatre program developed in C, named MSD. The role of
MSD is to submit UNIX shell scripts to a job queue. The UNIX
shell scripts are generated from a template; each script controls
the input and output details and the parameters needed to run one
of the programs of the second stage (Table 1). MSD generates a flat
file databank of genomic features for the sequences using the
UNIX shell scripts. This stage creates a project directory containing
a named subdirectory for each program. For example, a subdirectory
named MatInspector holds the MatInspector results for the sequences
included in the analysis. Each subdirectory contains the output
from one of the programs and is searched by the programs of
the third stage. The third stage consists of a program that
formats the features and highlights the relationships among
features displayed in equivalent regions. This display program
generates the colour PostScript™ output files. Tables 2–6 and
Figures 1–4 are sample output from the Theatre display
program. The programs of the Theatre package (PreMSD, MSD and
the display program) can be run from the UNIX command line.
They are integrated into a web-based interface using PERL (17)
and the CGI.pm library (18).
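The second stage described above (a project directory, one named subdirectory per program and shell scripts generated from a template) can be sketched in Python as follows. MSD itself is written in C and its templates are not reproduced in this article, so the template text, the command names and the function `write_job_scripts` are hypothetical. The sketch only writes the scripts and does not submit them to a queue.

```python
from pathlib import Path
from string import Template

# Hypothetical script template; the real Theatre templates are not shown
# in the article.
SCRIPT_TEMPLATE = Template(
    "#!/bin/sh\n"
    "# Run $program on one input sequence\n"
    "$command $options $input > $output\n"
)


def write_job_scripts(project_dir, programs, sequence_files):
    """Create one subdirectory per program and one script per sequence.

    programs maps a program name (used as the subdirectory name) to a
    (command, options) pair. Returns the paths of the scripts written.
    """
    project = Path(project_dir)
    scripts = []
    for name, (command, options) in programs.items():
        subdir = project / name
        subdir.mkdir(parents=True, exist_ok=True)
        for seq_file in sequence_files:
            stem = Path(seq_file).stem
            script = subdir / f"{stem}.sh"
            script.write_text(
                SCRIPT_TEMPLATE.substitute(
                    program=name,
                    command=command,
                    options=options,
                    input=seq_file,
                    output=subdir / f"{stem}.out",
                )
            )
            scripts.append(script)
    return scripts
```

For example, `write_job_scripts("p53_project", {"MatInspector": ("run_matinspector", "")}, ["human.fa", "mouse.fa"])` would create `p53_project/MatInspector/human.sh` and `p53_project/MatInspector/mouse.sh`; `run_matinspector` is a placeholder, not a real command line.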
Online documentation is provided from the Theatre website. The
user interacts with Theatre through a set of HTML forms. The
forms contain fields to be completed for the user's email address,
the GeneMark matrices, a project name and the individual sequence
filenames. Input requirements are genomic sequences in a variety
of formats. Although Theatre can use ClustalW to align sequences,
it is possible to enter an alignment in MSF created from another
source (19,20). The user is sent an electronic mail message
when the analysis is complete and PostScript™ generation is
completed interactively in the last stage. Table 1 lists more
details regarding the use of Theatre.
THEATRE IMPLEMENTATION
The Theatre and MSD programs are written in ANSI C. The programs
compile and run on Solaris platforms. The programs are controlled
through UNIX command line arguments.
Six programs are used for identifying regions of interest in
genomic sequences (Table 1). The programs that make up stage
2 of Theatre have some limitations on the amount of data that
they can process, so Theatre requires that no more than 10 kb
in total is provided. Within this overall limit any number
of sequences can be input. The maximum number of sequences and
the sequence lengths that can be used by each program depend on the
available memory and known caveats set by the individual programs.
The programs ClustalW, RepeatMasker, GeneMark, BLAST, MatInspector,
Tfscan and Cpgplot are available from their respective authors.
Executables for Theatre can be made available on request but
not the dependent modules or its interface. Theatre can be used
by registered members of the Bioinformatics facilities of the
Human Genome Mapping Project Resource Centre (HGMP-RC) at the
following URL: http://www.hgmp.mrc.ac.uk/Registered/Webapp/theatre/.
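The 10 kb overall input limit mentioned above can be checked before any analysis is queued. The following Python sketch shows one way to do this for FASTA input. It assumes that 10 kb means 10 000 bases; the parser and the function names are illustrative and are not part of Theatre.

```python
MAX_TOTAL_BASES = 10_000  # assumption: the 10 kb limit read as 10 000 bases


def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {name: sequence}."""
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            chunks[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def check_total_length(records, limit=MAX_TOTAL_BASES):
    """Return the total number of bases, or raise if it exceeds the limit."""
    total = sum(len(seq) for seq in records.values())
    if total > limit:
        raise ValueError(
            f"input totals {total} bases; the limit is {limit} bases"
        )
    return total
```

Because the limit applies to the sum over all sequences, many short sequences or a few long ones are equally acceptable as long as the total stays within it.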
The Common Gateway Interface (CGI) scripts developed for the Theatre server are written in PERL (17). The server runs on the Solaris operating system. UNIX shell scripts drive the selected analysis programs (Table 1); jobs are created and submitted to an in-house batch queuing system, which distributes the processing over a server farm running the Solaris 8 operating system.
THEATRE: AN ILLUSTRATED CASE STUDY
Theatre produces two types of graphical output: a concise graphical
display (Figs 1 and 3) and a detailed display (Figs 2 and 4).
The concise display covers a single A4 page, while the detailed
display shows the individual nucleotides and may run over many
pages, depending on the sequence alignment lengths. A case study
is presented describing the promoter region of the p53 gene
in mammalian and puffer fish species. The p53 gene is a tumour
suppressor that has a fundamental role in cell cycle control
and division. The functions of transcription factors that bind
to motifs in the p53 promoter and regulate the expression of
the p53 gene have been experimentally identified and characterized
in mouse (9), human colon cancer cells (22), rat (23) and the
golden hamster (24). The p53 gene was identified in the genomic
sequence from Fugu rubripes (Fugu) (25,26) and
Tetraodon nigroviridis (Tetraodon) (27).
Figure 1 shows the Theatre concise alignment of the p53 promoters
from human, rat, mouse and the golden hamster. The regions comprise
the first exon (non-coding) and the immediate 5' flanking DNA.
Each sequence is between 426 and 700 bases long and contains an upstream
non-transcribed sequence ranging from 207 to 478 bases. The
transcription start site of the p53 gene for rat, mouse and
human (23) is at position 508 of the multiple sequence alignment
(Fig. 2A). The EMBL accession numbers provided to Theatre as
input can be found in the legend to Figure 1. The alignment
was performed using ClustalW with the default parameters. The
selected MatInspector binding sites are those that we expected
to see in the p53 promoter (9,22–24). The GeneMark predictions
of the open reading frames (ORFs) and SPTR BLAST matches were
switched off, as the region examined is known not to include
protein coding sequences. In the golden hamster promoter, a
repetitive element was identified by the RepeatMasker program
and highlighted in the graphical display using Theatre. No CpG
islands were predicted in any of the four sequences. Additionally,
there were no conserved CAAT and TATA box motifs predicted upstream
of the transcription initiation site (Figs 1 and 2), as expected
from previous studies (9,22–24). In this respect, MatInspector
performed well. The threshold for consensus display is set to
75% to establish highly conserved and invariant features. The
four sequences have highly conserved features illustrated in
the consensus (Fig. 1A) and the detailed alignment (Fig. 2A).
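The 75% consensus threshold mentioned above can be illustrated with a short Python sketch. The article does not describe how Theatre computes its consensus, so the details here are assumptions: a column is kept when its most common residue occurs in at least 75% of the sequences (three of four in this case study), gaps never enter the consensus, and unresolved columns are shown as a dot.

```python
from collections import Counter


def consensus(aligned, threshold=0.75, gap="-", unknown="."):
    """Return a consensus string for equal-length aligned sequences.

    A column contributes its most common residue when that residue is not
    a gap and occurs in at least `threshold` of the sequences; otherwise
    the column is written as `unknown`.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    out = []
    for column in zip(*aligned):
        residue, count = Counter(c.upper() for c in column).most_common(1)[0]
        if residue != gap and count / len(aligned) >= threshold:
            out.append(residue)
        else:
            out.append(unknown)
    return "".join(out)
```

With four aligned sequences, a column such as C, C, C, T reaches the threshold and is reported as C, whereas G, G, C and a gap does not and is left unresolved.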
The transcriptional regulation of the murine p53 gene has been well
studied and localized to the region surrounding the transcription
initiation site (9,22–24). The location and order of seven
invariant transcription factor binding sites are listed. Many
of these correspond to the experimentally verified downstream
sites. The sites are listed together with their frequency and
the site name in parentheses: NF1 (two, V$NF1_Q6); NFkB (two,
V$NFKAPPAB_01); Sp1 (one, V$SP1_Q6); ETF (one, MOUSE$P53_07);
c-myc/max (one, V$MYCMAX_02); USF (two, V$USF_C) and USF (two,
V$USF_02). The c-myc/max proteins belong to the basic helix–loop–helix
family of transcription factors and bind to a certain class
of E-box that shares a signature motif consisting of the
core hexanucleotide sequence CANNTG. The USF and NFkB sites,
to a large degree, are palindromes identified as pairs of binding
sites in the same location that read in the forward and reverse
orientation. The upstream ETF binding site is present and conserved
in all sequences; however, the binding sites do not align exactly,
so the site does not appear in the consensus. The human,
mouse and rat sequences have a conserved upstream PF1 binding
site (MOUSE$P53_03) not shared by the hamster sequence. The
hamster, mouse and rat sequences have a conserved downstream
P300 site (P300_01) not found in the human sequence (Fig. 1A).
Similarly, the upstream Sp1 motif is present in all except rat.
Such patterns of predicted transcription factor binding sites
are highlighted and, in this case, they correspond well with
experimentally defined protein binding sites (9,22–24).
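The E-box core CANNTG and the palindrome property described above are easy to make concrete. The Python sketch below is a plain string search, not the matrix search performed by MatInspector, and the function names are illustrative. It reports each CANNTG match and whether the matched hexamer equals its own reverse complement, which is why a palindromic site is found at the same location on both strands.

```python
import re

# Lookahead so that overlapping matches (e.g. in CACATGTG) are all reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_eboxes(seq):
    """Return (start, site, is_palindrome) for each CANNTG match (0-based)."""
    seq = seq.upper()
    hits = []
    for match in EBOX.finditer(seq):
        site = match.group(1)
        hits.append((match.start(), site, site == reverse_complement(site)))
    return hits
```

For instance, CACGTG is its own reverse complement and is flagged as a palindrome, while CAGGTG reads CACCTG on the opposite strand and is not.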
Theatre was also used to display, in sequences extracted from the Fugu and Tetraodon genomes, the binding sites found in the mammalian sequences (Fig. 3). Unlike in the mammals, the position of the p53 promoter in these fish has not previously been predicted, identified or characterized. The draft genome sequences of Fugu and Tetraodon (http://fugu.hgmp.mrc.ac.uk/ and http://www.genoscope.cns.fr/externe/tetraodon/, respectively) were searched for contigs that contained the p53 gene. Two puffer fish p53 cDNA sequences, retrieved from public databases (SPTR: Q9W679 and EMBL: BU806111), were used to predict the transcription start site of the Fugu and Tetraodon p53 gene and the position of the coding sequence (Figs 3 and 4). The p53 cDNA sequences were searched using BLASTN and TBLASTX against the publicly available Tetraodon genome assemblies. The p53 promoter regions were predicted from Fugu in Scaffold_126 (release 2: EMBL: CAAB01000126) (26) and Scaffold_18 (release 3) and from Tetraodon in FS_CONTIG_1412_2 (release 6). Regions likely to be involved in regulating expression of the p53 gene were extracted from the genomic sequence using the extractseq program from EMBOSS (6), and the predicted promoter regions were used as input for analysis using the programs in Theatre (see legend to Fig. 3). We compare patterns in binding sites predicted in the mammalian p53 promoters (Figs 1 and 2) with those in the two fish p53 promoters (Figs 3 and 4). All the genes possess a non-coding exon comprising exclusively 5' untranslated sequence; in this respect the fish gene structures considered here are similar to the mammalian ones. There is a conserved CpG island predicted in the Fugu and Tetraodon promoters but absent in the mammalian promoters. Conserved within the puffer fish p53 promoters are four ETF signals (MOUSE$P53_07) predicted by Tfscan. The MatInspector sites conserved in the promoter region include the following, with the number of binding sites in parentheses: NF1 (two), CAAT (one) and NFkB (one). The sequences from the two taxonomic classes are displayed separately, mainly because the order and frequency of occurrence of sites differ significantly between the mammalian and puffer fish p53 promoters. The puffer fish promoter region possesses a conserved CAAT box and CpG island, suggesting differences in regulation of the p53 gene compared with the mammalian species.
The puffer fish p53 promoters share some similarity with the mammalian promoters in having conserved NF1, NFkB and ETF sites; however, the conserved NF1 and NFkB sites occur in a different order. NF1 is critical for basal expression of the mouse p53 gene, whilst the NFkB recognition site has been shown to be required for trans-activation of the murine p53 promoter in the presence of NFkB (1,4,9). The USF and c-Myc/Max sites are conserved in mammals but not in the fish sequences. Both these factors bind E-box sites and are implicated in increasing transcription of p53. The differences suggest that puffer fish have a different mechanism for control of p53 gene expression compared with mammals. There is also an island of predicted and conserved motifs in intron 2. The conserved motifs are a useful guide to targeting regions of interest that can be tested for function using experimental techniques. The nucleotide composition, nucleotide sequence biases and ratios of observed to expected frequencies for the 16 dinucleotides for the six species are also given (Tables 2–6).
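The observed/expected ratios for the 16 dinucleotides can be computed in a few lines of Python. The article does not state the formula used for Tables 2–6, so this sketch assumes the common definition in which the observed dinucleotide frequency is divided by the product of the two mononucleotide frequencies; the function name is illustrative.

```python
from itertools import product

BASES = "ACGT"


def dinucleotide_ratios(seq):
    """Return {dinucleotide: observed/expected} for all 16 dinucleotides.

    observed = frequency of the dinucleotide among overlapping pairs;
    expected = product of the two single-base frequencies (an assumption,
    since the article does not give its formula).
    """
    seq = seq.upper()
    n = sum(seq.count(base) for base in BASES)
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Ignore pairs containing ambiguity codes or gaps.
    pairs = [p for p in pairs if p[0] in BASES and p[1] in BASES]
    if n == 0 or not pairs:
        raise ValueError("sequence contains no usable dinucleotides")
    freq = {base: seq.count(base) / n for base in BASES}
    ratios = {}
    for x, y in product(BASES, repeat=2):
        observed = pairs.count(x + y) / len(pairs)
        expected = freq[x] * freq[y]
        ratios[x + y] = observed / expected if expected > 0 else float("nan")
    return ratios
```

A ratio near 1 indicates that a dinucleotide occurs about as often as its base composition predicts; values well below 1 for CG are the familiar CpG depletion seen in much vertebrate DNA outside CpG islands.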
DISCUSSION
The preparation of quality figures to highlight annotation in
aligned sequences can prove time-consuming and can involve using
graphical drawing packages and searching through the output
of analysis programs (28). Programs offering related functions
such as comparative genomic sequence analysis and visualization
include CINEMA (29), Alfresco (30), VISTA (31), PipMaker (32)
and SynPlot (33). CINEMA (Colour INteractive Editor for Multiple
Alignment) is a Java-based tool for manipulating and generating
aligned nucleotides and amino acids (29). Alfresco is a visualization
tool developed in Java to allow comparative genome sequence
analysis between two sequences (30). VISTA is software for visualizing
global DNA sequence alignments of arbitrary length (31). PipMaker
compares two long DNA sequences to identify conserved segments
(32). SynPlot (33) is a tool similar in philosophy to PipMaker
and VISTA that automates the graphical display of large-scale
sequence alignments. These tools are widely used by the scientific
community. They hold much promise for comparative genomics and
in the search for conserved non-coding sequences to test for
putative enhancer and/or promoter activity (7–11,33).
These tools are, however, best suited to cases where there is a large
degree of conservation of synteny between similar vertebrate
genomes (e.g. mouse and human). The gene order (26,33–35)
and the features in the promoter regions (36,37) are not always
well conserved in the genomes of species from highly diverged
taxonomic classes. This is notable between the Fugu and human
genomes, where short sections of two to three adjacent genes
are conserved in both species. If the gene orders differ in
equivalent regions, the usefulness of these tools for comparing
large genomic regions between evolutionarily divergent species
will be reduced.
Theatre is intended to study genomic regions on a small to medium scale (i.e. 500 bp–10 kb) in detail and is well-suited for studying transcription factor binding sites in equivalent promoters, introns and other gene structures as illustrated in the case study. Using default parameters, Theatre is ideally suited for two or more genomic sequences that are fairly similar and approximately the same length. Theatre offers the user the choice of supplying an externally generated sequence alignment as an alternative to using ClustalW. The main aim of Theatre is to allow the user to look at the results of different programs and varying selections of features for display so that the results can be taken in at a glance. Further developments of this tool are being planned.
AVAILABILITY
Theatre is accessible for use by registered members of the Bioinformatics
facilities of the UK HGMP-RC at
http://www.hgmp.mrc.ac.uk/Registered/Webapp/theatre/.
Registration is free to the academic community. Information
regarding the registration process is available at the HGMP-RC's
website at the following URL: http://www.hgmp.mrc.ac.uk/. For
more details contact support{at}hgmp.mrc.ac.uk or
yjedward{at}hgmp.mrc.ac.uk.
ACKNOWLEDGEMENTS
We are grateful to our colleagues and collaborators (past and
present) for helpful considerations and the authors of EMBOSS
(CpGplot and Tfscan), ClustalW, GeneMark, RepeatMasker, BLAST
and MatInspector for their software.
REFERENCES
1. Wingender,E., Chen,X., Fricke,E., Geffers,R., Hehl,R., Liebich,I., Krull,M., Matys,V., Michael,H., Ohnhäuser,R. et al. (2001) The TRANSFAC system on gene expression regulation. Nucleic Acids Res., 29, 281–283.
2. Praz,V., Perier,R., Bonnard,C. and Bucher,P. (2002) The Eukaryotic Promoter Database, EPD: new entry types and links to gene expression data. Nucleic Acids Res., 30, 322–324.
3. Kel-Margoulis,O.V., Kel,A.E., Reuter,I., Deineko,I.V. and Wingender,E. (2002) TRANSCompel: a database on composite regulatory elements in eukaryotic genes. Nucleic Acids Res., 30, 332–334.
4. Kolchanov,N.A., Ignatieva,E.V., Ananko,E.A., Podkolodnaya,O.A., Stepanenko,I.L., Merkulova,T.I., Pozdnyakov,M.A., Podkolodny,N.L., Naumochkin,A.N. and Romashchenko,A.G. (2002) Transcription Regulatory Regions Database (TRRD): its status in 2002. Nucleic Acids Res., 30, 312–317.
5. Quandt,K., Frech,K., Karas,H., Wingender,E. and Werner,T. (1995) MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide-sequence data. Nucleic Acids Res., 23, 4878–4884.
6. Rice,P., Longden,I. and Bleasby,A.J. (2000) EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet., 16, 276–277.
7. Wentworth,J.M., Schoenfeld,V., Meek,S., Elgar,G., Brenner,S. and Chatterjee,V.K. (1999) Isolation and characterisation of the retinoic acid receptor-alpha gene in the Japanese pufferfish, F.rubripes. Gene, 236, 315–323.
8. Hardison,R.C. (2000) Conserved noncoding sequences are reliable guides to regulatory elements. Trends Genet., 16, 369–372.
9. Reisman,D., Eaton,E., McMillin,D., Doudican,N.A. and Boggs,K. (2001) Cloning and characterization of murine p53 upstream sequences reveals additional positive transcriptional regulatory elements. Gene, 274, 129–137.
10. Tompa,M. (2001) Identifying functional elements by comparative DNA sequence analysis. Genome Res., 11, 1143–1144.
11. Cliften,P.F., Hillier,L.W., Fulton,L., Graves,T., Miner,T., Gish,W.R., Waterston,R.H. and Johnston,M. (2001) Surveying Saccharomyces genomes to identify functional elements by comparative DNA sequence analysis. Genome Res., 11, 1175–1186.
12. Bird,A., Tate,P., Xinsheng,N., Campoy,J., Meehan,R., Cross,S., Tweedie,S., Charlton,J. and Macleod,D. (1995) Studies of DNA methylation in animals. J. Cell Sci., Suppl., 19, 37–39.
13. Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST, a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.
14. Bairoch,A. and Apweiler,R. (2000) The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 2000. Nucleic Acids Res., 28, 45–48.
15. Borodovsky,M. and McIninch,J. (1993) GENEMARK: parallel gene recognition for both DNA strands. Comput. Chem., 17, 123–133.
16. Higgins,D.G., Thompson,J.D. and Gibson,T.J. (1996) Using Clustalw for multiple sequence alignments. Meth. Enzymol., 266, 383–402.
17. Wall,L., Christiansen,T. and Schwartz,R. (1996) Programming Perl, 2nd Edn. O'Reilly and Associates, Inc., Sebastopol, CA.
18. Stein,L. (1998) Official Guide to Programming with CGI.pm. Wiley & Sons, Inc., New York, NY.
19. Brudno,M., Kim,M.F., Do,C. and Batzoglou,S. (2002) The LAGAN Server, http://lagan.stanford.edu.
20. Brudno,M. and Morgenstern,B. (2002) Fast and sensitive alignment of large genomic sequences. Proceedings of the IEEE Computer Society Bioinformatics Conference (CSB). IEEE Computer Society Press Inc., Los Alamitos, CA.
21. Stoesser,G., Baker,W., van den Broek,A., Camon,E., Garcia-Pastor,M., Kanz,C., Kulikova,T., Leinonen,R., Lin,Q., Lombard,V. et al. (2002) The EMBL Nucleotide Sequence Database. Nucleic Acids Res., 30, 21–26.
22. Benoit,V., Hellin,A.C., Huygen,S., Gielen,J., Bours,V. and Merville,M.P. (2000) Additive effect between NF-kappaB subunits and p53 protein for transcriptional activation of human p53 promoter. Oncogene, 19, 4787–4794.
23. Bienz-Tadmor,B., Zakut-Houri,R., Libresco,S., Givol,D. and Oren,M. (1985) The 5' region of the p53 gene: evolutionary conservation and evidence for a negative regulatory element. EMBO J., 4, 3209–3213.
24. Albor,A., Laborda,J. and Notario,V. (1994) Cloning of the Syrian hamster p53 gene: structural and functional characterization of the upstream promoter region. Mol. Carcin., 11, 176–183.
25. Elgar,G., Clark,M.S., Meek,S., Smith,S., Warner,S., Edwards,Y.J.K., Bouchireb,N., Cottage,A., Yeo,G.S., Umrania,Y. et al. (1999) Generation and analysis of 25 Mb of genomic DNA from the puffer fish Fugu rubripes by sequence scanning. Genome Res., 9, 960–971.
26. Aparicio,S., Chapman,J., Stupka,E., Putnam,N., Chia,J.M., Dehal,P., Christoffels,A., Rash,S., Hoon,S. et al. (2002) Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science, 297, 1301–1310.
27. Crollius,H.R., Jaillon,O., Dasilva,C., Ozouf-Costaz,C., Fizames,C., Fischer,C., Bouneau,L., Billault,A., Quetier,F., Saurin,W. et al. (2000) Characterization and repeat analysis of the compact genome of the freshwater puffer fish Tetraodon nigroviridis. Genome Res., 10, 939–949.
28. Barton,G.J. (1993) ALSCRIPT: a tool to format multiple sequence alignments. Protein Eng., 6, 37–40.
29. Parry-Smith,D.J., Payne,A.W.R., Michie,A.D. and Attwood,T.K. (1998) Cinema: a novel colour interactive editor for multiple alignments. Gene, 221, GC57–GC63.
30. Jareborg,N. and Durbin,R. (2000) Alfresco: a workbench for comparative genomic sequence analysis. Genome Res., 10, 1148–1157.
31. Mayor,C., Brudno,M., Schwartz,J.R., Poliakov,A., Rubin,E.M., Frazer,K.A., Pachter,L.S. and Dubchak,I. (2000) VISTA: visualizing global DNA sequence alignments of arbitrary length. Bioinformatics, 16, 1046–1047.
32. Schwartz,S., Zhang,Z., Frazer,K.A., Smit,A., Riemer,C., Bouck,J., Gibbs,R., Hardison,R. and Miller,W. (2000) PipMaker: a web server for aligning two genomic DNA sequences. Genome Res., 10, 577–586.
33. Gottgens,B., Gilbert,J.G., Barton,L.M., Grafham,D., Rogers,J., Bentley,D.R. and Green,A.R. (2001) Long-range comparison of human and mouse SCL loci: localized regions of sensitivity to restriction endonucleases correspond precisely with peaks of conserved noncoding sequences. Genome Res., 11, 87–97.
34. Sambrook,J.G., Russell,R., Umrania,Y., Edwards,Y.J.K., Campbell,R.D., Elgar,G. and Clark,M.S. (2002) Fugu orthologues of human major histocompatibility complex genes: a genome survey. Immunogenetics, 54, 367–380.
35. Smith,S.F., Snell,P., Gruetzner,F., Bench,A.J., Haaf,T., Metcalfe,J.A., Green,A.R. and Elgar,G. (2002) Analyses of the extent of shared synteny and conserved gene orders between the genome of Fugu rubripes and human 20q. Genome Res., 12, 776–784.
36. Miles,C., Elgar,G., Coles,E., Kleinjan,D.J., van Heyningen,V. and Hastie,N. (1998) Complete sequencing of the Fugu WAGR region from WT1 to PAX6: dramatic compaction and conservation of synteny with human chromosome 11p13. Proc. Natl Acad. Sci. USA, 95, 13068–13072.
37. Davidson,H., Taylor,M.S., Doherty,A., Boyd,A.C. and Porteous,D.J. (2000) Genomic sequence analysis of Fugu rubripes CFTR and flanking genes in a 60 kb region conserving synteny with 800 kb of human chromosome 7. Genome Res., 10, 1194–1203.
