ABSTRACT
We report that
Haemophilus influenzae
encodes a 268 amino acid ATP-dependent DNA ligase. The specificity of
Haemophilus
DNA ligase was investigated using recombinant protein produced in
Escherichia coli
. The enzyme catalyzed efficient strand joining on a singly nicked DNA in the
presence of magnesium and ATP (Km = 0.2 µM). Other nucleoside triphosphates or deoxynucleoside triphosphates could not
substitute for ATP.
Haemophilus
ligase reacted with ATP in the absence of DNA substrate to form a covalent ligase-adenylate intermediate. This nucleotidyl transferase reaction required a
divalent cation and was specific for ATP. The
Haemophilus
enzyme is the first example of an ATP-dependent DNA ligase encoded by a eubacterial genome. It is also the smallest member of the covalent nucleotidyl
transferase superfamily, which includes the bacteriophage and eukaryotic ATP-dependent polynucleotide ligases and the GTP-dependent RNA capping enzymes.
DNA ligases catalyze the joining of 5' phosphate-terminated donor strands to 3'-hydroxyl-terminated acceptor strands via three sequential
nucleotidyl transfer reactions (
1
,
2
). In the first step, nucleophilic attack by the ligase on ATP or NAD results in
formation of a covalent intermediate in which AMP is linked to the ε-amino group of a lysine. The nucleotide is then transferred to the 5'-end of the donor polynucleotide to form DNA-adenylate, an inverted (5')-(5') pyrophosphate bridge structure,
AppN. Attack by the 3'-OH of the acceptor strand on the DNA-adenylate joins the two polynucleotides and liberates AMP.
The NAD-dependent ligases are found exclusively in eubacteria. Genes encoding NAD-dependent ligases have been identified in
Escherichia coli
(GenBank accession no. M30255),
Haemophilus influenzae
(U32789),
Thermus thermophilus
(M74792),
Zymomonas mobilis
(Z11910),
Rhodothermus marinus
(U10483),
Mycoplasma genitalium
(U39703) and
Synechocystis
(D90899). These proteins are of similar size (659-731 amino acids) and display extensive amino acid sequence conservation.
The ATP-dependent DNA ligases are found ubiquitously in eukaryotes and archaea (
2
,
3
). Animal cells contain multiple ATP-dependent DNA ligase isozymes encoded by at least three separate genes (
4
-
6
). ATP-dependent DNA ligases are also encoded by eukaryotic DNA viruses, e.g. the
poxviruses (
7
), African swine fever virus (
8
) and
Chlorella
virus PBCV-1 (
9
). The only ATP-dependent DNA ligases described in eubacteria are those encoded by the T-odd and T-even bacteriophages (T4, T6, T3 and T7).
The ATP-dependent DNA ligases belong to a superfamily of covalent nucleotidyl
transferases that includes the GTP-dependent eukaryotic mRNA capping enzymes (
10
). The ligase/capping enzyme superfamily is defined by a set of six short motifs
(Fig.
1
). The lysine within motif I (KxDGxR) is the active site of AMP transfer by the
eukaryotic and bacteriophage ligases (
11
-
13
) and GMP transfer by the capping enzymes (
14
-
17
). Conserved residues within motifs I, III, IV and V are critical for covalent
nucleotidyl transfer, as shown by mutational analyses (
18
-
21
). The recently reported crystal structure of T7 DNA ligase shows that the ATP
binding site is made up of conserved motifs I, III, IIIa, IV and V (
13
).
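The motif I consensus quoted above (KxDGxR, where x is any residue) can be searched for in a protein sequence with a simple pattern match. The following is a minimal sketch, not the procedure used in this work: the function name `find_motif_i` and the toy sequence are hypothetical and chosen only for illustration.

```python
import re

# Motif I consensus from the text: K-x-D-G-x-R, where x is any amino acid.
MOTIF_I = re.compile(r"K.DG.R")


def find_motif_i(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position of the motif lysine, matched hexapeptide)
    for every non-overlapping occurrence of KxDGxR in the sequence."""
    return [(m.start() + 1, m.group()) for m in MOTIF_I.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real ligase sequence.
    toy_sequence = "MAAAKYDGVRLL"
    for position, hexapeptide in find_motif_i(toy_sequence):
        print(f"Candidate active-site lysine at residue {position}: {hexapeptide}")
```

A match only flags a candidate active-site lysine; assignment to the superfamily rests on the full set of six motifs and on biochemical evidence.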
Oligonucleotide primers complementary to the 5'- and 3'-ends of the putative
H.influenzae
ligase gene were used to amplify the 268 amino acid open reading frame. Plasmid
p700 DNA (
22
; a generous gift of Dr Andrew Preston, University of Iowa) was used as the
template for a polymerase chain reaction (PCR) catalyzed by Pfu DNA polymerase
(Stratagene). The sequence of the 5' flanking primer was 5'-GGGCCC
A 500 ml culture of
E.coli
BL21/pET-Hin-ligase was grown at 37°C in Luria-Bertani medium containing 0.1 mg/ml ampicillin until the
A600
reached 0.8. The culture was infected with bacteriophage λCE6 as described (
23
) and incubation was continued at 37°C for 4 h. Cells were harvested by centrifugation and the pellet was stored
at -80°C. All subsequent procedures were performed at 4°C. Thawed bacteria were resuspended in 50 ml lysis buffer [50
mM Tris-HCl, pH 7.5, 1 mM dithiothreitol (DTT), 10 mM EDTA, 10% sucrose]
containing 0.15 M NaCl. Lysozyme and Triton X-100 were added to final concentrations of 60 µg/ml and 0.1% respectively and the sample was sonicated for 30 s.
Insoluble material was removed by centrifugation for 30 min at 18 000 r.p.m. in
a Sorvall SS34 rotor. SDS-PAGE analysis indicated that the expressed
Haemophilus
ligase protein was recovered in the pellet, not in the soluble lysate. The
insoluble material was resuspended with a Dounce homogenizer in 50 ml lysis
buffer. The centrifugation step was repeated and the pelleted material was
resuspended in 10 ml lysis buffer containing 0.5 M NaCl. This suspension was
centrifuged again. The recombinant
Haemophilus
ligase was partially soluble in 0.5 M NaCl. The 0.5 M NaCl pellet, which still
contained substantial
Haemophilus
ligase, was resuspended in 10 ml lysis buffer containing 1 M NaCl. Additional
Haemophilus
ligase activity was recovered in the 1 M NaCl supernatant fraction after
centrifugation. The 1 M NaCl pellet was then extracted with 10 ml lysis buffer
containing 2 M NaCl. The 2 M NaCl pellet was resuspended in 5 ml lysis buffer.
The 0.5, 1 and 2 M NaCl supernatants were combined (30 ml, containing 4 mg
protein) and dialyzed against 0.1 M NaCl in buffer A (50 mM Tris-HCl, pH 8.0, 1 mM EDTA, 2.5 mM DTT, 10% glycerol, 0.1% Triton X-100). This material was applied to a 1.5 ml column of
phosphocellulose that had been equilibrated with 0.1 M NaCl in buffer A. The
column was washed with the same buffer and then eluted step-wise with buffer A containing 0.2, 0.25, 0.5 and 1.0 M NaCl. The activity
of the column fractions was monitored by label transfer from [α-32P]ATP to protein.
Haemophilus
ligase adenylyltransferase activity was recovered in the 0.5 M fraction (0.6 mg
protein). An aliquot of the 0.5 M phosphocellulose fraction was applied to a
4.8 ml 15-30% glycerol gradient containing 50 mM Tris-HCl, pH 8.0, 2 mM DTT, 0.5 M NaCl and 0.1% Triton X-100. The gradient was centrifuged at 50 000 r.p.m. for 24 h
at 4°C in a Beckman SW50 rotor. Fractions were collected from the bottom of the
tube. Marker proteins (bovine serum albumin and cytochrome c) were sedimented
in a parallel gradient. Enzyme fractions were stored at -80°C and thawed on ice just prior to use. Protein concentrations were
determined using the BioRad dye binding assay, with bovine serum albumin as the
standard.
Reaction mixtures (10 µl) containing 60 mM Tris-HCl, pH 8.0, 0.5 mM DTT, 10 mM MgCl2, 0.16 µM [α-32P]ATP and enzyme were incubated for 10 min at 22°C, then halted by adding SDS to 1% final concentration. The samples were
electrophoresed through a 12% polyacrylamide gel containing 0.1% SDS. Label
transfer to the 31 kDa
Haemophilus
ligase polypeptide was visualized by autoradiographic exposure of the dried gel
and was quantitated by scanning the gel with a Fujix BAS1000 Bio-Imaging Analyzer.
The standard substrate used in ligase assays was a 36 bp DNA duplex containing a
centrally placed nick (
24
). This DNA was formed by annealing two 18mer oligonucleotides to a
complementary 36mer strand. The 18mer constituting the donor strand was 5'-32P-labeled and gel purified as described (
24
). The labeled donor was annealed to the complementary 36mer in the presence of
a 3'-OH-terminated acceptor strand in 0.2 M NaCl by heating at 70°C for 10 min, followed by slow cooling to room
temperature. The molar ratio of the 18mer donor to 36mer complement to 18mer
acceptor strands in the hybridization mixture was 1:4:4.
Reaction mixtures (20 µl) containing 50 mM Tris-HCl, pH 8.0, 5 mM DTT, 10 mM MgCl2, 1 mM ATP, 0.2 pmol 5'-32P-labeled DNA substrate and enzyme were incubated at 22°C. Reactions were initiated by addition of enzyme and halted by the
addition of 1 µl 0.5 M EDTA and 5 µl formamide. The samples were heated at 95°C for 5 min and then electrophoresed through a 15% polyacrylamide
gel containing 7 M urea in TBE (90 mM Tris-borate, 2 mM EDTA). The labeled 36mer ligation product was well resolved
from the 5'-labeled 18mer donor strand. The extent of ligation [36mer/(18mer +
36mer)] was determined by scanning the gel using a Fujix BAS1000
phosphorimager.
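The extent of ligation defined above, 36mer/(18mer + 36mer), is a simple ratio of band signals. As a minimal sketch of that arithmetic (the function name and the band intensities below are hypothetical, in arbitrary phosphorimager units, and are not data from this study):

```python
def fraction_ligated(signal_18mer: float, signal_36mer: float) -> float:
    """Extent of ligation = 36mer / (18mer + 36mer).

    Both arguments are background-subtracted band signals for the labeled
    18mer donor strand and the labeled 36mer product in the same gel lane.
    """
    total = signal_18mer + signal_36mer
    if total <= 0:
        raise ValueError("total signal must be positive")
    return signal_36mer / total


if __name__ == "__main__":
    # Hypothetical band intensities, for illustration only.
    extent = fraction_ligated(signal_18mer=230.0, signal_36mer=770.0)
    print(f"Extent of ligation: {extent:.0%}")
```

Because the label resides on the donor strand, the ratio reports the fraction of donor converted to product independently of how much sample was loaded in the lane.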
The
H.influenzae
open reading frame URF1 encoding a ligase-like protein was amplified by PCR and cloned into a T7 RNA polymerase-based bacterial expression vector. The pET-Hin-ligase plasmid was introduced into
E.coli
BL21. Expression of the target gene was induced by infection with bacteriophage
λCE6. This phage contains the gene encoding T7 RNA polymerase. A prominent 31 kDa polypeptide was detectable by SDS-PAGE in whole cell extracts of infected bacteria (not shown).
This polypeptide was not present when bacteria containing the pET vector alone
were infected with λCE6. After centrifugal separation of the crude lysate, the 31 kDa
protein was recovered in the insoluble pellet fraction. This protein was
partially solubilized by sequential extraction of the pellet fraction with
buffers containing 0.5, 1 and 2 M NaCl.
The initial step in DNA ligation involves formation of a covalent enzyme-adenylate intermediate, EpA. The formation of EpA by ATP-dependent DNA ligases can be detected with high sensitivity and
specificity by label transfer from [α-32P]ATP to the enzyme. In order to assay adenylyltransferase activity of the
expressed
H.influenzae
protein, we incubated the soluble lysate and the salt-extracted material from the insoluble fraction of λCE6-infected BL21/pET-Hin-ligase cells in the presence of [α-32P]ATP and a divalent cation. The salt extracts formed an SDS-stable nucleotide-protein adduct that migrated as a 31 kDa species during SDS-PAGE (Fig.
3
, Hin Ligase). No labeling of this protein was detected when the enzyme fraction
was incubated with [α-32P]GTP (not shown). Labeling of the 31 kDa polypeptide was not detected using
salt extracts prepared from λCE6-infected bacteria that lacked the
Haemophilus
ligase plasmid (Fig.
3
, Control).
The adenylyltransferase remained soluble after dialysis of the NaCl extract
against buffer containing 0.1 M NaCl. The enzyme was applied to
phosphocellulose and was recovered by step elution with 0.5 M NaCl. When the
phosphocellulose fraction was centrifuged through a 15-30% glycerol gradient in 0.5 M NaCl, a single peak of adenylyltransferase
activity was detected (Fig.
4
). We estimated a sedimentation coefficient of 3 S relative to marker proteins
sedimented in a parallel gradient. This suggested that the
H.influenzae
adenylyltransferase is a monomer of the 31 kDa polypeptide.
We assayed the ability of the recombinant
H.influenzae
protein to seal a 36mer synthetic duplex DNA substrate containing a single nick
(
24
). Ligase activity was evinced by conversion of the 5'-32P-labeled 18mer donor strand into an internally labeled 36mer product (Fig.
5
). The DNA ligase activity profile across the glycerol gradient paralleled that
of enzyme-adenylate formation (Figs
4
and
5
). These results demonstrate that the
H.influenzae
protein is indeed a DNA ligase. Further characterization of the ligase was
performed using the glycerol gradient preparation (peak fraction 17).
The extent of ligation of the nicked duplex during a 10 min incubation at 22°C in the presence of 1 mM ATP increased linearly with added enzyme (Fig.
6
). The reaction saturated with 77% of the labeled donor strand converted to
36mer in 10 min. This upper limit of ligation probably reflected incomplete
annealing of all three component strands to form the nicked substrate. Ligation
by 0.25 µl enzyme in the presence of 1 mM ATP and 10 mM MgCl2 was linear with time up to 10 min (not shown). Ligation depended on a divalent
cation in excess of the input 1 mM ATP; activity was enhanced as Mg was
increased from 2 to 20 mM (Fig.
7
A). The divalent cation requirement was satisfied by 10 mM Mn, but not by 10 mM
Co, Ca, Cu or Zn (Fig.
7
B).
The rate of ligation was dependent on the concentration of ATP included in the
reaction (Fig.
8
A). A Km of 0.2 µM ATP was calculated from a double-reciprocal plot of the data. Other rNTPs at 100 µM concentration could not substitute for ATP. dATP was also
incapable of supporting strand joining (Fig.
8
B).
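A double-reciprocal (Lineweaver-Burk) analysis fits a straight line to 1/v versus 1/[ATP]: the intercept equals 1/Vmax and the slope equals Km/Vmax. The sketch below illustrates the calculation with an ordinary least-squares fit. The function name and the rate values are assumptions: the rates are synthetic, generated from the Michaelis-Menten equation with Km = 0.2 µM, and are not the measurements behind Figure 8A.

```python
def lineweaver_burk(substrate_conc, rates):
    """Estimate (Km, Vmax) from a least-squares line through the
    double-reciprocal plot: 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax."""
    x = [1.0 / s for s in substrate_conc]
    y = [1.0 / v for v in rates]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


if __name__ == "__main__":
    # Synthetic, noise-free illustration (concentrations in µM, rates in
    # arbitrary units), generated with Km = 0.2 µM and Vmax = 1.0.
    atp_uM = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
    rates = [1.0 * s / (0.2 + s) for s in atp_uM]
    km, vmax = lineweaver_burk(atp_uM, rates)
    print(f"Km = {km:.3f} µM, Vmax = {vmax:.3f}")
```

With real measurements the reciprocal transform gives disproportionate weight to points at low substrate concentration, so a direct nonlinear fit to the Michaelis-Menten equation is a common cross-check.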
The structure of the ligation substrate was altered such that the 3'-hydroxyl-terminated acceptor strand was separated from the 5'-phosphate donor terminus by a 1 nt gap (
24
). The strand joining activity of
H.influenzae
ligase on a 1 nt gap substrate was 4% of its activity on a nicked duplex DNA
(Fig.
6
). The implication is that the 3'-OH must be positioned fairly precisely relative to the 5'-phosphate donor terminus for ligation to occur.
The
H.influenzae
ligase reacted specifically with [α-32P]ATP. The amount of enzyme-AMP complex formed during a 10 min incubation at 22°C in the presence of 0.16 µM [α-32P]ATP and 10 mM MgCl2 was proportional to the amount of added enzyme with 0.063-2 µl of the glycerol gradient preparation (data not shown). The extent
of EpA formation by 0.25 µl enzyme did not vary as a function of [α-32P]ATP concentration from 25 to 2000 nM (data not shown). The concentration of
available adenylation sites in the glycerol gradient enzyme preparation was 64
nM. This value is a minimal estimate of the concentration of active ligase
molecules, as it does not take into account any ligase molecules that are
already in the AMP-bound state. Using this value, we calculated from the data in Figure
6
that 10 fmol DNA substrate were ligated per fmol input ligase (as EpA units).
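The unit conversion behind this kind of estimate is straightforward: 1 nM corresponds to 1 fmol/µl, so a preparation with 64 nM adenylation sites contributes 64 fmol of titratable ligase per µl added. The sketch below shows the arithmetic only. The function names, the enzyme volume and the fraction ligated in the example are hypothetical and are not read from Figure 6; the 200 fmol substrate corresponds to the 0.2 pmol used per standard reaction.

```python
def ligase_fmol(site_conc_nM: float, volume_ul: float) -> float:
    """Amount of titratable ligase (as EpA units) in the added volume.

    1 nM = 1e-9 mol/l = 1e-15 mol/µl = 1 fmol/µl, so fmol = nM * µl.
    """
    return site_conc_nM * volume_ul


def ligations_per_enzyme(substrate_fmol: float,
                         fraction_ligated: float,
                         enzyme_fmol: float) -> float:
    """fmol of nicks sealed per fmol of input ligase."""
    return substrate_fmol * fraction_ligated / enzyme_fmol


if __name__ == "__main__":
    # 64 nM adenylation sites is the value reported in the text; the volume
    # and the fraction ligated are hypothetical, for illustration only.
    enzyme = ligase_fmol(site_conc_nM=64.0, volume_ul=0.1)
    ratio = ligations_per_enzyme(substrate_fmol=200.0,
                                 fraction_ligated=0.32,
                                 enzyme_fmol=enzyme)
    print(f"{enzyme:.1f} fmol ligase, {ratio:.1f} fmol DNA ligated per fmol ligase")
```

The ratio is meaningful only for points in the enzyme-dependent (linear) range of the titration, where substrate is not limiting.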
EpA formation depended on a divalent cation cofactor. This requirement was
satisfied by either 5 mM magnesium or 5 mM manganese and to a lesser extent by
5 mM cobalt (Fig.
9
). Manganese was a more effective cofactor than magnesium at 0.5-1 mM. Zinc supported EpA formation in a narrow concentration range from 1
to 2 mM, but activity decreased sharply at 5-10 mM. Calcium was a poor effector at all concentrations tested between 1
and 10 mM; the yield of EpA at 10 mM calcium was ~10% of the optimal value formed with magnesium or manganese. Copper was
inactive at all concentrations examined (Fig.
9
).
A
H.influenzae
gene encoding a putative ATP-dependent DNA ligase was identified on the basis of sequence similarity to
members of the ligase/capping enzyme superfamily. We show that the 268 amino
acid gene product is an ATP-dependent DNA ligase. This was achieved by expressing the
H.influenzae
protein in bacteria and characterizing its enzymatic properties.
Haemophilus influenzae
ligase, like other polynucleotide ligases, catalyzes strand joining via an
enzyme-AMP intermediate. The
H.influenzae
enzyme displays strict specificity for ATP as the nucleotide cofactor. dATP is
inactive.
Haemophilus
ligase resembles the T4, vaccinia virus,
Chlorella
virus and eukaryotic cellular enzymes in its discrimination of the NTP sugar
moiety (
9
,
21
,
24
,
25
). The observed Km of
Haemophilus
DNA ligase for ATP (0.2 µM) is comparable with values reported for mammalian DNA ligase I (0.5-1 µM) (
2
). The reported Km values of
Chlorella
virus ligase (75 µM), vaccinia virus ligase (95 µM) and mammalian ligase II (40 µM) are significantly higher (
9
,
24
,
26
). The high efficiency of
H.influenzae
DNA ligase in strand joining across a nick in duplex DNA contrasts sharply with
the low efficiency of ligation across a 1 nt gap. Vaccinia ligase,
Chlorella
virus ligase and yeast CDC9 ligase display similar properties (
9
,
24
,
27
).
The
H.influenzae
ligase is the smallest DNA ligase described to date. Insofar as
H.influenzae
ligase is also smaller than any known mRNA capping enzyme (
18
,
28
), it may constitute the catalytic core of the nucleotidyl transferase
superfamily.
Haemophilus influenzae
ligase includes the six conserved motifs that define the family (
10
), but contains no additional amino acids at the C-terminus downstream of motif VI. It contains only 40 amino acids N-terminal of the presumptive active site, Lys41. Note that the active
site of the second smallest ligase, the 298 amino acid
Chlorella
virus enzyme, is at residue Lys27. The compaction of the
H.influenzae
ligase relative to the
Chlorella
virus protein is achieved by shortening the spacing between motif V and motif
VI. This intervening segment is 62 amino acids in
H.influenzae
ligase versus 92 residues for
Chlorella
virus ligase. The biochemical function of conserved motif VI is not clear at
present. It is surmised from the crystal structure of the T7 DNA ligase that
motif VI is not a component of the ATP binding site (
13
). Conceivably, motif VI may be involved in binding of ligase to the nucleic
acid substrate.
Haemophilus influenzae
ligase is the first example of an ATP-dependent DNA ligase encoded by a eubacterial genome. The
H.influenzae
gene encoding the ATP-dependent ligase resides within a 9.4 kbp
Eco
RI genomic restriction fragment that was cloned and sequenced by Preston
et al
. (
22
). Eleven open reading frames are contained within this DNA fragment. Ten of
these are homologous to previously identified bacterial genes. The one reading
frame that had no apparent eubacterial homolog is the one that we have shown
encodes the DNA ligase. This gene was previously named URF1 (unknown reading
frame 1) (
22
). Preston and Moxon (unpublished results) have documented the presence and
integrity of the full-length URF1 open reading frame in four different strains of
H.influenzae
: these include the pathogenic b serotype strains RM7004 and Eagan, the non-pathogenic Rd strain and the KW20b strain (a multiply passaged derivative
of Rd that was transformed to capsule type b with DNA from strain Eagan). We
propose that the
H.influenzae
gene encoding the ATP-dependent ligase should henceforth be named
ligA
and further suggest that the
H.influenzae
gene encoding a homolog of
E.coli
NAD-dependent DNA ligase be named
ligN
.
Previous efforts to disrupt the
H.influenzae
ligA
gene were unsuccessful, i.e. no viable bacteria containing a
ligA
deletion could be recovered (
22
). This cannot be attributed to a locus-specific failure to engage in genetic recombination, because Preston
et al
. (
22
) were able to disrupt the gene immediately flanking
ligA
. A reasonable inference from these observations is that the
ligA
gene product is essential for growth of
H.influenzae
. A role for the ATP-dependent DNA ligase during DNA replication, repair and/or recombination
is plausible; delineating these roles will require the isolation of temperature-sensitive
ligA
mutants.
The occurrence of a eukaryotic-type ligase in a eubacterial genome raises interesting evolutionary
questions.
Haemophilus influenzae
is an obligate parasite that colonizes human respiratory mucosa and grows in
culture only on rich medium. Its 1.8 Mb genome encodes many fewer genes than
does the 4.7 Mb
E.coli
genome (
29
). The fastidious growth requirements stem from the lack of key enzymes involved
in the metabolism of carbohydrates, pyrimidines, amino acids and coenzymes (
30
). Thus,
H.influenzae
depends on the human host to provide various nutrients. The parasitic
relationship may also entail rare occurrences of horizontal gene transfer from
the eukaryotic host to the bacterium. This was suggested by the finding that
two
H.influenzae
proteins are homologous to eukaryotic amino acid transporters, but unrelated to
any bacterial transporters (
30
). This raises the possibility that the
ligA
gene may have been acquired by
Haemophilus
from a eukaryote or a eukaryotic virus.
*To whom correspondence should be addressed. Tel: +1 212 639 7145; Fax: +1 212
717 3623; Email: s-shuman@ski.mskcc.org







REFERENCES
