ABSTRACT
We have found a novel transposon in the genome of Caenorhabditis elegans. Tc7 is a 921 bp element, made up of two 345 bp inverted repeats separated by a unique, internal sequence. Tc7 does not contain an open reading frame. The outer 38 bp of the inverted repeat show 36 matches with the outer 38 bp of Tc1. This region of Tc1 contains the Tc1-transposase binding site. Furthermore, Tc7 is flanked by TA dinucleotides, just like Tc1, which presumably correspond to the target duplication generated upon integration. Since Tc7 does not encode its own transposase but contains the Tc1-transposase binding site at its extremities, we tested the ability of Tc7 to jump upon forced expression of Tc1 transposase in somatic cells. Under these conditions Tc7 jumps at a frequency similar to Tc1. The target site choice of Tc7 is identical to that of Tc1. These data suggest that Tc7 shares with Tc1 all the sequences minimally required to parasitize upon the Tc1 transposition machinery. The genomic distribution of Tc7 shows a striking clustering on the X chromosome where two thirds of the elements (20 out of 33) are located. Related transposons in C.elegans do not show this asymmetric distribution.
Of the DNA or class II transposons (1), which transpose by excision and reintegration into the genome without an RNA intermediate, the Tc1/mariner family is the most widespread (2-4). Members of this family have been found in fungi, insects, nematodes and vertebrates, and recently Tc1/mariner-related sequences have even been identified in the human genome (5-8). Genome sequencing projects reveal new groups of repetitive sequences, many of which display the features of transposable elements of different classes (9-13). However, most of these cases seem to correspond to non-autonomous, deleted transposons. Thus the mobility of the elements identified through the genome sequencing projects remains hypothetical.
In Caenorhabditis elegans, six groups of DNA transposons have been identified to date (14), Tc1 and Tc3 being the best characterized. Tc1 and Tc3 are members of the Tc1/mariner family (2-4). Each Tc1/mariner transposon encodes a transposase which shares 35% identity with the transposases encoded by the other members of the family. The elements are delimited by inverted repeat sequences which are unrelated except for the last four nucleotides (5'-CAGT), which are conserved within the entire Tc1/mariner family. To mediate the transposition reaction, the transposase recognizes and binds to the terminal 30 bp of its cognate element (15). Thus, Tc1 and Tc3 transposases do not activate each other's transposons (15-18). Tc1 and Tc3 insert into TA dinucleotides, which are duplicated upon integration (19). However, within a given genomic region, only a subset of TA dinucleotides are chosen as insertion sites, and the insertion patterns differ between Tc1 and Tc3 (20).
In the wild-type C.elegans strain Bristol N2, Tc1 transposition is detectable only in the somatic cells, whereas Tc3 transposition is undetectable in both soma and germline (21). However, in the strain Bergerac BO, as well as in other mutator strains, Tc1 transposition also occurs in the germline (21-25). Some mutations, like mut-2 (r759) and mut-7 (pk204), also activate germline transposition of Tc3, Tc4 and Tc5 (26,27) (R.F. Ketting and R.H.A.P., unpublished observations). Recent in vivo and in vitro studies have shown that both Tc1 and Tc3 jump via a cut-and-paste process (19,28). After their excision, a double-strand break is left at the donor site, which is sealed by the host DNA repair machinery. This repair process often leaves characteristic footprints at the donor site (29). Interestingly, no transposition-proficient deletion derivative of Tc1 or Tc3 has been identified until now, whereas deleted versions of the P element in Drosophila or Ac in maize, which presumably also transpose via a similar cut-and-paste mechanism, are widespread within the genomes of their hosts (30,31).
Recently, a new repetitive element consisting of two large inverted repeat sequences separated by a short unique sequence has been identified (13). The termini of the inverted repeats show strong similarity to the ends of Tc1 (36 out of 38 nt), suggesting that transposition of this novel element can be mediated by the Tc1 transposase. In this paper, we characterize this new repetitive element. We have tested its ability to transpose, its target site choice and its polymorphic distribution among different C.elegans strains. Since we find that it is fully mobile in the germline, it rightly deserves to be categorized as a transposon; we call it Tc7. The data suggest that Tc7 shares with Tc1 all the sequence requirements to make use of the Tc1 transposition machinery.
Analysis of the Tc7 distribution in the genome of different C.elegans strains was performed by Southern hybridization as described in Sambrook et al. (32). Genomic DNAs extracted from Bristol N2 and RW7097 strains were digested with Sau96I and run on an agarose gel. The samples were blotted onto a nitrocellulose filter. An internal fragment of Tc7 was PCR-amplified from Bristol N2 genomic DNA using primers 8RR14 (5'-atgtagctcgtgatcaggcc-3') and 8RR15 (5'-gtgtagagtaatcttgagc-3'). The PCR product was cut out of an agarose gel. The agarose plug was placed in a perforated tube containing a glasswool filter, which was in turn placed in a second tube to elute and recover the DNA by centrifugation. The eluate was phenol extracted twice, chloroform extracted once, precipitated and used to make a radiolabelled probe by random primed labelling (33).
The stable transgenic Bristol N2 line NL818 (28), harbouring the Tc1-transposase gene under the control of a heat-shock promoter (pRP465) (15), was heat shocked for 2 or 4 h at 33°C. Genomic DNA was isolated as previously described (20) after recovery at 18°C for 12 h. Somatic transposition of Tc7 in a 1 kb region of the gpa-2 gene was scored by nested PCRs (20) using primers specific for Tc7 (8RR12, 5'-gccgctttatcacttgccatg-3' and 8RR13, 5'-acataggcctgatcacgagc-3') and primers specific for the gpa-2 target (AB3550 and AB5623) (20). The PCR products were analyzed on 1% agarose gels using the 1 kb DNA ladder (Gibco BRL) as a size marker. The PCR products were sequenced using the ABI PRISM™ Dye Terminator Cycle Sequencing kit (Perkin Elmer) following the manufacturer's instructions. Sequencing products were run and analyzed on an ABI automatic sequencer.
Similarity searches through the C.elegans genome database (ACeDB, release of October 10, 1996) (34) and through GenBank/EMBL databases were performed using BLAST (35). Further sequence editing and analysis were performed using the GCG package (University of Wisconsin, Madison).
Using the 54 bp inverted repeat sequence of Tc1 for a similarity search of the C.elegans genome database we found, in addition to several Tc1 elements, 10 hits which define a new class of repetitive elements (Table 1, Tc7-1 to Tc7-10). Two of these hits (Tc7-1 and Tc7-2 in Table 1) correspond to Tc1-related sequences recently reported by Oosumi et al. (13). These elements are 921-923 bp sequences, made up of two 345-347 bp inverted repeats and a conserved middle section that lacks a large ORF (Fig. 1A). Of the terminal 38 bp of their inverted repeats, 36 bp were identical to the ends of Tc1, and like the transposons of the Tc1/mariner family they were flanked by TA dinucleotides (Fig. 1B). These TA dinucleotides could be the result of a sequence duplication during integration, as has been shown for Tc1 and Tc3 (20,28). Taken together, these sequence features suggest that this element corresponds to a new transposon, which we call Tc7. No other sequence similarities were found between Tc7 and Tc1, nor between Tc7 and any other known repetitive element.
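The two comparisons described above (matches between aligned termini, and agreement between the two ends of an inverted-repeat element) can be illustrated with a minimal Python sketch. The function names and the toy sequences are hypothetical and chosen only for illustration; they are not the Tc1 or Tc7 sequences, and the sketch assumes an ungapped comparison of equal-length termini.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def count_matches(a: str, b: str) -> int:
    """Count identical positions between two equal-length, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a.upper(), b.upper()))


def terminal_repeat_matches(element: str, n: int) -> int:
    """Compare the first n bases with the reverse complement of the last n bases."""
    left = element[:n]
    right = reverse_complement(element[-n:])
    return count_matches(left, right)


# Toy data (hypothetical, not real transposon sequence).
toy_end = "CAGTGCTGGC"
toy_element = toy_end + "AAATTTCCC" + reverse_complement(toy_end)

perfect = terminal_repeat_matches(toy_element, 10)   # both ends agree
one_off = count_matches(toy_end, "CAGTGATGGC")        # one mismatch in 10
```

Applied to real data, `count_matches` would be called on the outer 38 bp of the two elements, where the text reports 36 identical positions.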
These 10 Tc7 elements were identical, with the exception of a 1 bp addition within the middle region of four sequences and a separate single base pair addition in one of the inverted repeats of two sequences (Fig. 1A). Furthermore, two elements showed small deletions, of 6 and 44 bp, at different locations. The copy number of Tc7 cannot be estimated by extrapolation from the number of elements found in the database, which contains ~50% of the entire genome, because the Tc7 elements are not equally spread over the genome (see below).
Similarity searches through the GenBank/EMBL databases using the Tc7 sequence as a probe did not reveal related sequences in other species. However, 23 additional sequences were hit in the C.elegans genome database. These sequences define a more heterogeneous group of elements. Seven of them were delimited by the canonical 5'-CAGT nucleotides common to the Tc1/mariner family, and they were flanked by TA dinucleotides (Tc7-d1 to Tc7-d7, Table 1). These 808-931 bp long sequences shared 62-83% sequence identity with Tc7. They also had 0.35 kb inverted repeats at their extremities, but in contrast to Tc7 their terminal sequences showed more sequence divergence from the ends of Tc1. Finally, the last 16 hits shared sequence similarity with only one Tc7 end and were presumed to correspond to degenerate, incomplete Tc7 derivatives (Tc7-d8 to Tc7-d23, Table 1). However, these Tc7 derivatives had the canonical 5'-CAGT sequence next to a TA dinucleotide at their extremity. The 23 Tc7-related elements do not represent a distinct sub-group of Tc7 elements; rather, they seem to have diverged independently over time.
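The boundary criteria used above (canonical 5'-CAGT termini and flanking TA dinucleotides) can be expressed as a simple check. The following Python sketch is illustrative only: the function name, the coordinates and the toy locus are hypothetical, and it assumes that the right-hand end of a complete element reads ACTG on the top strand, i.e. the reverse complement of 5'-CAGT.

```python
def has_tc1_like_boundaries(locus: str, start: int, end: int) -> bool:
    """Check a candidate element locus[start:end] for CAGT termini and TA flanks.

    Assumption: a complete element begins with CAGT and ends with ACTG
    (the reverse complement of CAGT) on the strand given in `locus`.
    """
    s = locus.upper()
    element = s[start:end]
    left_flank = s[max(0, start - 2):start]
    right_flank = s[end:end + 2]
    return (
        element.startswith("CAGT")
        and element.endswith("ACTG")
        and left_flank == "TA"
        and right_flank == "TA"
    )


# Toy loci (hypothetical sequence, not taken from the C.elegans genome).
toy_locus = "GGTA" + "CAGT" + "GATTACAGATC" + "ACTG" + "TACC"
toy_locus_no_flank = "GGCC" + "CAGT" + "GATTACAGATC" + "ACTG" + "TACC"

complete = has_tc1_like_boundaries(toy_locus, 4, 23)
missing_flank = has_tc1_like_boundaries(toy_locus_no_flank, 4, 23)
```

A one-ended derivative such as Tc7-d8 to Tc7-d23 would fail this two-ended check and would need a variant that tests a single terminus and its adjacent TA.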
Twenty of the 33 Tc7 and Tc7-related elements (61%) were located on the X chromosome, whereas only 13 elements were found on the autosomes (Fig. 2). Only 25% of the C.elegans sequence in the database used for the searches is derived from the X chromosome. This means that Tc7 is not equally spread over the entire genome; the elements are clustered on the X chromosome.
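To give a sense of how far the observed count departs from a uniform distribution, the numbers above can be put into a simple binomial calculation. This is a sketch only, not an analysis reported here: it assumes that each of the 33 elements falls independently on the X chromosome with probability 0.25 (the fraction of database sequence derived from X), which ignores local clustering or duplication of elements.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Probability of observing k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_elements = 33      # Tc7 and Tc7-related elements found in the database
n_on_x = 20          # elements located on the X chromosome
x_fraction = 0.25    # fraction of database sequence derived from X

expected_on_x = n_elements * x_fraction          # 8.25 elements expected
observed_fraction = n_on_x / n_elements          # about 0.61
p_tail = binomial_upper_tail(n_on_x, n_elements, x_fraction)

print(f"expected on X: {expected_on_x:.2f}, observed: {n_on_x}")
print(f"P(X >= {n_on_x}) under the uniform model: {p_tail:.2e}")
```

Under these assumptions about 8 elements would be expected on the X chromosome, against 20 observed, and the upper-tail probability is very small.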
In mutator lines Tc1 transposition is activated in the germline, and new heritable insertions (or excisions) can be detected in the genome of originally isogenic strains. We analyzed the Tc7 content of eight mut-6 lines (25) derived from the same parental strain (RW7097) (25), which were cultivated in parallel for three months. DNA from these lines was digested with Sau96I, for which the Tc7 sequence does not contain a recognition site. The DNA was analyzed on a Southern blot using a Tc7 probe (Fig. 3). Interestingly, the genomes of the mut-6 lines and Bristol N2 contained a similar number of Tc7 copies, whereas the number of Tc1 elements is about 10 times higher in this mutator strain than in Bristol N2 (H.G.A.M.V.L. and R.H.A.P., unpublished observations). Between the lines maintained in parallel, most of the Tc7 bands were conserved. However, in several lines a few bands had been lost or gained, showing that Tc7 is mobile in the germline. Similar experiments using originally isogenic mut-7 lines (R.F. Ketting and R.H.A.P., unpublished observations) showed that Tc7, like Tc1, is also active in the germline of this mutator strain (data not shown). Based on these results we conclude that Tc7 is a transposable element.
Tc7 is an active transposon in the germline of several mutator strains; however, we have not yet identified a Tc7 element with a gene coding for transposase. Since the binding site of the Tc1 transposase within the Tc1 inverted repeats (17) is fully contained within the 38 bp sequences shared with the Tc7 extremities, we tested whether Tc7 could transpose when Tc1 transposase was expressed. Transgenic lines which harbour the Tc1 transposase gene under the control of a heat-shock promoter were generated. Transposition in the somatic cells was scored by PCR, using transposon-specific primers and primers specific for a genomic target [the gpa-2 gene (36)]. After heat-shock, several Tc7 insertions were detected in gpa-2, whereas without heat-shock, Tc7 transposition was much less frequent (35 versus 3 insertions; Fig. 4). Under the same conditions Tc1 transposition was also detectable. A comparison of the number of gpa-2 insertions scored for Tc1 versus Tc7 using the same transgenic line and identical DNA inputs in the PCR showed that Tc7 is only four times less active than Tc1 (data not shown).
Tc1 always inserts into TA dinucleotides, but uses only a small subset of all TA dinucleotides within the genome. Among those dinucleotides, a few are very often chosen and define 'hotspots' for Tc1 insertion (20). Since Tc7 can use Tc1 transposase for its mobility, we determined the distribution of Tc7 insertion sites within gpa-2 and compared it to the distribution of Tc1 insertion sites. Forty-two Tc7 insertion sites were sequenced (Fig. 5). As expected, Tc7 always inserted into TA dinucleotides. Furthermore, the distribution of the Tc7 insertions along the gpa-2 sequence closely fits the Tc1 insertion pattern: most of the sequenced Tc7 insertions reside at TA dinucleotides that are also hit by Tc1, and the hottest insertion sites for Tc7 are the same as for Tc1.
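The comparison of insertion patterns can be pictured as a tally of insertions per TA dinucleotide. The Python sketch below is a toy illustration under stated assumptions: the target sequence and the insertion coordinates are invented for the example and are not the gpa-2 data of Fig. 5, and the function names are hypothetical.

```python
from collections import Counter


def ta_sites(target: str) -> list[int]:
    """Return the 0-based positions of every TA dinucleotide in the target."""
    s = target.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "TA"]


def hottest_site(insertions: list[int]) -> int:
    """Return the position hit by the largest number of insertions."""
    return Counter(insertions).most_common(1)[0][0]


# Toy target and insertion positions (hypothetical, for illustration only).
toy_target = "GATATACCTAGG"
sites = ta_sites(toy_target)                 # candidate TA positions
toy_tc1_hits = [2, 2, 2, 8, 4]
toy_tc7_hits = [2, 2, 8]

# Every insertion should map to a TA dinucleotide.
all_at_ta = all(pos in sites for pos in toy_tc1_hits + toy_tc7_hits)

# Fraction of Tc7 insertions at sites that are also used by Tc1.
shared = sum(pos in set(toy_tc1_hits) for pos in toy_tc7_hits) / len(toy_tc7_hits)

same_hotspot = hottest_site(toy_tc1_hits) == hottest_site(toy_tc7_hits)
```

With real data, `sites` would list the TA dinucleotides of the 1 kb gpa-2 region and the two hit lists would hold the sequenced Tc1 and Tc7 insertion positions.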
Searches in the C.elegans genome database have revealed putative transposons or transposon fossils, including heterogeneous groups of elements presumably derived from the transposable elements Tc2 and Tc5 (5,13). The element analyzed here, which we call Tc7, and for which two variants have been described recently by Oosumi et al. (13), has all the hallmarks of a transposable element. Tc7 contains 345-347 bp inverted repeats, of which the terminal 38 bp are nearly identical to the ends of Tc1, and is flanked by TA dinucleotides. Furthermore, we have shown that Tc7 is mobile in the germline of independent mutator lines. However, Tc7 does not contain an ORF and is, therefore, presumably unable to make its own transposase. The 38 bp sequence shared with Tc1 encompasses the Tc1-transposase binding site, and we have found that Tc1 transposase can promote Tc7 transposition.
We thank P.Borst, R.Ketting, G.Verlaan and S.Wicks for critical reading of the manuscript. This work was supported by a BIOTECH fellowship (B102CT94-8167) from the Commission of the European Communities to R.R. and a grant (5 R01 RR10082-02) from the NIH-NCRR to R.H.A.P. R.D. is supported by the Wellcome Trust.
REFERENCES
+Present address: Catholic University of Louvain, Unit of Genetics, 5(bte3) Place Croix du Sud, 1348 Louvain-La-Neuve, Belgium


