ABSTRACT
In order to gain insights into the relationship between spatial organization of the genome and genome function we have initiated studies of the co-linear Sh2/A1-homologous regions of rice (30 kb) and sorghum (50 kb). We have identified the locations of matrix attachment regions (MARs) in these homologous chromosome segments, which could serve as anchors for individual structural units or loops. Despite the fact that the nucleotide sequences serving as MARs were not detectably conserved, the general organizational patterns of MARs relative to the neighboring genes were preserved. All identified genes were placed in individual loops that were of comparable size for homologous genes. Hence, gene composition, gene orientation, gene order and the placement of genes into structural units have been evolutionarily conserved in this region. Our analysis demonstrated that the occurrence of various 'MAR motifs' is not indicative of MAR location. However, most of the MARs discovered in the two genomic regions were found to co-localize with miniature inverted repeat transposable elements (MITEs), suggesting that MITEs preferentially insert near MARs and/or that they can serve as MARs.
The higher level structural organization of the genome is believed to be important both for compaction of chromosomes in the nucleus and for regulating genome functions. According to the chromatin domain model (1-3) the genome is folded into structural domains (loops), the bases of which are attached to a proteinaceous nuclear skeleton (matrix). Such loops are believed to provide an additional 1000-fold compaction of the genome, necessary for its accommodation into the interphase nucleus. The DNA sequences (matrix attachment regions, or MARs) anchoring loops of heterogeneous size to this matrix are considered to be important structural elements of the genome. Their ability to affect gene expression has been shown (3-6). In plants MARs have been reported to play a role in reducing both position effects and homology-dependent gene silencing (7-12). Besides being structural elements, therefore, MARs are believed to bear functional information as well.
For obvious reasons researchers have focused mainly on the gene-containing fraction of the genome. Hence, most of the existing information on the structural organization of the genome describes individual genes and their immediate surroundings. Information on domain organization and chromosome folding at a supragenic level in animal systems is limited to only a few studies: a 320 kb continuum of the Drosophila rosy-Ace locus (13), an 800 kb region of Drosophila chromosome 1 (14,15), the 240 kb amplicon of the Chinese hamster dihydrofolate reductase gene (16) and 200 kb around the mouse heavy chain IgH locus (17). The first study in plants devoted to the higher order structural organization of a large chromosomal continuum was focused on a 290 kb region around maize Adh1 (18). The location of MARs along this chromosomal segment defined several loops of various sizes and a strong, although not perfect, correlation between MAR locations and the junctions of repetitive and low copy number DNA blocks. The distribution of the different classes of DNA within this continuum (19) with respect to the structural loops revealed that the long stretches of mixed classes of highly repetitive DNAs are segregated into topologically sequestered units (18). It was interesting, therefore, to study the possible loop folding of grass genomic regions devoid of highly repetitive DNA blocks in their intergenic space, as is the case for the Sh2/A1-homologous regions of sorghum and rice (20-22).
Earlier we showed that the Adh1 gene was positioned in an individual loop (18). However, lack of information regarding the presence and location of other genes in the region did not allow us at that time to pursue a possible correlation between the structural organization of genes and their function. We have addressed this question here by examining the putative higher level structural organization of two large genomic continuums of known gene composition. This is the first attempt to compare the possible spatial organization of homologous genomic regions in two different species.
A characteristic feature of eukaryotic genomes is the enormous variation in genome size, which bears little relation to differences in organism complexity or to the number of genes that code for proteins (23). Much of this variation is due to differences in the amount of repetitive DNA. Recombinational mapping of different grass genomes has indicated extensive conservation of both gene content and gene map order (24,25), despite great variation in genome size and chromosome number (26). Recent studies have shown that interspecies gene content and order have often been preserved also at the 200-500 kb level (20-22,27-29).
With this background we have asked three questions. What will be the potential of large chromosomal regions of known sequence composition to fold into individual structural units? Given the micro-colinearity of grass genomes, will the folding of colinear regions into structural units follow a similar pattern? Will there be DNA sequence motifs that are common and/or conserved in plant MARs?
About 50 and 30 kb of sorghum and rice DNA respectively, containing A1/Sh2-homologous regions, were screened for the location and distribution of MARs as anchors for the bases of putative loops (domains). Several MAR-containing fragments (four in rice and seven in sorghum) were isolated and their sequences compared with each other and with reported characteristics of animal MARs. The results indicated significant preservation of structural organization but no detected conservation of sequences or motifs responsible for folding of this region.
Subclones of rice and sorghum bacterial artificial chromosomes (BACs) containing Sh2 and A1 homologs have been described previously (20,21; Fig. 1). Restriction enzymes and T4 polynucleotide kinase were from New England Biolabs. Calf intestine alkaline phosphatase (Pharmacia) was used for dephosphorylation.
Leaves from 3-week-old rice (variety Teqing) and sorghum (cultivar BT×623) seedlings were used for isolation of nuclei. Excised leaves were immediately frozen in liquid nitrogen and finely ground in a mortar. Nuclear isolation was according to the protocol established previously (30). Aliquots containing 3-5 A260 units of nuclei were dispersed in 50 mM Tris, 0.1 M NaCl, 5 mM MgCl2 buffer, pH 7.2, in 70% glycerol and stored for several months at -80°C. Nuclear matrices were prepared by the high salt extraction procedure, as described (30).
Each of the subclones was digested with a combination of restriction enzymes as shown in the legend to Figure 2. The fragments resulting from restriction were dephosphorylated and end-labeled with [γ-32P]ATP (Amersham) in One-Phor-All Buffer PLUS (Pharmacia). The in vitro binding method was used to screen the rice and sorghum clones for the presence of MARs, essentially as described previously (30). In pilot experiments the amounts of nuclear matrices and competitor DNA were established with the purpose of carrying out the binding assays under reproducible conditions and to eliminate weak and background associations. In a typical assay 100 µg/ml extensively sheared Escherichia coli DNA were mixed with nuclear matrix aliquots, corresponding to 0.5 A260 units of sorghum nuclei and 0.35 A260 units of rice nuclei per binding sample. After a 10 min incubation the labeled restriction fragments were added and the binding reaction was usually carried out for several hours or overnight at room temperature with shaking. The separation of matrix-bound (B, bound) from unbound DNA was accomplished by centrifugation for 2 min in an Eppendorf centrifuge. The sedimented fraction was treated with proteinase K in TE buffer containing 1% SDS for 3-4 h at 37°C and loaded in 1% agarose gels, next to a sample of total (T) input DNA. The amount of input DNA loaded on the gel was 50% of that used in the binding reaction, except the input DNA shown with rice clone 1, which represents 25% of the labeled probe used in the assay. It was necessary to load a lower amount of the input DNA in the gel in order to achieve a better resolution of the closely fractionating restriction fragments. Identification of bound fragments was facilitated by the presence of detailed restriction maps of the two regions (21).
A total of 30 035 bp (GenBank accession no. U70541) and 42 447 bp (GenBank accession no. AF010283) of the colinear Sh2/A1-homologous regions of rice and sorghum respectively have been sequenced (20,22). After location of the MARs, therefore, we were able to immediately analyze their sequence composition. The GCG sequence analysis software package, Version 8.0 (Genetics Computer Group Inc., University Research Park, Madison, WI), was used. The distributions of A/T and various 'boxes' were estimated by the Window program with default setup of window size at 100 and shift increment at 3.
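The sliding-window estimate of A/T content described above can be illustrated with a minimal sketch. It is not the GCG Window program itself; the function name `at_profile` and its return format are assumptions made for illustration, and only the window size (100) and shift increment (3) are taken from the text.

```python
def at_profile(seq, window=100, shift=3):
    """Return a list of (window_start, A/T fraction) along a DNA sequence.

    window and shift default to the values quoted in the text (100 and 3).
    Positions are 0-based; windows that would run past the end are skipped.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, shift):
        chunk = seq[start:start + window]
        # Count A and T residues in this window.
        at_count = chunk.count("A") + chunk.count("T")
        profile.append((start, at_count / window))
    return profile
```

Plotting the second element of each pair against the first would give an A/T profile of the kind that can be compared with the mapped MAR positions.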
Screening of a BAC library containing rice DNA with a maize Sh2 probe has led to the isolation of a clone containing a 130 kb insert (21). Detailed molecular analysis of this region indicated that a homolog of the maize A1 gene was present downstream of the Sh2 homolog. The order of the two genes, as well as the direction of their transcription, was the same as in maize (31). A major difference, however, was that the two genes were 19 kb apart in rice and 140 kb apart in maize. Subsequently, ~30 kb of the region covering the two genes in rice was completely sequenced and a third gene, gene X (encoding a putative transcription factor), was discovered between the Sh2 and A1 homologs (20,21; Fig. 1A). Numerous elements with structures corresponding to miniature inverted repeat transposable elements (MITEs) (32), a simple sequence repeat and a direct tandem duplication of 1432 bp were also identified (20).
Four overlapping clones, covering 30 kb of the rice region encompassing the Sh2 and A1 homologs, were individually screened for the presence and location of MARs. The fragments shown in the right hand lanes represent DNA preferentially retained by the matrix and are shown as boxes with a star (Fig. 1A).
Clone 1, containing a SacI-XhoI insert covering 15 kb at the 5'-end of the region, was digested with a combination of restriction enzymes and was tested for MAR activity. In the first panel of Figure 2A the total (T) input fragments before binding and those bound (B) to isolated rice nuclear matrices in the presence of competitor DNA are shown. A strong binding was observed for the 2.0 kb fragment generated with MluI and BamHI. This region immediately flanks the rice Sh2 homolog at its 3'-end. A weaker binding is displayed by a 1.2 kb fragment located 5' of the gene. These two neighboring attachment sites delineate a putative loop of ~6.6 kb containing the Sh2 homolog. It should be pointed out that different functions have been suggested for strong and weak MARs (2,15) and that, in at least one case, involvement of a strong MAR in attenuating transgene silencing has been reported (9). While this is certainly an interesting issue, it is beyond the scope of this work. Hence, weak and strong MARs are defined arbitrarily in this study, based solely on the apparent differences in band density of a bound fragment relative to the input probe.
An adjacent clone (clone 3), covering ~13 kb downstream, contained two large MAR fragments (Fig. 2A). The ~4.8 kb XhoI-BamHI matrix binding fragment covered a region 3' of gene X, enclosing it in an individual loop (Fig. 1A). This 4.8 kb MAR was further mapped. After digestion of the overlapping ~13 kb BamHI clone (clone 2) and testing the matrix binding capacity of the resulting smaller fragments only a 1.1 kb EcoRI fragment bound to the matrix (Fig. 2A). The adjacent 3.1 kb BamHI/HpaI MAR is seen in both clone 3 and in the overlapping clone 4.
When a sorghum BAC library was screened with a maize Sh2 probe (21) a positive clone containing 80 kb of DNA was selected and characterized in detail. As in the case of rice, the presence, order and direction of transcription of the Sh2 and A1 homologs were the same as in maize and, as in rice, the distance between the two genes was ~19 kb. In addition, a direct duplication of A1 was discovered ~10 kb downstream (Fig. 1B). A gene homologous to gene X was discovered between the Sh2 and A1 homologs in sorghum as well, further supporting the genetic colinearity of the region (21,22). Several MITEs belonging to different classes of mobile elements, a solo long terminal repeat (LTR) of a retroelement and a simple sequence repeat were identified in intergenic locations (22; Fig. 1B).
Five adjacent clones, covering ~50 kb of the region, were screened for MARs. Matrix complexes prepared from sorghum leaf nuclei were used in the binding assay. The results of these experiments are shown in Figure 2B. Clone 1, containing a 3.5 kb sorghum insert and located at the most 5'-end of the region studied, did not show any matrix binding capacity. In the adjacent 15 kb region (clone 2) one weak and two prominent binding sites were revealed: on the 2.6 kb KpnI-XhoI intergenic fragment separating the Sh2 and X homologs and on the 0.95 kb KpnI-PacI fragment 5' of the Sh2 homolog. A weakly bound 0.5 kb band corresponds to the BamHI-KpnI fragment located immediately upstream of the 0.95 kb fragment. It is possible that these adjacent binding fragments are part of the same anchorage site. Clone 3, covering a 1.2 kb region between two large clones, did not reveal any potential attachment site. Clone 4 contained an insert with ~19 kb of the region. Two genes were located in clone 4: the putative transcription factor gene, gene X and an A1 homolog (A1-a). Two MARs were identified on the 1.5 kb EcoRV fragment at map positions 22400-23970 and on the adjacent 0.5 kb region, at positions 23970-24680. These two attachment points may act in concert at the 3'-end of a putative loop enclosing the gene X homolog. A third attachment point, located some 9 kb downstream, was identified in a 1.1 kb EcoRI-BamHI fragment. It mapped to a region occupied by the solo LTR and closed a putative loop containing the A1-a gene.
In the 3'-end of the region a duplicated A1 homolog (A1-b) was located. The fifth clone tested for matrix binding contained an 8.5 kb insert encompassing A1-b. After digestion with BamHI, HindIII and EcoRI, a 0.7 kb EcoRI-BamHI fragment bound to the matrix (last panel in Fig. 2B). It enclosed A1-b in a separate putative loop of 8.3 kb. A weak binding was also observed for the 0.4 kb EcoRI fragment located immediately upstream of the 0.7 kb MAR, suggesting that these two fragments also represent a single attachment point.
Hybridization and sequence analysis of the two colinear genomic regions indicated that the sequence homology between the two species was limited to the regions occupied by genes (20-22). The regions containing MARs did not show conservation of their primary sequence (20,22). It was unexpected, therefore, that the general placement pattern of the MARs, with respect to the neighboring genes, was so remarkably well preserved.
Since MARs were initially identified (33-35), the nature of the DNA sequences responsible for MAR activity has not been fully defined. Comparison of the sequences of numerous MAR-containing DNA fragments has indicated that they are A/T rich. This is usually displayed as motifs containing various combinations of A and T residues: A boxes, T boxes, base unpairing sequences (BURs), consensus elements for topoisomerase II, etc. (34-38). Therefore, we compared the composition of these plant MARs with regard to criteria established earlier for MARs. The ability to analyze several MARs belonging to two different plant species and the availability of the primary sequence of these large genomic regions permitted a detailed evaluation of MAR composition.
When the rice and sorghum MARs were examined for the presence of A or T residues a general tendency for enrichment in A/T was observed: all MAR-containing fragments were 70-80% A/T. Comparing the A/T profile of the whole region, however, showed a high level of A/T in several locations across the entire region in both species (Fig. 1). Evidently, AT-richness per se could not be a reliable criterion for MAR prediction. All three A1 homologs and both gene X homologs displayed a low A/T content. In contrast, the Sh2 homologs were rather high in A/T (Fig. 1).
In the search for a characteristic MAR sequence several motifs have been reported as elements clustered in MAR regions: the 'A box' (AATAAAYAAA), the 'T box' (TTWTWTTWTT), 'BURs' (AATATATT/AATATT) and topoisomerase II consensus binding sites from Drosophila (GTNWAYATTNATNNG) or mouse (RNYNNCNNGYNGKTNYNY) (reviewed in 38).
Both DNA strands of each of the rice and sorghum MARs were tested for the presence of such motifs. As shown in Figures 3 and 4, many of these motifs were found in the MAR-containing fragments, suggesting similar overall sequence composition for MARs independent of their species of origin. A notable exception, though, is the absence of a Drosophila topoisomerase II consensus motif from the MARs, as well as from the entire tested regions. A similar lack of this consensus motif has just been reported for the MARs located in the plastocyanin gene region of Arabidopsis thaliana (39). A motif, believed to be a specific marker for MARs in Arabidopsis, has been deduced (39). However, comparison of this consensus with the sequences from the sorghum and rice regions failed to uncover preferential appearance of this motif in the MARs (not shown).
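A search of both strands for the degenerate motifs listed above can be sketched as follows. This is an illustrative reconstruction, not the procedure actually used in the study: the motif strings are those quoted in the text, while the function names, the dictionary labels and the 0-based coordinate convention are assumptions. The degenerate letters are expanded with the standard IUPAC nucleotide codes.

```python
import re

# Standard IUPAC nucleotide codes expanded to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Motif sequences as quoted in the text; the labels are illustrative.
MOTIFS = {
    "A box": "AATAAAYAAA",
    "T box": "TTWTWTTWTT",
    "BUR (8-mer)": "AATATATT",
    "BUR (6-mer)": "AATATT",
    "topo II (Drosophila)": "GTNWAYATTNATNNG",
    "topo II (mouse)": "RNYNNCNNGYNGKTNYNY",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (position, strand) hits of a degenerate motif.

    Positions are 0-based starts on the forward strand. A lookahead is
    used so that overlapping occurrences are all reported. Motifs that
    equal their own reverse complement are reported on both strands.
    """
    seq = seq.upper()
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in motif) + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    for m in pattern.finditer(rc):
        # Convert the reverse-strand start to a forward-strand start.
        hits.append((len(seq) - m.start() - len(motif), "-"))
    return sorted(hits)
```

Calling `find_motif(fragment, MOTIFS["A box"])` for each MAR-containing fragment and for the flanking sequence would allow motif density inside and outside the MARs to be compared directly.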
In maize Sh2 and A1 map ~140 kb apart (31). Molecular analysis of the comparable loci in two other grasses, rice and sorghum, has indicated that gene arrangement and composition are conserved in these regions for these species (20-22). The close physical and recombinational linkage of these two genes (31,42) makes this region particularly informative for studies of intergenic chromosomal organization. The complete sequence information available for the regions and the identification of individual elements and genes (20-22) make it an excellent model for studies of possible relationships between genome structural organization and function. An unexpected feature of this region in both species was the absence of retroelements in intergenic spaces, in contrast to their abundance in maize (43). The only exception was a solo LTR present in sorghum. This retroelement segment was found to also carry the MAR that could segregate the duplicated A1 homologs into individual loops. Earlier we identified MARs in regions of repetitive DNA around maize Adh1 (18) and a few of them were shown to be carried by retroelements (43). It should be noted that not all members of the same retrotransposon family displayed matrix binding activity. Recently MAR activity has been found inside another retroelement, part of the transformation booster sequence (TBS) from Petunia (44). The authors suggested a role for this MAR in increasing the transformational and/or recombinational activity of TBS-containing plasmids.
Mapping the MARs along the chromosome continuums in the two species allowed us to uncover commonalities in the predicted organization of the two genomes. First, all genes present in the region were placed in individual loops, defined by neighboring MARs. MARs identified in two adjacent restriction fragments were considered as parts of the same anchoring site. Each of the duplicated sorghum A1 homologs was found in a separate loop, separated by the MAR located in the solo LTR. Analysis of data in the literature seemed to suggest that placement of genes into individual relatively small (3-10 kb) loops is a common pattern in plant genomes. Thus all four genes in the 17.1 kb of the soybean lectin locus are segregated into separate domains (7), and the tomato heat shock cognate 80 gene (45) and the maize proton H+ ATPase gene (Avramova et al., unpublished results) are placed in individual putative loops. The β-phaseolin gene has been found in a 3 kb loop, the smallest reported so far (46). A recent study of 16 kb in the A.thaliana plastocyanin region, containing seven different genes, provided insight into the loop organization of a small plant genome (39). In this case each putative structural loop contained two neighboring genes. The possible significance of this type of gene arrangement within a loop remains to be studied.
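The rule that adjacent matrix-binding fragments count as one anchoring site, and the derivation of putative loops from consecutive sites, can be sketched as below. This is a hedged illustration only: the function names, the `max_gap` parameter and the choice to measure a loop from the end of one site to the start of the next are assumptions, not the authors' stated procedure. In the example, the first two intervals are the sorghum map positions given in the Results (22400-23970 and 23970-24680); the third interval (33000-34100) is hypothetical, because the text gives only an approximate location for that fragment.

```python
def merge_adjacent(fragments, max_gap=0):
    """Merge MAR fragments that abut (or lie within max_gap bp of each
    other) into single anchoring sites. Fragments are (start, end) pairs."""
    sites = []
    for start, end in sorted(fragments):
        if sites and start - sites[-1][1] <= max_gap:
            # Extend the previous site instead of opening a new one.
            sites[-1] = (sites[-1][0], max(sites[-1][1], end))
        else:
            sites.append((start, end))
    return sites


def loops_between(sites):
    """Return (loop_start, loop_end, size_bp) for the DNA lying between
    each pair of consecutive anchoring sites."""
    return [(a[1], b[0], b[0] - a[1]) for a, b in zip(sites, sites[1:])]


# Two mapped sorghum fragments plus one HYPOTHETICAL downstream fragment.
fragments = [(22400, 23970), (23970, 24680), (33000, 34100)]
sites = merge_adjacent(fragments)
loops = loops_between(sites)
```

With these inputs the two abutting fragments collapse into one site spanning 22400-24680, and a single putative loop is delimited between it and the hypothetical downstream site.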
Second, although the sizes of the proposed loops vary, homologous gene domains are comparable in sorghum and rice. The size of the loop containing the rice A1 homolog has not been determined, because its 3'-end was beyond the sequence available on our clone. A common feature of all three A1 homologs, however, is that their promoters are placed relatively far from the base of the loop, with various repetitive elements present between their 5' MARs and the respective transcription start sites. The Sh2 homologs of both rice and sorghum are placed in smaller predicted loops, while the genes for the putative transcription factor appear to occupy the largest structural domains in the regions.
The authors are indebted to Phillip SanMiguel for providing us with the sorghum clones and their restriction maps. The work was supported by research grants from the USDA/NRICGP (no. 94-37300-0299 to J.L.B. and no. 93-37300-8769 to Z.A.).
Nucleic Acids Research
This page is run by Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, as part of the OUP Journals
Last modification: 23 Jan 1998
Copyright© Oxford University Press, 1998.