Nucleic Acids Research Advance Access originally published online on December 14, 2006
Nucleic Acids Research 2007 35(2):559-571; doi:10.1093/nar/gkl1086

Nucleic Acids Research, 2007, Vol. 35, No. 2 559-571
© 2006 The Author(s).
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.


Genomics

Genome-wide analyses of retrogenes derived from the human box H/ACA snoRNAs

Yuping Luo and Siguang Li*

College of Life Sciences, Nanchang University, Nanchang 330047, People's Republic of China

*To whom correspondence should be addressed. Tel: +86 791 8304099; Fax: +86 791 8302703; Email: siguangli{at}163.com

Received August 21, 2006. Revised November 20, 2006. Accepted November 21, 2006.


    ABSTRACT
The family of box H/ACA snoRNAs is an abundant class of non-protein-coding RNAs that play important roles in the post-transcriptional modification of rRNAs and snRNAs. Here we report the characterization in the human genome of 202 sequences derived from box H/ACA snoRNAs. Most of them are retrogenes formed using the L1 integration machinery. About 96% of the box H/ACA RNA-related sequences are found in corresponding locations on the chimpanzee and human chromosomes, while the mouse shares ~50% of these human sequences, suggesting that some of the H/ACA RNA-related sequences in primates arose after the rodent/primate divergence. Of the H/ACA RNA-related sequences, 49% are found in intronic regions of protein-coding genes, and 64 can be folded into the typical secondary structure of the box H/ACA snoRNA family; 30 of these were recognized as functional homologs of their corresponding, previously reported box H/ACA snoRNAs. Of the 64 sequences with the typical secondary structure of the box H/ACA RNA family, 11 were found in EST databases, 5 of which were shown to be expressed in more than one human tissue. Notably, U107f is nested in an intron of the protein-coding gene for nudix-type motif 13 but is expressed from the opposite strand, and EST database searches revealed that it is expressed in liver and spleen, and even in melanotic melanoma.


    INTRODUCTION
The family of box H/ACA RNAs is an abundant class of non-protein-coding RNAs that includes small nucleolar RNAs (snoRNAs), small Cajal body-specific RNAs (scaRNAs) (1) and a homologous class of RNAs in archaeal organisms (2). A typical box H/ACA RNA exhibits a common hairpin–hinge–hairpin–tail secondary structure, with the H (ANANNA) motif in the single-stranded hinge region and an ACA triplet located 3 nt upstream of the 3' terminus (3). The majority of known box H/ACA RNAs play important roles in the post-transcriptional modification of rRNAs and snRNAs (4,5): the box H/ACA snoRNAs direct the conversion of uridine to pseudouridine (Ψ) at specific residues of eukaryotic ribosomal RNAs as well as of the Pol III-transcribed snRNA U6, whereas box H/ACA scaRNAs guide the formation of Ψs in Pol II-transcribed spliceosomal small nuclear RNAs (snRNAs) (1). However, a few H/ACA RNAs are involved in rRNA processing. For example, U17, an evolutionarily conserved H/ACA snoRNA present in vertebrates, yeasts and the unicellular protozoan Tetrahymena thermophila (6), is involved in rRNA processing at the 5' end of 18S rRNA (7). Most likely, U17 functions as an RNA chaperone that safeguards the correct folding of 18S rRNA during pre-rRNA processing.
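The two sequence hallmarks described above (an ANANNA H box and an ACA triplet 3 nt upstream of the 3' end) can be checked with a short Python sketch. This is only an illustration of the motif definitions, not a tool used in this study; the function names and the toy sequence are hypothetical, and the check deliberately ignores the hairpin–hinge–hairpin–tail secondary structure.

```python
import re

# H box consensus ANANNA (N = any nucleotide); a lookahead is used so that
# overlapping matches are also reported. Accepts DNA (T) or RNA (U) letters.
H_BOX_PATTERN = re.compile(r"(?=A[ACGTU]A[ACGTU]{2}A)")


def h_box_positions(seq):
    """Return the 0-based start positions of every ANANNA match in seq."""
    return [m.start() for m in H_BOX_PATTERN.finditer(seq.upper())]


def has_aca_box(seq):
    """Return True if an ACA triplet lies exactly 3 nt upstream of the 3' end."""
    seq = seq.upper()
    return len(seq) >= 6 and seq[-6:-3] == "ACA"


# Hypothetical toy sequence (not a real snoRNA): H box at position 4,
# ACA triplet followed by exactly three 3'-terminal nucleotides.
toy = "GGCC" + "AGAGGA" + "CCGG" + "ACA" + "UGU"
```

A sequence passing both checks is only a candidate; whether the H box actually falls in a single-stranded hinge has to be judged from a predicted secondary structure.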

Recently, systematic experimental approaches and computational screening programs for H/ACA RNAs have been developed, and numerous H/ACA RNAs have been detected in eukaryotes from yeast to human (8–15). In humans, ~100 H/ACA RNAs have been identified, most of which are located within the introns of protein-encoding genes (16). Some H/ACA RNAs have several copies in different introns of the same gene (17,18) or within introns of different genes (19), suggesting that redundant H/ACA RNAs arose via duplication or transposition of existing H/ACA RNAs, but the ultimate origin of these RNAs remains an open question.

In humans, retrotransposons of the long interspersed element-1 (L1) family and their remnants account for ~17% of the human genome (20,21). The enzymatic machinery of a retrotransposition-competent L1 predominantly transposes its own copies (22). However, L1s are capable of transposing other sequences, mostly Alu retroposons, but also cDNAs of different types of cellular RNAs (23–25), thus forming retrogenes or retropseudogenes. The existence of an H/ACA retrogene, i.e. a non-autonomously transcribed H/ACA RNA-related sequence, was reported previously in the mouse genome (15), but no H/ACA retrogene was characterized in humans. Here we have identified 202 novel box H/ACA RNA-related sequences in the human genome, most of which are retrogenes. Sequence analyses suggest the involvement of the L1 retroposition machinery in the formation of human H/ACA RNA retrogenes. In addition, we found that the previously reported genes encoding ACA14a, ACA37, ACA41, ACA58, ACA59a, ACA59b, ACA63, ACA66, ACA67, ACA71a, ACA98b and U109 all appear to have resulted from retrotransposition events of H/ACA RNAs, suggesting retrotransposition mechanisms have played a pivotal role in the mobility and diversification of H/ACA RNA genes.


    MATERIALS AND METHODS
Computational search for H/ACA RNA-related genes in Homo sapiens
The sequences of human H/ACA sno/scaRNAs were taken from the snoRNA database (http://www-snorna.biotoul.fr). We used the megaBLAST tool on the NCBI website (http://www.ncbi.nlm.nih.gov/BLAST) to find box H/ACA RNA-related genes or pseudogenes in the human genome (NCBI build 36.1). The BLAST hits kept for further analysis covered at least 60% of the corresponding mature H/ACA RNA. H/ACA RNA-related sequences found in H.sapiens were retrieved with a 600 nt extension at each extremity and then used to search for orthologs in the chimpanzee genome (Pan troglodytes; NCBI build 1.1), the mouse genome (NCBI build 36.1) and other animal databases.

All H/ACA RNA-related genes or pseudogenes were mapped onto the human genome using BLAT (http://genome.ucsc.edu/cgi-bin/hgBLAT).
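The hit-selection and retrieval steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `Hit` record, its field names and the example values are hypothetical and do not reproduce the megaBLAST output format, and the >80% identity threshold is the one quoted in the Results rather than in this section.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical summary of one genomic hit (not the BLAST output format)."""
    query: str         # name of the H/ACA RNA used as query
    query_len: int     # length of the mature H/ACA RNA (nt)
    chrom: str         # chromosome carrying the hit
    start: int         # 1-based inclusive start on the chromosome
    end: int           # 1-based inclusive end on the chromosome
    aligned_len: int   # number of query nucleotides covered by the hit
    identity: float    # percent identity over the aligned region


def keep_hit(hit, min_coverage=0.60, min_identity=80.0):
    """Keep hits covering >= 60% of the query with > 80% identity."""
    coverage = hit.aligned_len / hit.query_len
    return coverage >= min_coverage and hit.identity > min_identity


def extended_window(hit, flank=600, chrom_len=None):
    """Coordinates of the hit extended by `flank` nt on each side,
    clipped to the chromosome boundaries when they are known."""
    lo = max(1, hit.start - flank)
    hi = hit.end + flank
    if chrom_len is not None:
        hi = min(hi, chrom_len)
    return hit.chrom, lo, hi


# Made-up example values, for illustration only.
example = Hit("U70", 135, "chr1", 1000, 1100, 101, 92.0)
```

The extended window is what would then be compared against the chimpanzee or mouse assemblies, since the flanks help decide whether a hit sits at a corresponding location.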

Sequence identity analysis
Each H/ACA RNA-related gene or pseudogene was aligned with its corresponding H/ACA RNA gene sequence using Matcher (http://bioportal.cgb.indiana.edu/cgi-bin/emboss/matcher). The percentage identity of each H/ACA RNA-related sequence relative to its corresponding H/ACA RNA gene was then calculated.
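As an illustration of the identity calculation, the sketch below computes percent identity from two rows of a gapped pairwise alignment. It assumes that identity is defined as identical columns divided by the total alignment length (gap columns included); the exact definition applied in this study is not stated, and the function name and example alignment are hypothetical.

```python
def percent_identity(aln_a, aln_b):
    """Percent identity between two rows of a gapped pairwise alignment.

    Assumption: identity = identical columns / alignment length, where the
    alignment length includes gap columns ('-').
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned rows must have the same length")
    if not aln_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aln_a.upper(), aln_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aln_a)


# Toy alignment: 10 columns, one gap column and one mismatch.
row_a = "ACGT-ACGTA"
row_b = "ACGTTACGAA"
```

With this definition, a sequence "0–20% diverged" from its parent gene corresponds to a value of 80–100.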

Detection of chimeric retrogenes
To look for possible chimeric retrogenes, the flanking regions of the H/ACA RNA-related sequences were aligned with the sequences of a number of other small non-protein-coding RNA species (e.g. tRNAs, snRNAs, miRNAs and rRNAs) and then examined for repetitive elements with the RepeatMasker program (http://www.repeatmasker.org/cgi-bin/WEBRepeatMasker).

Prediction of secondary structures of H/ACA RNA-related sequences
The secondary structures of all computationally identified H/ACA-related RNAs were predicted using the mfold program (26) (http://www.bioinfo.rpi.edu/applications/mfold/old/rna).


    RESULTS
Identification of 202 box H/ACA RNA-related genes
Using a computational, genome-wide search strategy to extract human sequences similar to the various box H/ACA RNAs, we found 202 box H/ACA RNA-related sequences (Table 1) when we required >80% sequence identity over at least 60% of the length of the corresponding RNA. The list of these sequences is appended as Supplementary data. We also searched the chimpanzee and mouse genomes and found that ~96% of these human box H/ACA RNA-related genes exist in corresponding locations on the chimpanzee chromosomes, while the mouse shares ~50% of these human box H/ACA RNA-related sequences (data not shown). The distribution of copy numbers among the different human box H/ACA RNA-related genes is strikingly skewed. U70 has the most copies (21) and ACA40 the second most (13), while 13 H/ACA RNAs have only one related gene each, and no related gene was found for 28 H/ACA genes.


Table 1 Box H/ACA RNA-related genes in human

 
These box H/ACA RNA-related sequences are not uniformly distributed on human chromosomes. There are 22 and 24 copies on chromosomes 1 and 2, respectively. However, no copy was found on chromosome Y and only two copies were found on chromosome 22, while chromosomes 5, 6, 7, 12, 17, 8 and X showed a relative excess of box H/ACA RNA-related genes. Of the 202 box H/ACA RNA-related genes found in the human genome, 99 (49%) are located in intronic regions of protein-coding genes. Interestingly, eight of them lie in the antisense orientation relative to their host genes (Table 1). There were no significant differences in sequence identity or sequence length between box H/ACA RNA-related genes located in introns and those located in intergenic regions (data not shown).

Most of the box H/ACA RNA-related genes are retrogenes
Careful analysis of the upstream and downstream regions of these H/ACA snoRNA-related sequences showed that, of the 202 box H/ACA RNA-related genes found in this work, 182 (90%) probably correspond to H/ACA retrogenes (Table 1). All these retrogenes were flanked by direct repeats (target site duplications, TSDs) of 7–17 nt, and most of them contained poly(A) tails at their 3' ends (Figure 1). Figure 1A shows a characteristic retrogene with a 3' poly(A) tail and TSDs. In some cases, the H/ACA RNA was retrotransposed, along with its original 5'- or 3'-flanking sequence, into a new location on the same or a different chromosome (Figure 1B and C), suggesting that these H/ACA retrogenes derive from relatively stable processing intermediates in H/ACA RNA biogenesis. However, some H/ACA RNA retrogenes originated when partially processed, exon-containing hnRNAs were reverse transcribed and inserted at new locations in the genome (Figure 1D and E). For example, the ACA40 gene is hosted in the sixth intron of the hypothetical protein gene MGC5306, and a fragment of the MGC5306 gene comprising the host intron of ACA40 together with all 3'-exons was retrotransposed independently into chromosome 2 (ACA40b), chromosome 17 (ACA40c), chromosome 10 (ACA40d), chromosome 6 (ACA40e), chromosome 5 (ACA40i), chromosome 8 (ACA40j) and chromosome 5 (ACA40k).


Figure 1 Schematic representation of box H/ACA RNA retrogene examples. (A) The sequence below the scheme is retrogene U64b; 55 retrogenes belong to this type. (B) The sequence below the scheme is retrogene ACA10b with a number of retroposed nucleotides on its 5'-flank; 5 retrogenes belong to this type. (C) The sequence below the scheme is retrogene ACA64c with a number of retroposed nucleotides on its 3'-flank; 24 retrogenes belong to this type. (D) The sequence below the scheme is retrogene U70m with a number of retroposed nucleotides on its 3'-flank; 25 retrogenes are similar to this case. (E) The sequence below the scheme is retrogene ACA40j with a number of retroposed nucleotides on its 3'-flank; 12 retrogenes are similar to this case. The exon-derived sequences in (D) and (E) are shown in capital letters. (F) The sequence below the scheme is retrogene ACA7d; 6 retrogenes belong to this type. (G) The sequence below the scheme is retrogene ACA53c with a number of retroposed nucleotides on its 3'-flank; 7 retrogenes belong to this type. (H) The sequence below the scheme is retrogene ACA36d with a number of retroposed nucleotides on its 3'-flank; 4 retrogenes belong to this type. (I) The sequence below the scheme is retrogene ACA18e with a number of retroposed nucleotides on its 3'-flank; 6 retrogenes belong to this type. (J) The sequence below the scheme is retrogene HBI-61c; 1 retrogene belongs to this type. In all cases, the H/ACA RNA sequences are in italics, retroposed nucleotides on the 3'- or 5'-flanks are in lowercase, Alu sequences are shaded, and poly(A) tails and TSDs are in open and closed boxes, respectively. The L1 consensus recognition site (TTAAAA) is indicated at the 5' end and overlaid by a black bar in the examples.

 
Most of the retrogenes harbored at their 5' ends either a T2A4 (TTAAAA) hexanucleotide, which is preferentially recognized by the L1 nicking endonuclease, or a derivative of it with one or two single-nucleotide substitutions (Figure 1A–E). These features suggest the involvement of the L1 retroposition machinery in the formation of the H/ACA retrogenes. Notably, 39 (19%) of the H/ACA RNA-related retrogenes were shortened at their 5' end (Table 1), presumably because of premature termination of the reverse transcription step. However, a few H/ACA RNA-related retrogenes do not show a complete L1 signature: they lack either a poly(A) tail (Figure 1F) or a T2A4 target site overlapping a TSD (Figure 1G). The existence of tailless retrogenes was reported recently (27), suggesting a variant mechanism for the biogenesis of retrosequences.
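The retrogene hallmarks discussed above (TSDs of 7–17 nt, a 3' poly(A) tail, and a TTAAAA site tolerating one or two substitutions) can be expressed as simple sequence checks. The Python sketch below is an illustration only: the analysis reported here was done by inspecting the flanking sequences, the function names and example sequences are hypothetical, and the TSD search considers exact repeats only.

```python
def find_tsd(upstream, downstream, min_len=7, max_len=17):
    """Longest exact direct repeat (min_len..max_len nt) shared by the
    upstream and downstream flanks, or None if no such repeat exists."""
    up, down = upstream.upper(), downstream.upper()
    longest = min(max_len, len(up), len(down))
    for k in range(longest, min_len - 1, -1):
        for i in range(len(up) - k + 1):
            kmer = up[i:i + k]
            if kmer in down:
                return kmer
    return None


def poly_a_length(seq):
    """Length of the uninterrupted run of A residues at the 3' end."""
    seq = seq.upper()
    return len(seq) - len(seq.rstrip("A"))


def mismatches_to_l1_site(hexamer, consensus="TTAAAA"):
    """Number of substitutions between a hexamer and the L1 site TTAAAA."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer must be 6 nt long")
    return sum(a != b for a, b in zip(hexamer.upper(), consensus))


# Made-up flanks sharing a 10 nt repeat, for illustration only.
tsd = "AAGAATGCTC"
upstream_flank = "CCGGTT" + tsd
downstream_flank = tsd + "GGTACC"
```

A candidate would be scored as carrying an L1 signature when a TSD is found, the poly(A) run is non-trivial, and the 5' hexamer differs from TTAAAA at no more than two positions; the cut-off for the poly(A) length is left open here because it is not specified in the text.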

Closer inspection of the H/ACA snoRNA-related retrogenes and their flanking sequences revealed that, in some cases, the H/ACA snoRNA-related retrogene had been disrupted by the independent integration of an Alu element (Figure 1H). In these cases, virtual removal of the Alu insertion revealed a ‘repaired’ retrogene. In other cases, an Alu sequence was inserted between the H/ACA RNA retrogene and the 3'-TSD (Figure 1I). This suggests that at these sites the H/ACA RNAs were inserted before the integration of the Alu elements. Interestingly, one chimeric retrogene composed of an H/ACA sequence fused at its 3' terminus to an Alu element was found (Figure 1J). It was probably formed by template switching (28) from Alu RNA to H/ACA RNA during reverse transcription, after which the fused transcript was integrated into the human genome. A number of retrogenes were reported to result from template switching, including those containing U6, 5S rRNA or 7SL RNA fused at their 3' termini with Alu elements (24).

Some previously identified snoRNAs resulted from retrotransposition
Closer analysis of the upstream and downstream regions of previously identified snoRNAs showed that ACA14a, ACA37, ACA41, ACA58, ACA59, ACA59b, ACA63, ACA66, ACA67, U71a, ACA98b and U109 are encoded by retrogenes (Figure 2). These box H/ACA RNAs were cloned from a HeLa cell extract immunoprecipitated with an anti-GAR1 antibody (18), or their expression was verified by Northern blot and primer extension (8,13,15). Clearly, these snoRNAs were formed by retrotransposition in the course of primate evolution. For example, the data obtained in this study suggest that the ACA63 gene originated by retroposition of the ACA63b copy. First, ACA63b is found in corresponding locations in the human, chimpanzee and mouse genomes. Second, the human and chimpanzee ATP2B4 and RERE genes encode ACA63 and another retrogene, ACA63c, in their introns, respectively, while the homologous mouse genes are devoid of any ACA63-like sequence (Figure 3). Furthermore, comparison and alignment of the two loci ACA63/ACA63b from all available primate sequences revealed that the Otolemur garnettii ACA63 locus shows a clean absence of ACA63 along with its retroposed 3'- and 5'-flanking nucleotides (Supplementary Figure 1a). This evidence indicates that the human ACA63b identified in this work is an evolutionarily conserved snoRNA widely present in vertebrates, and that retrotransposition of ACA63b occurred in primates after the rodent/primate divergence. Interestingly, there are 4 ACA63c copies with obvious target site duplications (TSDs) in the chimp RERE gene, which probably resulted from a single retroposition event into this gene followed by local segmental duplications.


Figure 2 Some previously reported H/ACA snoRNA genes with retrogene hallmarks. Schematic representation of the H/ACA RNA sequences. poly(A) and TSD are in open and closed boxes, respectively. The L1 consensus recognition site (TTAAAA) is indicated at the 5' end.

 


Figure 3 Amplification of ACA63b snoRNA in primates. The ACA63 sequence (small arrow) is located within an intron of the orthologous host genes. Additional copies (ACA63 and ACA63c) were generated in the primate lineage. Exons are represented by boxes. The cartoon is not drawn to scale. ATP2B4: ATPase, Ca++ transporting, plasma membrane 4. HTF9C: HpaII tiny fragments locus 9C. RANBP1: RAN binding protein 1. ZDHHC8: zinc finger, DHHC-type containing 8. RERE: arginine-glutamic acid dipeptide (RE) repeats.

 
In vertebrates, sequences encoding H/ACA RNAs are generally located in introns of their host gene, in the same orientation. So far, each vertebrate intron has been found to carry only one snoRNA gene, but a host gene can carry several different snoRNA genes in different introns (16). The evolutionary analysis of H/ACA RNA genes within the introns of orthologous genes in six vertebrate species showed that a number of snoRNA genes in different introns of a host gene probably resulted from retrotransposition. For example, the H.sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus and Canis familiaris EIF4A2 gene orthologs each host three snoRNA genes, HBI-61, E3 and ACA4, in different introns, while Gallus gallus contains only HBI-61 in its orthologous gene (Figure 4A). Similarly, the RPSA genes in all the aforementioned mammals host two H/ACA genes, E2 and ACA6, in different introns; however, G.gallus is devoid of snoRNAs in the orthologous gene (Figure 4A). Notably, human and chimpanzee ACA4, E2 and E3 are flanked by TSDs of >10 nt (data not shown). Although those TSDs carry a few nucleotide changes, the ancestral state of one of them was present in ACA4 of the tenrec, Echinops telfairi (Figure 4B), suggesting that ACA4 and E3 in EIF4A2 and E2 in RPSA in mammals resulted from retroposition after the mammal/Aves divergence. In addition, some host genes carry several paralogous snoRNA genes in different introns, such as the TBRG4 gene (Figure 4A). The amplification of ACA5 in this host gene most likely did not occur via retroposition, because insertions of retroposed sequences are virtually random and should not lead to accumulations in neighboring introns (11).


Figure 4 Phylogenetic analysis of some H/ACA RNA genes. (A) Presence/absence of H/ACA RNA genes within the introns of orthologous host genes in six vertebrates. Each snoRNA sequence (small arrow) is located within an intron of the indicated genes. Exons are represented by boxes. The cartoon is not drawn to scale. EIF4A2: eukaryotic translation initiation factor 4A, isoform 2. RPSA: ribosomal protein SA. TBRG4: transforming growth factor beta regulator 4. (B) Retrogene ACA4 in Echinops telfairi. H/ACA RNA sequences are in italics; poly(A) and TSD are in open and closed boxes, respectively.

 
Structures and expression of box H/ACA-related RNAs
To date, more than 100 H/ACA RNAs have been found in H.sapiens (16). In this study, we found that at least two-thirds of these human H/ACA RNA genes have one or more related copies (Table 1). Remarkably, U70 has 21 related copies, including six truncated sequences, and another snoRNA gene, ACA40, has 13 related copies, six of them truncated. Alignments of these novel H/ACA RNA-related sequences with their previously reported counterparts revealed numerous sequence changes, including small insertions or deletions, which occurred frequently in less important regions and occasionally in conserved elements such as boxes H and ACA. Despite this sequence variation, 64 of the 202 box H/ACA RNA-related sequences can be folded into the typical secondary structure of the box H/ACA RNA family, i.e. the hairpin–hinge–hairpin–tail structure (Supplementary Figure 2). Of these, 30 were recognized as functional homologs of their corresponding, previously reported box H/ACA RNAs on the basis of the relationship between snoRNA structure and function, while the remainder showed no complementarity to either rRNAs or snRNAs owing to sequence diversification and were therefore classified as orphan H/ACA RNAs.

Retroposition generated additional copies of most box H/ACA RNA genes, and quite a number of these copies might be functional. Because of cross-hybridization in Northern blot analysis, it could not be assessed whether all 64 box H/ACA RNA-related sequences with the typical features of the box H/ACA RNA family are indeed expressed in human tissues. Therefore, we performed BLAST searches of all 64 sequences against EST databases and detected corresponding ESTs for 11 of them, 5 of which were shown to be expressed in more than one human tissue (Table 2). Of course, identification of ESTs does not necessarily indicate the presence of processed and functional snoRNAs. Notably, U107f is located in an intron of the protein-coding gene for nudix (nucleoside diphosphate linked moiety X)-type motif 13 but is expressed from the opposite strand (Figure 5), and EST database searches revealed that it is expressed in liver and spleen, and even in melanotic melanoma (Table 2). It is not clear whether U107f has a functional role as an antisense regulator of the expression of the protein-coding gene.


Figure 5 Genomic location of U107f in H.sapiens. SnoRNA genes are shown by black arrows, protein-coding genes by non-filled and gray arrows (not drawn to scale). The length of intergenic spacers is also indicated.

 


Table 2 Box H/ACA RNA-related genes expressed in human tissues detected in EST databases

 

    DISCUSSION
We have identified in the human genome databases 202 novel box H/ACA RNA-related sequences that are 0–20% diverged from their corresponding, previously reported genes and that belong to 61 box H/ACA RNA types (Table 1), which shows that most human box H/ACA RNAs have multiple copies. In Arabidopsis and rice, many snoRNAs are found in multiple copies that result mainly from two mechanisms: large chromosomal duplications and small tandem duplications producing polycistronic genes (29). In contrast, multiple human box H/ACA copies result mainly from retroposition. Of the 202 box H/ACA RNA-related sequences identified in this work, 182 have the typical structure of a retrogene, and the number of H/ACA retrogenes is probably underestimated, inasmuch as retrogenes >20% diverged from their corresponding genes are not included in our analysis.

The genomes of chimpanzee and human share ~96% of box H/ACA RNA-related sequences at identical locations, and only ~4% are thus hominin-specific, having arisen in our genome since the divergence from chimpanzee. In contrast, the mouse genome contains only ~50% of the human box H/ACA RNA-related sequences, and some of these were found in different genomic regions, suggesting that most of the H/ACA RNA-related sequences in primates arose after the rodent/primate divergence. To elucidate the mechanism of H/ACA snoRNA propagation in primates, we analyzed all ape-specific events (those duplicated in human and chimp but not in rhesus monkey) using presence/absence patterns and found that, among nine ape-specific events (ACA1b, ACA10b, ACA40g, ACA40n, ACA43b, ACA51b, ACA57b, ACA64c and U67c), all but one originated from retroposition (Supplementary Figure 1C), suggesting that duplications of most H/ACA snoRNAs in primates are indeed bona fide events mediated by retroposition. In addition, retroposition of different H/ACA RNAs occurred at different stages of primate evolution (Supplementary Figure 1). Notably, the sequence of the human-specific retrogene ACA59b is completely identical to that of ACA59, pointing to a very recent origin of the snoRNA retrogene ACA59b and suggesting that retrotransposition of snoRNAs continues to the present day in the human lineage.
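The presence/absence reasoning used above can be sketched as a small classifier. This Python example is an illustration under simplifying assumptions: it considers only chimpanzee, rhesus monkey and mouse as comparison species, assumes parsimony with no secondary loss, and uses category labels and a function name chosen here for illustration rather than taken from the analysis itself.

```python
# Comparison species ordered from the closest to the most distant relative
# of human, with the label assigned when that species is the first to lack
# the sequence at the orthologous locus.
ORDER = ["chimpanzee", "rhesus", "mouse"]
LABELS = ["human-specific", "ape-specific", "primate-specific"]


def classify_event(present):
    """Classify a human H/ACA RNA-related sequence from a dict mapping each
    comparison species to True/False (sequence present at the orthologous
    locus or not)."""
    flags = [present[species] for species in ORDER]
    if all(flags):
        return "shared with mouse"
    first_absent = flags.index(False)
    # Presence in a more distant species after an absence does not fit a
    # single insertion event; it may reflect loss or missing data.
    if any(flags[first_absent:]):
        return "ambiguous"
    return LABELS[first_absent]


# Pattern described in the text for ape-specific events:
# present in human and chimp, absent from rhesus monkey.
ape_pattern = {"chimpanzee": True, "rhesus": False, "mouse": False}
```

Patterns flagged as ambiguous would need inspection of the aligned loci, for example to distinguish a deletion from an assembly gap.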

Multiple studies have suggested a high rate of retroposition on the primate and rodent lineages (30–32), probably driven by the activity of L1 retrotransposable elements (33). Our results also show the involvement of the L1 retroposition machinery in the formation of human H/ACA retrogenes. Retroposition was commonly thought to generate nonfunctional gene copies (retropseudogenes) that accumulate disablements such as premature stop codons and frameshift mutations in the case of protein-coding genes (34), because the copied mRNA generally lacks regulatory elements. However, Brosius (35,36) predicted that retrogenes can insert next to resident promoter/enhancer elements and thus escape transcriptional silencing. Indeed, researchers have recently shown that retroposition has generated a significant number of new functional genes (retrogenes) in mammalian genomes (37,38). Similarly, some of the retrogenes derived from H/ACA RNAs appear to be functional genes. First, nearly 50% of the H/ACA retrogenes found in this work are intronic, encoded within protein-coding genes. Like previously identified intronic snoRNAs (39–41), intronic retrogenes can be co-transcribed with their host genes and then released from excised, debranched introns by exonucleolytic trimming. Furthermore, unlike protein-coding genes, snoRNA retrogenes do not accumulate disablements such as premature stop codons and frameshift mutations. Importantly, some snoRNA retrogenes, even when located in the antisense orientation relative to their host gene (U107f) or in an intergenic region (ACA64c), have the typical H/ACA RNA structure and can be expressed in human tissues. In addition, for some H/ACA genes retroposition generated further copies, and the process may also have provided abundant raw material for the formation of new genes. It therefore appears that retroposition is one of the routes to novel snoRNA gene formation. In line with this notion, some previously reported box H/ACA RNA genes apparently resulted from retrotransposition of different box H/ACA RNAs (Figures 2–4).


    SUPPLEMENTARY DATA
Supplementary data are available at NAR online.


    ACKNOWLEDGEMENTS
 
The authors thank Donggen Zhou for help with the analysis of secondary structures of RNA. This work was supported by China National Science Foundation 30660042. Funding to pay the Open Access publication charges for this article was provided by the Key Laboratory of Biochemistry and Molecular Biology of Jiangxi Province, China.

Conflict of interest statement. None declared.


    REFERENCES

  1. Darzacq, X., Jady, B.E., Verheggen, C., Kiss, A.M., Bertrand, E., Kiss, T. (2002) Cajal body-specific small nuclear RNAs: a novel class of 2'-O-methylation and pseudouridylation guide RNAs EMBO J, . 21, 2746–2756[CrossRef][Web of Science][Medline] .

  2. Tang, T.H., Bachellerie, J.P., Rozhdestvensky, T., Bortolin, M.L., Huber, H., Drungowski, M., Elge, T., Brosius, J., Huttenhofer, A. (2002) Identification of 86 candidates for small non-messenger RNAs from the archaeon Archaeoglobus fulgidus Proc. Natl Acad. Sci. USA, 99, 7536–7541[Abstract/Free Full Text] .

  3. Ganot, P., Bortolin, M.L., Kiss, T. (1997) Site-specific pseudouridine formation in eukaryotic pre-rRNAs is guided by small nucleolar RNAs Cell, 89, 799–809[CrossRef][Web of Science][Medline] .

  4. Bachellerie, J.P., Cavaille, J., Huttenhofer, A. (2002) The expanding snoRNA world Biochimie, 84, 775–790[Medline] .

  5. Kiss, T. (2002) Small nucleolar RNAs: an abundant group of noncoding RNAs with diverse cellular functions Cell, 109, 145–148[CrossRef][Web of Science][Medline] .

  6. Atzorn, V., Fragapane, P., Kiss, T. (2004) U17/snR30 is a ubiquitous snoRNA with two conserved sequence motifs essential for 18S rRNA production Mol. Cell. Biol, . 24, 1769–1778[Abstract/Free Full Text] .

  7. Mishra, R.K. and Eliceiri, G.L. (1997) Three small nucleolar RNAs that are involved in ribosomal RNA precursor processing Proc. Natl Acad. Sci. USA, 94, 4972–4977[Abstract/Free Full Text] .

  8. Schattner, P., Barberan-Soler, S., Lowe, T.M. (2006) A computational screen for mammalian pseudouridylation guide H/ACA RNAs RNA, 12, 15–25[Abstract/Free Full Text] .

  9. Torchet, C., Badis, G., Devaux, F., Costanzo, G., Werner, M., Jacquier, A. (2005) The complete set of H/ACA snoRNAs that guide rRNA pseudouridylations in Saccharomyces cerevisiae RNA, 11, 928–938[Abstract/Free Full Text] .

  10. Schattner, P., Decatur, W.A., Davis, C.A., Ares, M., Jr, Fournier, M.J., Lowe, T.M. (2004) Genome-wide searching for pseudouridylation guide snoRNAs: analysis of the Saccharomyces cerevisiae genome Nucleic Acids Res, . 32, 4281–4296[Abstract/Free Full Text] .

  11. Zemann, A., op de Bekke, A., Kiefmann, M., Brosius, J., Schmitz, J. (2006) Evolution of small nucleolar RNAs in nematodes Nucleic Acids Res, . 34, 2676–2685[Abstract/Free Full Text] .

  12. Huttenhofer, A., Kiefmann, M., Meier-Ewert, S., O'Brien, J., Lehrach, H., Bachellerie, J.P., Brosius, J. (2001) RNomics: an experimental approach that identifies 201 candidates for novel, small, non-messenger RNAs in mouse EMBO J, . 20, 2943–2953[CrossRef][Web of Science][Medline] .

  13. Gu, A.D., Zhou, H., Yu, C.H., Qu, L.H. (2005) A novel experimental approach for systematic identification of box H/ACA snoRNAs from eukaryotes Nucleic Acids Res, . 33, e194[Abstract/Free Full Text] .

  14. Li, S.G., Zhou, H., Luo, Y.P., Zhang, P., Qu, L.H. (2005) Identification and functional analysis of 20 Box H/ACA small nucleolar RNAs (snoRNAs) from Schizosaccharomyces pombe J. Biol. Chem, . 280, 16446–16455[Abstract/Free Full Text] .

  15. Vitali, P., Royo, H., Seitz, H., Bachellerie, J.P., Huttenhofer, A., Cavaille, J. (2003) Identification of 13 novel human modification guide RNAs. Nucleic Acids Res., 31, 6543–6551.

  16. Lestrade, L. and Weber, M.J. (2006) snoRNA-LBME-db, a comprehensive database of human H/ACA and C/D box snoRNAs. Nucleic Acids Res., 34, D158–D162.

  17. Ganot, P., Caizergues-Ferrer, M., Kiss, T. (1997) The family of box ACA small nucleolar RNAs is defined by an evolutionarily conserved secondary structure and ubiquitous sequence elements essential for RNA accumulation. Genes Dev., 11, 941–956.

  18. Kiss, A.M., Jady, B.E., Bertrand, E., Kiss, T. (2004) Human box H/ACA pseudouridylation guide RNA machinery. Mol. Cell. Biol., 24, 5797–5807.

  19. Pelczar, P. and Filipowicz, W. (1998) The host gene for intronic U17 small nucleolar RNAs in mammals has no protein-coding potential and is a member of the 5'-terminal oligopyrimidine gene family. Mol. Cell. Biol., 18, 4509–4518.

  20. Ostertag, E.M. and Kazazian, H.H., Jr. (2001) Biology of mammalian L1 retrotransposons. Annu. Rev. Genet., 35, 501–538.

  21. Kazazian, H.H., Jr. (2004) Mobile elements: drivers of genome evolution. Science, 303, 1626–1632.

  22. Wei, W., Gilbert, N., Ooi, S.L., Lawler, J.F., Ostertag, E.M., Kazazian, H.H., Boeke, J.D., Moran, J.V. (2001) Human L1 retrotransposition: cis preference versus trans complementation. Mol. Cell. Biol., 21, 1429–1439.

  23. Esnault, C., Maestre, J., Heidmann, T. (2000) Human LINE retrotransposons generate processed pseudogenes. Nature Genet., 24, 363–367.

  24. Buzdin, A., Gogvadze, E., Kovalskaya, E., Volchkov, P., Ustyugova, S., Illarionova, A., Fushan, A., Vinogradova, T., Sverdlov, E. (2003) The human genome contains many types of chimeric retrogenes generated through in vivo RNA recombination. Nucleic Acids Res., 31, 4385–4390.

  25. Perreault, J., Noel, J.F., Briere, F., Cousineau, B., Lucier, J.F., Perreault, J.P., Boire, G. (2005) Retropseudogenes derived from the human Ro/SS-A autoantigen-associated hY RNAs. Nucleic Acids Res., 33, 2032–2041.

  26. Zuker, M. (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res., 31, 3406–3415.

  27. Schmitz, J., Churakov, G., Zischler, H., Brosius, J. (2004) A novel class of mammalian-specific tailless retropseudogenes. Genome Res., 14, 1911–1915.

  28. Brosius, J. (1999) Genomes were forged by massive bombardments with retroelements and retrosequences. Genetica, 107, 209–238.

  29. Barneche, F., Gaspin, C., Guyot, R., Echeverria, M. (2001) Identification of 66 box C/D snoRNAs in Arabidopsis thaliana: Extensive gene duplications generated multiple isoforms predicting new ribosomal RNA 2'-O-methylation sites. J. Mol. Biol., 311, 57–73.

  30. Zhang, Z., Harrison, P.M., Liu, Y., Gerstein, M. (2003) Millions of years of evolution preserved: a comprehensive catalog of the processed pseudogenes in the human genome. Genome Res., 13, 2541–2558.

  31. Ohshima, K., Hattori, M., Yada, T., Gojobori, T., Sakaki, Y., Okada, N. (2003) Whole-genome screening indicates a possible burst of formation of processed pseudogenes and Alu repeats by particular L1 subfamilies in ancestral primates. Genome Biol., 4, R74.

  32. Zhang, Z., Carriero, N., Gerstein, M. (2004) Comparative analysis of processed pseudogenes in the mouse and human genomes. Trends Genet., 20, 62–67.

  33. Esnault, C., Maestre, J., Heidmann, T. (2000) Human LINE retrotransposons generate processed pseudogenes. Nat. Genet., 24, 363–367.

  34. Mighell, A.J., Smith, N.R., Robinson, P.A., Markham, A.F. (2000) Vertebrate pseudogenes. FEBS Lett., 468, 109–114.

  35. Brosius, J. (1991) Retroposons—seeds of evolution. Science, 251, 753.

  36. Brosius, J. (1999) RNAs from all categories generate retrosequences that may be exapted as novel genes or regulatory elements. Gene, 238, 115–134.

  37. Emerson, J.J., Kaessmann, H., Betran, E., Long, M. (2004) Extensive gene traffic on the mammalian X chromosome. Science, 303, 537–540.

  38. Vinckenbosch, N., Dupanloup, I., Kaessmann, H. (2006) Evolutionary fate of retroposed gene copies in the human genome. Proc. Natl Acad. Sci. USA, 103, 3220–3225.

  39. Tycowski, K.T., Shu, M.D., Steitz, J.A. (1993) A small nucleolar RNA is processed from an intron of the human gene encoding ribosomal protein S3. Genes Dev., 7, 1176–1190.

  40. Kiss, T. and Filipowicz, W. (1993) Small nucleolar RNAs encoded by introns of the human cell cycle regulatory gene RCC1. EMBO J., 12, 2913–2920.

  41. Kiss, T. and Filipowicz, W. (1995) Exonucleolytic processing of small nucleolar RNAs from pre-mRNA introns. Genes Dev., 9, 1411–1424.




This article has been cited by other articles:


J. Schmitz, A. Zemann, G. Churakov, H. Kuhl, F. Grutzner, R. Reinhardt, and J. Brosius
Retroposed SNOfall--A mammalian-wide comparison of platypus snoRNAs
Genome Res., June 1, 2008; 18(6): 1005 - 1010.


J. L. Goodier, L. Zhang, M. R. Vetter, and H. H. Kazazian Jr.
LINE-1 ORF1 Protein Localizes in Stress Granules with Other RNA-Binding Proteins, Including Components of RNA Interference RNA-Induced Silencing Complex
Mol. Cell. Biol., September 15, 2007; 27(18): 6469 - 6483.


R. Tanaka-Fujita, Y. Soeno, H. Satoh, Y. Nakamura, and S. Mori
Human and mouse protein-noncoding snoRNA host genes with dissimilar nucleotide sequences show chromosomal synteny
RNA, June 1, 2007; 13(6): 811 - 816.

