Nucleic Acids Research Advance Access originally published online on September 18, 2007
Nucleic Acids Research 2007 35(19):6350-6356; doi:10.1093/nar/gkm723
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Computational Biology
Organismal complexity, cell differentiation and gene expression: human over mouse
Institute of Cytology, Russian Academy of Sciences, Tikhoretsky Avenue 4, St. Petersburg 194064, Russia
*To whom correspondence should be addressed. Tel: +7-812-2971452; Fax: +7-812-2970341; Email: aevin{at}mail.cytspb.rssi.ru
Received July 2, 2007. Revised August 12, 2007. Accepted September 1, 2007.
ABSTRACT
We present a molecular and cellular phenomenon underlying the intriguing increase in phenotypic organizational complexity. For the same set of human–mouse orthologous genes (11 534 gene pairs) and homologous tissues (32 tissue pairs), human shows a greater fraction of tissue-specific genes and a greater ratio of the total expression of tissue-specific genes to that of housekeeping genes in each studied tissue, which suggests a generally higher level of evolutionary cell differentiation (specialization). This phenomenon is markedly more pronounced in those human tissues that are more directly involved in the increase of complexity, longevity and body size (i.e. it is also reflected at the organismal level). Genes with a change in expression breadth show a greater human–mouse divergence of promoter regions and encoded proteins (i.e. the functional genomics data are supported by structural analysis). Human also shows higher expression of the translation machinery. The upstream untranslated regions (5'UTRs) of human mRNAs are longer than mouse 5'UTRs (even after correction for the difference in genome sizes) and contain more uAUG codons, which suggests more complex regulation at the translational level in human cells (and agrees well with the augmented cell specialization).
INTRODUCTION
The growth of organizational complexity (sometimes called progress) is an intriguing and probably the most important biological and social phenomenon. The measures and the driving forces of this process are still debated (1–12). A prominent feature of progressive evolution in biological and social systems is the increasing division of labor among system elements (1–5). In multicellular organisms, it is reflected in cell differentiation, accompanied by enhancement of specialized cell function and the appearance of tissue-specific genes, with a corresponding increase in the number of different cell types in the organism. The number of cell types (or other organismal parts), which reflects the degree of division of labor, is an acceptable indicator of biological complexity (3,4,6,11). However, the count of morphologically discernible cell types is ambiguous in higher multicellular organisms (6,11). Furthermore, there can be cell types that are not distinguishable morphologically. Similarly, no universal measure of cell differentiation (suitable for all cell types) is known. We introduce here a general molecular-level indicator that might be suitable for this purpose: the fraction of tissue-specific genes and the ratio of the total expression of tissue-specific genes to the total expression of housekeeping genes. We studied the genome-wide differences in expression of orthologous genes between human and mouse from this perspective and found that human has higher cell differentiation in each studied homologous tissue. This genomic and cellular phenomenon is markedly more pronounced in those human tissues that are more directly involved in the increase of complexity, longevity and body size (i.e. it is also reflected at the organismal level).
Preliminary results on the fraction of housekeeping genes (using a much lower number of homologous genes and tissues) were reported for the older versions of the human and mouse microarray platforms (10). It should also be noted that human–mouse comparisons of gene expression were performed previously in a number of works, but from other angles [e.g. (13–17), and references therein].
MATERIALS AND METHODS
The data on expression of human and mouse genes were taken from the Novartis Gene Expression Atlas (14), which presents the results of high-density oligonucleotide microarray experiments performed uniformly for all tissues. Uniform platforms were used for all human (U133A + GNF1B) and mouse (GNF1M) tissues. The signals from probes on the chip corresponding to the same gene were averaged; the samples representing the same tissue were also averaged. Only probes that represented characterized genes, i.e. those with links to the Entrez Gene (18) and RefSeq (19) databases, and that were presumably orthologous between human and mouse were used. Orthology was established using the HomoloGene database (18) (11 534 gene pairs with reciprocal best hits were found). Only normal tissues that were homologous for human and mouse were used (32 tissues, listed in Figure 1).
Modern microarrays show good reproducibility and portability across platforms (20–22). Even if there are problems with individual genes, the phenomena described in the present article are based on the expression of hundreds and thousands of genes. Furthermore, discrepancies for individual genes are usually due to poorly designed probes (because of incorrect gene annotations, i.e. poorly known genes) (20,22), whereas we used only well-characterized genes (those with links to the Entrez Gene, RefSeq and HomoloGene databases). To verify the main phenomena, we used very diverse tests (see below). Moreover, the preliminary results on the fraction of housekeeping genes were obtained with the older human and mouse platforms (U95A and U74A) standardized with the older algorithm (10). Thus, the effect is reproducible and portable across platforms. It should also be noted that putative artifacts due to variation in gene GC content [such as higher stability or affinity of GC-rich RNAs (23,24)], if any, would act in the direction opposite to the revealed effect because housekeeping genes are more GC-rich than tissue-specific ones, and this difference is greater in human than in mouse (10). Furthermore, all RNA samples used in the Gene Expression Atlas were obtained by the same procedure from frozen tissues, and the quality of all samples was checked with the Agilent Bioanalyzer (14). As for probe affinity, an effect of GC content is unlikely in oligonucleotide microarrays because the so-called mismatch probe used as a control contains the same 25-nt sequence as the perfect match probe, except for one central nucleotide (i.e. it has nearly the same GC content) (25).
To distinguish housekeeping from tissue-specific genes, we used both cutoff-based criteria of expression detection (with different cutoff values, as indicated in Table 1) and the Affymetrix calls (presence/absence) provided in the Gene Expression Atlas. Detection of gene expression with a cutoff equal to the dataset median was recommended by the authors of the Gene Expression Atlas; it is based on extensive PCR validation of oligonucleotide microarray data (13–15). The Affymetrix call for a given gene is a local feature, which does not depend on microarray scaling (i.e. on the signal from probes for other genes) (25). We also used various parameters of the among-tissues distribution of expression signal that are related to the degree of tissue-specificity (and that were used previously for other purposes or are introduced here): entropy (26), ratio of maximum to average value (27), index of tissue-specificity (16,28), skew (the distribution for tissue-specific genes should be more right-skewed), coefficient of variation (should be higher for tissue-specific genes), first absolute central moment (should be higher for tissue-specific genes), and the ratio of the total expression of tissue-specific genes to the total expression of housekeeping genes.
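The threshold-independent parameters listed above can be illustrated for a single gene with a short Python sketch. This is only an illustration under stated assumptions, not the procedure actually used for Table 1: the function name is hypothetical, entropy is taken as the Shannon entropy of the relative expression profile, the index of tissue-specificity is taken in its commonly used form (the sum of 1 − x_i/x_max over tissues, divided by N − 1), and population (not sample) moments are used. The cited works may define some of these quantities slightly differently.

```python
import math
from statistics import pstdev

def tissue_specificity_measures(expr):
    """Summarise how unevenly one gene is expressed across tissues.

    expr: non-negative expression signals, one value per tissue
    (hypothetical input format; at least one value must be positive).
    """
    n = len(expr)
    total = sum(expr)
    avg = total / n
    peak = max(expr)
    sd = pstdev(expr)  # population standard deviation (an assumption)

    # Shannon entropy of the relative expression profile (high = broad).
    probs = [x / total for x in expr if x > 0]
    entropy = -sum(p * math.log2(p) for p in probs)

    # Index of tissue-specificity: 0 = uniform, 1 = a single tissue.
    tau = sum(1 - x / peak for x in expr) / (n - 1)

    # Skew as the third standardized moment; set to 0 for a flat profile.
    skew = (sum((x - avg) ** 3 for x in expr) / n) / sd ** 3 if sd > 0 else 0.0

    return {
        "entropy": entropy,
        "max_to_mean": peak / avg,
        "tau": tau,
        "skew": skew,
        "cv": sd / avg,
        "abs_central_moment": sum(abs(x - avg) for x in expr) / n,
    }

# Toy profiles (made-up numbers): a uniformly expressed gene and a
# gene expressed in one tissue only.
broad = tissue_specificity_measures([100, 100, 100, 100])
narrow = tissue_specificity_measures([400, 0, 0, 0])
```

In this toy case the uniformly expressed gene has the maximal entropy for four tissues and zero values of the other dispersion measures, whereas the single-tissue gene has a positive skew and higher values of the remaining parameters, in line with the expected directions stated in the text.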
For comparison with the novel exon microarray platforms, the gene expression data available for six homologous human–mouse tissues were taken from the Exon Array dataset normalized with the GeneBASE program (17). The tissues were heart, kidney, liver, muscle, spleen and testis. (For comparison with the Gene Expression Atlas, lymph node in the Gene Expression Atlas was taken as the analog of spleen in the Exon Array dataset.) There were 8229 genes common to the Gene Expression Atlas and the Exon Array dataset. We used the definition of housekeeping genes obtained with the Gene Expression Atlas (because it contains a much larger number of tissues) to analyze the expression of housekeeping and non-housekeeping genes in the six tissues of the Exon Array dataset. [The cutoff median expression, as recommended in refs (13–15), was used for the definition of housekeeping genes. The results were similar for the definition based on the Affymetrix calls.]
For human–mouse comparison of gene promoter regions, we extracted their sequences from the database of experimentally determined exact transcriptional start sites (DBTSS) (29). The comparison was made as described (30). Briefly, we first masked these sequences for lineage-specific repeats (those inserted after the human–mouse split) using the standalone RepeatMasker and DateRepeats programs (Smit,A.F.A., Hubley,R., Green,P., http://repeatmasker.org). Then, the matching of human–mouse promoter regions (200-nt-long sequences upstream of the transcription start site) was done using the rigorous Huang–Miller algorithm for local sequence alignment, implemented in the Lalign program (31). (The significance level for a spurious match was set to a conservative threshold of P < 10⁻⁶.)
The analysis of functional gene modules was done as described (32). Briefly, we checked the average human–mouse difference of a tested parameter for a predefined gene set against the average difference for the total dataset. The predefined gene sets were prepared using Gene Ontology (GO) categories (33), pathway compilations from the KEGG (34) and Reactome (35) databases (using Entrez Gene mapping), and HumanCyc (36). In the case of Gene Ontology categories, we collected for each category all its subcategories using GO graphs, and a gene was regarded as belonging to a given category if it was mapped to any of its subcategories in Entrez Gene. To estimate the significance level, we did 20 000 random samplings from the total dataset (each of a size equal to the size of the tested gene group). After obtaining the two-tailed significance level (P-value), we estimated the false discovery rate (q-value) to correct for multiple comparisons (37). As the main parameter, the (log-transformed) expression level averaged among all tissues was used.
For the analysis of upstream untranslated regions (5'UTRs), the human and mouse mRNAs were extracted from the RefSeq database (19). To ensure the completeness of 5'UTRs, we used the database of experimentally determined exact transcriptional start sites (DBTSS) (29). We used only those 5'UTRs whose lengths were non-zero in both species and equal (for a given species) in both databases. There were 1578 orthologous gene pairs whose 5'UTRs of processed mRNAs satisfied these conditions. (If zero-length 5'UTRs were included in the analysis, the results were similar.) If there were several mRNAs for a gene, the longest 5'UTR was taken. (The shortest and the average 5'UTR lengths were also compared; the results were qualitatively the same.) To correct for the difference in genome sizes, the lengths of mouse 5'UTRs were multiplied by a factor of 1.16 (38).
To compare nucleotide composition between silent sites of coding DNA and background, we calculated chi-squares of 4-fold degenerate third codon positions using intronic and nearby intergenic sequences for estimation of background nucleotide probabilities. In a variant of the analysis, introns were first masked for (human or mouse) lineage-specific repeats. (This was done because repeat insertion is a special mode of mutation pressure, which might change nucleotide composition and which is suppressed in coding regions.) Then, we compared the obtained chi-squares in a pairwise way for human–mouse orthologous genes (using the Mann–Whitney test).
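The random-sampling step described above can be sketched in Python as follows. This is a minimal illustration, assuming that each sampling draws genes without replacement and that the two-tailed P-value is the fraction of random groups whose mean deviates from the dataset mean at least as much as the tested group; the function name, the input format and the add-one correction are assumptions of this sketch, and the q-value step (37) is not shown.

```python
import random
from statistics import mean

def module_p_value(diffs, module_genes, n_samplings=20000, seed=0):
    """Two-tailed resampling P-value for one predefined gene set.

    diffs: dict gene -> human-minus-mouse difference of the tested parameter.
    module_genes: genes of the tested set (all must be keys of diffs).
    """
    rng = random.Random(seed)
    values = list(diffs.values())
    background = mean(values)
    observed = abs(mean(diffs[g] for g in module_genes) - background)
    k = len(module_genes)

    # Count random gene groups of the same size that deviate at least as much.
    hits = 0
    for _ in range(n_samplings):
        sample_mean = mean(rng.sample(values, k))
        if abs(sample_mean - background) >= observed:
            hits += 1

    # Add-one correction (an assumption) keeps the estimate above zero.
    return (hits + 1) / (n_samplings + 1)
```

With 20 000 samplings the smallest P-value this estimate can return is about 5 × 10⁻⁵, which is worth keeping in mind when many gene sets are tested and corrected for multiple comparisons.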
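The 5'UTR measurements described above can be sketched in Python as follows. The helper names are hypothetical, the logarithm is taken to base 10 (an inference from the reported values, not stated in the text), and uAUGs are counted in any reading frame with overlapping positions allowed, which is an assumption because the counting rule is not specified.

```python
import math

GENOME_SIZE_FACTOR = 1.16  # factor applied to mouse 5'UTR lengths (38)

def longest_utr(utrs):
    """Pick the longest 5'UTR among the mRNAs of one gene."""
    return max(utrs, key=len)

def count_uaug(utr):
    """Count upstream AUG codons (ATG in cDNA notation) in any frame."""
    utr = utr.upper().replace("U", "T")
    return sum(1 for i in range(len(utr) - 2) if utr[i:i + 3] == "ATG")

def log_length_difference(human_utr, mouse_utr, correct_genome_size=False):
    """log10(human length) - log10(mouse length) for one orthologous pair.

    Both sequences must be non-empty, as in the main analysis.
    """
    mouse_len = len(mouse_utr)
    if correct_genome_size:
        mouse_len *= GENOME_SIZE_FACTOR
    return math.log10(len(human_utr)) - math.log10(mouse_len)
```

Because the correction multiplies every mouse length by the same factor, it shifts each pairwise log-difference by the same constant and does not change the ranking of gene pairs.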
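As an illustration of the chi-square computation for one gene, the following Python sketch compares base counts at 4-fold degenerate sites with background frequencies. The function name and input format are hypothetical, and the text does not say whether any further normalization (for example, by the number of sites) was applied, so this is the plain goodness-of-fit statistic only.

```python
def composition_chi_square(site_counts, background_freqs):
    """Chi-square of observed base counts against background frequencies.

    site_counts: dict base -> count at 4-fold degenerate third positions.
    background_freqs: dict base -> frequency in intronic or intergenic
    sequence (all frequencies positive and summing to 1).
    """
    total = sum(site_counts.values())
    chi2 = 0.0
    for base, freq in background_freqs.items():
        expected = total * freq
        observed = site_counts.get(base, 0)
        chi2 += (observed - expected) ** 2 / expected
    return chi2

# Toy example (made-up numbers): GC-enriched silent sites against a
# uniform background.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
chi2_biased = composition_chi_square({"A": 10, "C": 40, "G": 40, "T": 10}, uniform)
chi2_neutral = composition_chi_square({"A": 25, "C": 25, "G": 25, "T": 25}, uniform)
```

A larger value indicates silent-site composition that departs more strongly from the background, which is the quantity compared between orthologous human and mouse genes.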
RESULTS AND DISCUSSION
Evolutionary cell differentiation
The conclusion about the difference in evolutionary cell differentiation between two mammals may seem surprising. Therefore, we checked it extensively using diverse tests (both dependent on and independent of expression threshold criteria). By any expression threshold criterion, there are 3-fold more orthologous genes expressed in all studied tissues (i.e. housekeeping genes) in mouse than in human (Table 1). Both the mean and the median of the among-tissues breadth of expression (number of tissues where a given gene is expressed) are also higher in mouse (Table 1). Similarly, other parameters (which are independent of expression threshold criteria) show consistent differences indicating a relative bias towards housekeeping genes in mouse [entropy (26)] or towards tissue-specific genes in human (ratio of maximum to average expression value, index of tissue-specificity, skew, coefficient of variation, first absolute central moment; see Methods section) (Table 1). Moreover, we found changes at the structural level in those genes that changed their expression breadth after the human–mouse split. Thus, the human–mouse identity of the encoded protein sequence (compared at the amino acid level) is lower in those genes that are housekeeping in mouse but non-housekeeping in human than in genes that are housekeeping in both species (0.890 ± 0.003 versus 0.933 ± 0.006, Mann–Whitney P < 10⁻¹²). Similarly, those genes that are housekeeping in human but non-housekeeping in mouse show lower protein sequence identity than genes that are housekeeping in both species (0.898 ± 0.010 versus 0.933 ± 0.006, P < 10⁻⁷). [The cutoff median expression, as recommended in refs (13–15), was used for the definition of housekeeping genes. The results were similar for the definition based on the Affymetrix calls.]
Furthermore, the human–mouse conservation of nucleotide sequence in the promoter region (200 nt upstream of the transcription start site) is lower in those genes that are housekeeping in mouse but non-housekeeping in human than in genes that are housekeeping in both species (for the length of conserved sequence multiplied by its human–mouse identity: 70.5 ± 2.7 versus 87.7 ± 5.2, Mann–Whitney P < 10⁻⁷). Similarly, those genes that are housekeeping in human but non-housekeeping in mouse show lower conservation of promoter regions than genes that are housekeeping in both species (66.7 ± 6.7 versus 87.7 ± 5.2, P < 10⁻⁵). These facts suggest that the revealed changes in gene expression breadth are not due to some artifact of the functional genomics dataset because they correlate with corresponding alterations in the structure of gene regulatory regions.
Strength of the effect in different tissues
For comparing different tissues, we used a combined parameter that reflects both the among-tissues breadth and the level of expression: the ratio of the total expression of tissue-specific (non-housekeeping) genes to the total expression of housekeeping genes in a tissue (Table 1). Notably, this parameter allows estimation of the relative expression of tissue-specific genes in a tissue using a universal internal standard for all tissues (expression of the same set of housekeeping genes), thus removing possible problems with variation of mRNA amounts in different tissues, among-samples data standardization, etc. We treated all non-housekeeping genes as tissue-specific genes for the following reasons. First, on the histogram of genes expressed in each particular tissue there is no clear-cut peak of tissue-specific genes, only the peak of housekeeping genes and a plateau of genes with gradually changing tissue-specificity [Figure 2B in ref. (10); the picture was similar with the present dataset]. Therefore, any other threshold would be arbitrary. Second, genes with intermediate expression breadth (i.e. expressed in more than one tissue but fewer than all tissues) show the highest informational load and probably contribute to a greater extent to multicellular complexity (39). Third, the uniqueness of the gene expression pattern is determined by all non-housekeeping genes that are expressed in a given tissue.
The ratio of the total expression of tissue-specific genes to that of housekeeping genes is higher in human than in mouse in all studied homologous tissues (Figure 1: all red circles are above unity). This phenomenon is more pronounced in the nervous system (except for the olfactory bulb), skeletal muscle, kidney and heart (Figure 1). For the nervous system, the relation to organismal complexity is obvious. Of special interest are the trigeminal ganglion (TGG) and the dorsal root ganglion (DRG), where the effect is even stronger than in the other parts of the nervous system. The TGG, where the ophthalmic, maxillary and mandibular nerves converge, is involved in visual information processing. It also controls face and mouth movements (speech), which are so important for human social organization. The DRG contains cell bodies of incoming sensory fibers from the rest of the body. Among other things, the precise finger activity that is so important for humans depends on the DRG. Thus, the prominent human–mouse differences in these two tissues might reflect not only the increase in organismal complexity but also the transition to a qualitatively higher complexity level: social organization (TGG) and tool making (DRG). The amygdala is involved in emotional learning. The cerebellum is responsible for complex movement in 3D space (especially important for primates) and, in particular, for the maintenance of posture.
The higher cell differentiation in human skeletal muscle might be related to the larger body size [because of the allometry of muscle strength to body mass known since Galileo (11)] and the upright posture. The heart is important for increased body size, upright posture and longevity. The heart is a known bottleneck in the human constitution, which limits lifespan and works near the ceiling of its capability (32). The higher cell differentiation in human kidney is probably related to increased longevity (maintenance of homeostasis). Kidney disease is among the leading causes of human death (40), which indicates that the human kidney works near the limit of its capability.
Of special interest is the olfactory bulb, where the effect is below the median of the studied tissues (and is especially weak compared to other parts of the nervous system) (Figure 1). It is well known that rodents are olfactory specialists while primates are visual specialists (41). In other words, the olfactory bulb is the kind of exception that proves the rule.
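The combined parameter can be illustrated with a small Python sketch. It assumes that a gene counts as expressed in a tissue when its signal exceeds the cutoff (the dataset median by default), that housekeeping genes are those expressed in every tissue, and that only non-housekeeping genes detected in a given tissue contribute to the numerator for that tissue. The function name, the input format and the strict inequality are assumptions of this sketch.

```python
from statistics import median

def classify_and_ratio(expr, cutoff=None):
    """expr: dict gene -> list of signals, one per tissue (same tissue order).

    Returns the housekeeping gene set and, for each tissue, the ratio of
    total tissue-specific expression to total housekeeping expression.
    """
    if cutoff is None:
        # Dataset median as the detection threshold.
        cutoff = median(x for signals in expr.values() for x in signals)
    n_tissues = len(next(iter(expr.values())))

    # Housekeeping: detected in every tissue; all other genes are
    # treated as tissue-specific.
    housekeeping = {g for g, s in expr.items() if all(x > cutoff for x in s)}

    ratios = []
    for t in range(n_tissues):
        hk_total = sum(expr[g][t] for g in housekeeping)
        ts_total = sum(s[t] for g, s in expr.items()
                       if g not in housekeeping and s[t] > cutoff)
        ratios.append(ts_total / hk_total)  # fails if no housekeeping genes
    return housekeeping, ratios

# Toy dataset (made-up numbers): two tissues, four genes.
toy_expr = {"hk1": [10, 10], "hk2": [20, 20], "ts1": [30, 1], "ts2": [1, 15]}
hk_genes, tissue_ratios = classify_and_ratio(toy_expr, cutoff=5)
```

Dividing by the housekeeping total within the same tissue is what makes the measure insensitive to tissue-wide scaling of the signal: multiplying all signals in one tissue by a constant leaves that tissue's ratio unchanged, provided the detection calls do not change.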
Comparison with the Exon Array dataset
Data on gene expression in a few homologous human–mouse tissues are now available from the novel exon microarrays (designed mainly for studying alternative splicing) (17). We determined the ratio of the total expression of tissue-specific (non-housekeeping) genes to the total expression of housekeeping genes for the six tissues provided in the Exon Array dataset (heart, kidney, liver, muscle, spleen and testis) (17). Consistent with the results described in the previous section, these ratios were higher in all human tissues than in the mouse ones. The human–mouse ratio of these ratios was nominally higher than that obtained using the Gene Expression Atlas, but the difference was not significant (2.51 ± 0.23 versus 2.18 ± 0.48, Mann–Whitney P > 0.1). Also, there is a good correlation between the among-tissues rankings of these ratios in the Gene Expression Atlas and the Exon Array dataset (with the exception of one outlier, testis). If testis was excluded, Pearson r = 0.94, P < 0.02 (for log-transformed ratios), Spearman r = 1.0, P < 10⁻⁴. The exception of testis might be related to a very high level of alternative splicing in this tissue (42).
Functional gene modules
A surprising (and the most pronounced) difference at the modular level is the higher expression of the human translation machinery (Table 2). This is probably not due to more intensive protein synthesis because human has a lower metabolic rate with a lower protein turnover (43). How, then, could the higher expression of the human translation machinery be explained?
More complex regulation at translational level: a hypothesis
The regulation of gene expression at the level of translation is now an emerging theme (44–46). Notably, it endows local sites with independent decision-making authority, which is especially important for highly specialized cells with complex cellular architectures (45). Most of the translational regulatory mechanisms are inhibitory (44,46). Therefore, more complex regulation at the translational level should involve a more developed translation machinery. This situation is similar to regulation at the transcriptional level, where the increase in complexity (e.g. after the transition from prokaryotes to eukaryotes) was associated with enlargement of the genome and a general reduction of transcription rate because of mostly suppressive regulatory means (e.g. chromatin condensation), which switch off genes whose expression should not be allowed in a given cell. The untranslated regions of processed mRNAs, especially those located upstream (5'UTRs), are known to serve for translational regulation (44,46–49). Longer 5'UTRs usually occur in those genes that should be more strongly and finely controlled (47). The AUG (potential start) codons in the 5'UTRs (named uAUGs) are also involved in translational regulation and generally attenuate protein synthesis (49). If human has more complex regulation at the translational level, it should be reflected in the structure of human mRNAs.
We found that human 5'UTRs are on average 37% longer than the corresponding mouse ones (difference of log-transformed lengths in orthologous genes compared in a pairwise way: 0.138 ± 0.024, Mann–Whitney P < 10⁻¹²). This difference remains significant after correction for the 16% difference in genome sizes (0.073 ± 0.024, P < 10⁻⁷). The latter fact suggests that the difference in 5'UTR lengths cannot be due just to mutation pressure (if one assumes that the difference in genome sizes was caused by neutral drift), and should have functional significance. The number of uAUGs is 30% greater in human than in mouse (difference of counts compared in a pairwise way: 0.152 ± 0.081, P < 10⁻⁵).
Recent data suggest that synonymous codon usage can be involved in translational regulation in mammals (50–52). If translational regulation is more complex in human, the nucleotide composition of silent sites in human coding sequences (presumably determined by translation-related selection) might differ more strongly from the background nucleotide composition (presumably determined by mutation pressure) than in mouse. We calculated chi-squares of 4-fold degenerate third codon positions against intronic and nearby intergenic sequences (used for estimation of background nucleotide probabilities) and found that these chi-squares are higher in human (differences for human–mouse orthologous genes compared in a pairwise way: for intergenic sequences, 16.4 ± 1.5, Mann–Whitney P < 10⁻¹²; for intronic sequences, 11.8 ± 0.7, P < 10⁻¹²; for intronic sequences masked for human- or mouse-specific repeats, 12.6 ± 0.7, P < 10⁻¹²).
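As a quick consistency check of the figures above, assuming the lengths were log-transformed to base 10 (the base is not stated in the text), the mean log-difference converts to a fold difference, and the genome-size correction shifts it by a constant:

```latex
\begin{align*}
10^{0.138} &\approx 1.37 && \text{(human 5'UTRs about 37\% longer)}\\
\log_{10} 1.16 &\approx 0.064 && \text{(shift from scaling mouse lengths by 1.16)}\\
0.138 - 0.064 &= 0.074 && \text{(close to the reported } 0.073 \pm 0.024\text{)}\\
10^{0.074} &\approx 1.19 && \text{(roughly 18--19\% longer after correction)}
\end{align*}
```

The small gap between 0.074 and the reported 0.073 is what one would expect from rounding of the published values.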
CONCLUSION
For the same set of human–mouse orthologous genes and homologous tissues, human shows a greater fraction of tissue-specific genes. From the neutralist standpoint, the main difference between human and mouse is in effective population size, which may result in weaker purifying selection in human. However, its effect (if any) should be in the direction opposite to our finding. A relaxed purifying selection could lead to higher non-functional gene expression, i.e. higher transcriptional background noise [e.g. (53,54)]. This would result in a seemingly higher fraction of broadly expressed genes in human compared to mouse (the opposite of what we found). Therefore, we believe that the selectionist interpretation related to the human–mouse difference in complexity is more likely. It seems that the higher complexity of human exists not only at the phenotypic organismal level but also at the genomic and cellular levels.
In any studied tissue, human has a greater ratio of the total expression of tissue-specific genes to the total expression of housekeeping genes, which indicates an intensification of specialized cell function. How can the homologous cells of two mammals (having similar sets of organs) differ in the degree of specialization? The biochemical diversification (specialization) of cells of seemingly the same type (hepatocytes) within the complex organ architecture was reported even for such a relatively homogeneous organ as the liver (although it may not be discernible morphologically) (55,56). The degree of such specialization can be higher in human organs. Thus, there can be a higher number of different biochemical cell types (determined by the uniqueness of the gene expression pattern).
At the organismal level, there is a striking correlation between the strength of this phenomenon and the type of tissue. It is more pronounced in those tissues that are more directly involved in the increase of organizational complexity, longevity and body size. Notably, human cells are more resistant to transformation than mouse cells (57,58), i.e. the greater human longevity can indeed be reflected at the cellular level [to say nothing of the questionable Hayflick limit (59)].
The higher expression of the human translation machinery is in good agreement with the longer 5'UTRs of human mRNAs and the greater number of uAUG codons, which suggest a more complexly regulated translation. Translational regulation can be more important for more specialized cells because it allows a more rapid and economical response to stimuli (localized within a cell with complex architecture), albeit in a narrower range (specified by the current transcriptome), compared to regulation at the level of transcription. (As an extreme example, compare the nucleus-free mammalian erythrocytes, which do not have transcriptional regulation at all, with the nucleated erythrocytes of lower vertebrates.)
Because of the universality of the principle of division of labor in the progressive evolution of biological and social systems, the level of cell differentiation (estimated by the relative number and expression of tissue-specific versus housekeeping genes) might become a general indicator of multicellular organismal complexity. In a practical sense, as mouse is a paramount model for biomedical research, it is important to understand its principal differences from human.
ACKNOWLEDGEMENTS
We thank the anonymous reviewers for helpful comments. This work was supported by the Russian Foundation for Basic Research (RFBR). The Open Access publication charges for this article were waived by Oxford University Press.
Conflict of interest statement. None declared.
REFERENCES
- Smith A. The Wealth of Nations (1776) Edinburgh.
- Spencer H. Essays: Scientific, Political, and Speculative (1891) 1. London: Williams and Norgate.
- Bonner JT. The Evolution of Complexity (1988) Princeton, NJ: Princeton University Press.
- McShea DW. Metazoan complexity and evolution: Is there a trend? Evolution (1996) 50:477–492.[CrossRef][Web of Science]
- Gould SJ. The Structure of Evolutionary Theory (2002) Cambridge, Massachusetts: Harvard University Press.
- Carroll SB. Chance and necessity: the evolution of morphological complexity and diversity. Nature (2001) 409:1102–1109.[CrossRef]
- McShea DW. The hierarchical structure of organisms: a scale and documentation of a trend in the maximum. Paleobiology (2001) 27:405–423.
[Abstract/Free Full Text] - Szathmary E, Jordan F, Pal C. Can genes explain biological complexity? Science (2001) 292:1315–1316.
[Free Full Text] - Wolfe KH. Yesterday's polyploids and the mystery of diploidization. Nat. Rev. Genet (2001) 2:333–341.[CrossRef][Web of Science][Medline]
- Vinogradov AE. Isochores and tissue-specificity. Nucleic Acids Res (2003) 31:5212–5220.
[Abstract/Free Full Text] - Bonner JT. Perspective: the size-complexity rule. Evolution (2004) 58:1883–1890.[CrossRef][Web of Science][Medline]
- Taft RJ, Pheasant M, Mattick JS. The relationship between non-protein-coding DNA and eukaryotic complexity. Bioessays (2007) 29:288–299.[CrossRef][Web of Science][Medline]
- Su AI, Cooke MP, Ching KA, Hakak Y, Walker JR, Wiltshire T, Orth AP, Vega RG, Sapinoso LM, et al. Large-scale analysis of the human and mouse transcriptomes. Proc. Natl Acad. Sci. USA (2002) 99:4465–4470.
[Abstract/Free Full Text] - Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, et al. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc. Natl Acad. Sci. USA (2004) 101:6062–6067.
[Abstract/Free Full Text] - Yang J, Su AI, Li WH. Gene expression evolves faster in narrowly than in broadly expressed mammalian genes. Mol. Biol. Evol (2005) 22:2113–2118.
[Abstract/Free Full Text] - Liao BY, Zhang J. Low rates of expression profile divergence in highly expressed genes and tissue-specific genes during mammalian evolution. Mol. Biol. Evol (2006) 23:1119–1128.
[Abstract/Free Full Text] - Xing Y, Ouyang Z, Kapur K, Scott MP, Wong WH. Assessing the conservation of mammalian gene expression using high-density exon arrays. Mol. Biol. Evol (2007) 24:1283–1285.
[Abstract/Free Full Text] - Maglott D, Ostell J, Pruitt KD, Tatusova T. Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res (2005) 35:D26–D31.
- Pruitt KD, Tatusova T, Maglott DR. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res (2007) 35:D61–D65.
[Abstract/Free Full Text] - Larkin JE, Frank BC, Gavras H, Sultana R, Quackenbush J. Independence and reproducibility across microarray platforms. Nat. Methods (2005) 2:337–344.[CrossRef][Web of Science][Medline]
- Irizarry RA, Warren D, Spencer F, Kim IF, Biswal S, Frank BC, Gabrielson E, Garcia JG, Geoghegan J, et al. Multiple-laboratory comparison of microarray platforms. Nat. Methods (2005) 2:345–350.[CrossRef][Web of Science][Medline]
- Quackenbush J, Irizarry RA. Response to Shields: MIAME, we have a problem. Trends Genet (2006) 22:471–472.[CrossRef][Web of Science][Medline]
- Margulies EH, Kardia SL, Innis JW. Identification and prevention of a GC content bias in SAGE libraries. Nucleic Acids Res (2001) 29:E60.[CrossRef][Medline]
- Barreau C, Paillard L, Osborne HB. AU-rich elements and associated factors: are there unifying principles? Nucleic Acids Res (2006) 33:7138–7150.
[Abstract/Free Full Text] - Affymetrix inc. GeneChip Expression Analysis. Data Analysis Fundamentals. (2004).
- Schug J, Schuller WP, Kappen C, Salbaum JM, Bucan M, Stoeckert CJ Jr. Promoter features related to tissue specificity as measured by Shannon entropy. Genome Biol (2005) 6:R33.[CrossRef][Medline]
- Vinogradov AE. Dualism of gene GC content and CpG pattern in regard to expression in the human genome: magnitude versus breadth. Trends Genet (2005) 21:639–643.[CrossRef][Web of Science][Medline]
- Yanai I, Benjamin H, Shmoish M, Chalifa-Caspi V, Shklar M, Ophir R, Bar-Even A, Horn-Saban S, Safran M, et al. Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics (2005) 21:650–659.
- Yamashita R, Suzuki Y, Wakaguri H, Tsuritani K, Nakai K, Sugano S. DBTSS: DataBase of Human Transcription Start Sites, progress report 2006. Nucleic Acids Res (2006) 34:D86–D89.
- Vinogradov AE. "Genome design" model: evidence from conserved intronic sequence in human-mouse comparison. Genome Res (2006) 16:347–354.
- Pearson WR. Flexible similarity searching with the FASTA3 program package. In: Bioinformatics Methods and Protocols—Misener S, Krawetz SA, eds. (1999) Totowa, NJ: Humana Press. 185–219.
- Anatskaya OV, Vinogradov AE. Heart and liver as developmental bottlenecks of mammal design: evidence from cell polyploidization. Biol. J. Linn. Soc (2004) 83:175–186.
- The Gene Ontology Consortium. The Gene Ontology (GO) project in 2006. Nucleic Acids Res (2006) 34:D322–D326.
- Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, Katayama T, Araki M, Hirakawa M. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res (2006) 34:D354–D357.
- Vastrik I, D'Eustachio P, Schmidt E, Joshi-Tope G, Gopinath G, Croft D, de Bono B, Gillespie M, Jassal B, et al. Reactome: a knowledgebase of biological pathways and processes. Genome Biol (2007) 8:R39.
- Romero P, Wagg J, Green ML, Kaiser D, Krummenacker M, Karp PD. Computational prediction of human metabolic pathways from the complete human genome. Genome Biol (2005) 6:R2.
- Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA (2003) 100:9440–9445.
- Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P, Agarwala R, Ainscough R, Alexandersson M, et al. Initial sequencing and comparative analysis of the mouse genome. Nature (2002) 420:520–562.
- Vinogradov AE. Genome design model and multicellular complexity: golden middle. Nucleic Acids Res (2006) 34:5906–5914.
- Eknoyan G, Lameire N, Barsoum R, Eckardt KU, Levin A, Levin N, Locatelli F, MacLeod A, Vanholder R, et al. The burden of kidney disease: improving global outcomes. Kidney Int (2004) 66:1310–1314.
- Ache BW, Young JM. Olfaction: diverse species, conserved principles. Neuron (2005) 48:417–430.
- Elliott DJ, Grellscheid SN. Alternative RNA splicing regulation in the testis. Reproduction (2006) 132:811–819.
- Hulbert AJ, Else PL. Basal metabolic rate: history, composition, regulation, and usefulness. Physiol. Biochem. Zool (2004) 77:869–876.
- Gebauer F, Hentze MW. Molecular mechanisms of translational control. Nat. Rev. Mol. Cell Biol (2004) 5:827–835.
- Kindler S, Wang H, Richter D, Tiedge H. RNA transport and local control of translation. Annu. Rev. Cell Dev. Biol (2005) 21:223–245.
- Richter JD, Sonenberg N. Regulation of cap-dependent translation by eIF4E inhibitory proteins. Nature (2005) 433:477–480.
- Mignone F, Gissi C, Liuni S, Pesole G. Untranslated regions of mRNAs. Genome Biol (2002) 3:REVIEWS0004.
- Iacono M, Mignone F, Pesole G. uAUG and uORFs in human and rodent 5'untranslated mRNAs. Gene (2005) 349:97–105.
- Churbanov A, Rogozin IB, Babenko VN, Ali H, Koonin EV. Evolutionary conservation suggests a regulatory function of AUG triplets in 5'-UTRs of eukaryotic genes. Nucleic Acids Res (2005) 33:5512–5520.
- Chamary JV, Parmley JL, Hurst LD. Hearing silence: non-neutral evolution at synonymous sites in mammals. Nat. Rev. Genet (2006) 7:98–108.
- Nackley AG, Shabalina SA, Tchivileva IE, Satterfield K, Korchynskyi O, Makarov SS, Maixner W, Diatchenko L. Human catechol-O-methyltransferase haplotypes modulate protein expression by altering mRNA secondary structure. Science (2006) 314:1930–1933.
- Kimchi-Sarfaty C, Oh JM, Kim IW, Sauna ZE, Calcagno AM, Ambudkar SV, Gottesman MM. A "silent" polymorphism in the MDR1 gene changes substrate specificity. Science (2007) 315:525–528.
- Bird AP. Gene number, noise reduction and biological complexity. Trends Genet (1995) 11:94–100.
- Hurst LD. Evolutionary genetics. The silence of the genes. Curr. Biol (1995) 5:459–461.
- Benhamouche S, Decaens T, Godard C, Chambrey R, Rickman DS, Moinard C, Vasseur-Cognet M, Kuo CJ, Kahn A, et al. Apc tumor suppressor gene is the zonation-keeper of mouse liver. Dev. Cell (2006) 10:759–770.
- Braeuning A, Ittrich C, Kohle C, Hailfinger S, Bonin M, Buchmann A, Schwarz M. Differential gene expression in periportal and perivenous mouse hepatocytes. FEBS J (2006) 273:5051–5061.
- Rangarajan A, Weinberg RA. Comparative biology of mouse versus human cells: modelling human cancer in mice. Nat. Rev. Cancer (2003) 3:952–959.
- Rangarajan A, Hong SJ, Gifford A, Weinberg RA. Species- and cell type-specific requirements for cellular transformation. Cancer Cell (2004) 6:171–183.
- Rubin H. The disparity between human cell senescence in vitro and lifelong replication in vivo. Nat. Biotechnol (2002) 20:675–681.
