Nucleic Acids Research
Improved microbial gene identification with GLIMMER
Received July 7, 1999; Revised and Accepted October 17, 1999
ABSTRACT
The GLIMMER system for microbial gene identification finds ~97-98% of all genes in a genome when compared with published annotation. This paper reports on two new results: (i) significant technical improvements to GLIMMER that improve its accuracy still further, and (ii) a comprehensive evaluation that demonstrates that the accuracy of the system is likely to be higher than previously recognized. A significant proportion of the genes missed by the system appear to be hypothetical proteins whose existence is only supported by the predictions of other programs. When the analysis is restricted to genes that have significant homology to genes in other organisms, GLIMMER misses <1% of known genes.
INTRODUCTION
Accurate microbial gene identification is becoming ever more important with the increasing rate of whole genome sequencing projects. In the past year alone, eight new bacterial and archaeal genomes have appeared, and the pace continues to accelerate. Each new genome contains thousands of new genes, all of which are deposited into public databases. These genes then become the basis for much further research into the biology of these organisms, and their sequences are used for further biological study. For work such as microarray analysis, in which specific sequences are arrayed onto a substrate and used as probes to measure expression levels, the accuracy of gene predictions is critical. The same point can be made about knockout experiments, which are an important tool to use in determining the function of the large numbers of genes whose function is unknown at the time of publication. Such hypothetical proteins typically comprise 30-40% of the genes in a newly sequenced genome.
GLIMMER 1.0 is a computational gene finder that finds 97-98% of all genes in a prokaryotic genome without any human intervention (1). The system can be quickly and easily trained using only the genome sequence of interest. The technical underpinning of the system is an interpolated Markov model (IMM), a generalization of Markov chain methods. GLIMMER 1.0 has been used as the gene finder for Borrelia burgdorferi (2), Treponema pallidum (3), Chlamydia trachomatis (4) and Thermotoga maritima (5), and the software is in use at over 100 laboratories and institutes. Below we describe the algorithm and performance results of GLIMMER 2.0, a gene finder that incorporates several technical improvements to the GLIMMER 1.0 algorithm. As a result of these improvements, GLIMMER 2.0 has slightly higher sensitivity than GLIMMER 1.0 and is much better at resolving overlapping gene calls. The latter property is especially useful for genomes such as Deinococcus radiodurans, which, owing to their high GC-content, have numerous long open reading frames (ORFs) that can easily lead to predictions of genes whose boundaries overlap incorrectly.
METHODS AND ALGORITHMS
We begin by briefly reviewing Markov models in the context of DNA sequence analysis. We then describe the probabilistic model used in GLIMMER 2.0 to identify regions that are likely to be genes. We then describe how GLIMMER 2.0 resolves conflicts when overlapping genes are predicted. The complete GLIMMER 2.0 system is available from The Institute for Genomic Research at http://www.tigr.org/softlab
Markov Models
A Markov chain is a sequence of random variables $X_i$, where the probability distribution for each $X_i$ depends only on the preceding k variables $X_{i-1}, \ldots, X_{i-k}$, for some constant k. For DNA sequence analysis, a Markov chain models the probability of a given base b as depending only on the k bases immediately prior to b in the sequence. We refer to these preceding k bases as the context of base b in the sequence. The most common type of Markov chain is a fixed-order chain, in which the entire k-base context is used at every position. For example, a fixed 5th-order Markov chain model of DNA sequences comprises $4^5 = 1024$ probability distributions, one for each possible 5mer context. Such fixed 5th-order models have proven effective at gene prediction in bacterial genomes (6,7).
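As a rough illustration of a fixed-order chain, the sketch below estimates P(base | preceding k bases) from counts and scores a sequence by summing log-probabilities. The function names, the add-one pseudocount and the uniform fallback for unseen contexts are assumptions made for this example only; they are not taken from GLIMMER. Input sequences are assumed to contain only A, C, G and T.

```python
import math
from collections import defaultdict

BASES = "ACGT"

def train_fixed_order(sequences, k, pseudocount=1.0):
    """Estimate P(base | preceding k bases) from training sequences."""
    counts = defaultdict(lambda: dict.fromkeys(BASES, 0))
    for seq in sequences:
        for i in range(k, len(seq)):
            context, base = seq[i - k:i], seq[i]
            counts[context][base] += 1

    def prob(context, base):
        # Unseen contexts have zero counts, so the pseudocounts alone
        # give a uniform distribution (an assumption of this sketch).
        c = counts.get(context, dict.fromkeys(BASES, 0))
        total = sum(c.values()) + 4 * pseudocount
        return (c[base] + pseudocount) / total

    return prob

def log_score(seq, k, prob):
    """Sum of log-probabilities of each base given its k-base context."""
    return sum(math.log(prob(seq[i - k:i], seq[i]))
               for i in range(k, len(seq)))
```

With k = 5 there are 4**5 = 1024 possible contexts, so the training data must cover each of them often enough for the counts to be meaningful.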
Ideally, larger values for k are always preferable. Unfortunately, because the training data available for building models is limited, we must limit k. In most collections of DNA coding sequences, however, there is substantial variability in the frequency of occurrence of different kmers.
IMMs are a generalization of fixed-order Markov chains that combine contexts of different lengths to compute the probability of base b. Our formulation allows each context to have a weight based in part on its frequency; this allows the IMM to be sensitive to how common a particular oligomer is in a given genome. In particular, rare kmers should not be used for prediction; the IMM will ignore these in favor of shorter Markov chains. On the other hand, some long kmers may occur very frequently, and for those the IMM can give the longer context more weight and make a better prediction. These weights define an interpolated probability distribution that incorporates information from multiple Markov chains. An IMM can emulate a fixed kth-order chain simply by setting all weights to zero except for those associated with k.
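The weighting idea can be sketched as a chain of pairwise weighted sums, starting from the shortest context. This is only an illustration: the recursive form and the weight values are assumptions for the example, and the way GLIMMER actually derives its weights is described in reference (1), not here.

```python
def interpolate(probs_by_order, weights_by_order):
    """Combine per-order estimates of P(base | context), shortest context first.

    probs_by_order[k]   -- estimate from the k-th order chain
    weights_by_order[k] -- weight in [0, 1] given to order k
                           (hypothetical values; index 0 is unused)
    """
    estimate = probs_by_order[0]  # the 0th-order estimate needs no weight
    for k in range(1, len(probs_by_order)):
        w = weights_by_order[k]
        estimate = w * probs_by_order[k] + (1.0 - w) * estimate
    return estimate
```

For instance, `interpolate([0.25, 0.4, 0.9], [None, 1.0, 0.0])` gives the 2nd-order estimate no weight, as would be appropriate for a rare context, and so falls back entirely on the 1st-order estimate.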
Details of how to construct an IMM for sequence data have been described previously (1). For coding regions, GLIMMER 1.0 builds three separate IMMs, one for each codon position. [This is known as a 3-periodic Markov model (6).] These IMMs include 0-8th order Markov chains, as well as weights computed for every oligomer of eight bases or less that appears in the training data. These weights and Markov models are interpolated to produce a score for each base in any potential coding sequence. The logs of these scores are summed to score each coding region.
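The 3-periodic scoring described above can be sketched as follows. The three models are passed in as hypothetical callables `model(context, base)` returning a probability; the sequence is assumed to start at the first codon position, and the context is simply truncated near the start of the sequence. None of these names come from the GLIMMER source.

```python
import math

def score_coding_region(seq, models, context_len):
    """Sum log-probabilities, using models[i % 3] for the base at offset i."""
    total = 0.0
    for i, base in enumerate(seq):
        # Up to context_len bases immediately preceding position i.
        context = seq[max(0, i - context_len):i]
        total += math.log(models[i % 3](context, base))
    return total
```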
The interpolated context model
Interpolated context models (ICMs) are a further extension of IMMs. For a given context $C = b_1 b_2 \ldots b_k$ of length k, the IMM in GLIMMER 1.0 computes a probability distribution for $b_{k+1}$ using as many of the bases immediately preceding $b_{k+1}$ as the training data set allows. The ICM is more flexible and can select any of the bases in C (not just those adjacent to $b_{k+1}$) to determine the probability of $b_{k+1}$. In general, from a given context, the ICM will choose approximately the same number of bases as the IMM. Our motivation for choosing bases other than those at the end of the context is the fact that in coding regions the significance of a given base depends strongly on its position in a codon; e.g. the nucleotide in the third codon position is sometimes irrelevant to the amino acid translation.
The criterion employed by the ICM to select which bases of a context C to use is mutual information. The mutual information between a given pair of discrete random variables X and Y is defined to be:
$$I(X; Y) = \sum_i \sum_j P(x_i, y_j) \log \frac{P(x_i, y_j)}{P(x_i)\,P(y_j)}$$
where $x_i$ and $y_j$ are the values taken by random variables X and Y respectively, and $P(x_i, y_j)$ is the joint probability of $x_i$ and $y_j$ together.
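In practice the probabilities are estimated from observed counts. A minimal sketch, assuming the observations are supplied as a list of (x, y) pairs and using base-2 logarithms (the choice of base is an assumption here; it only rescales the values and does not change which position has the maximum):

```python
import math
from collections import Counter

def mutual_information(pairs):
    """I(X; Y) in bits, estimated from a list of observed (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi
```

Two perfectly correlated binary variables give 1 bit, and two independent ones give 0.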
To construct an ICM with context length k from a training set T of DNA sequences, we begin by considering all windows (i.e. oligomers) of length k+1 that occur in T. We let random variable $X_1$ be the distribution of bases in the first position of those windows; $X_2$ be the distribution of bases in the second position; and so on through $X_{k+1}$. We then calculate the mutual information values $I(X_1; X_{k+1}), I(X_2; X_{k+1}), \ldots, I(X_k; X_{k+1})$, and choose the maximum. Suppose that maximum is $I(X_j; X_{k+1})$. We then partition our set of windows into four subsets based on the nucleotide that occurs in position j in the window.
The same procedure can now be performed again for each of the four sets of windows. Within each set, the position that has the highest mutual information with the base at position k+1 is chosen. The four nucleotide values at that position induce a further partitioning of the current set of windows into four subsets.
This process can be viewed as constructing a tree of positions within context strings. A sample portion of such a tree is shown in Figure 1. The construction is terminated when the tree depth reaches a predetermined limit, or when the size of a set of windows becomes too small to be useful to estimate the probability of the last base position.
Figure 1. Sample ICM decomposition tree. The root position 12 has maximum mutual information with the final base position 13. Each child of the root represents the subset of windows with the indicated nucleotide value at position 12, and indicates the maximum mutual information position for that subset. Each node is similarly decomposed into children. Note that children of a single node may represent different base positions.
Each node in the ICM decomposition tree represents a set of windows that provide a probability distribution for the final base position. The root node, which includes all possible windows, represents a 0th-order Markov model. All other nodes give a probability distribution for the final base position, conditional on a specific set of bases occurring at the positions indicated on the path to the root from that node.
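The tree construction can be sketched recursively. The dictionary-based node layout, the parameter names `max_depth` and `min_windows`, and the rule that a position already used on the path to the root is not reconsidered are all assumptions of this example. Positions are 0-indexed, and the list of windows is assumed to be non-empty.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """I(X; Y) in bits from a list of observed (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in joint.items())

def build_icm(windows, max_depth, min_windows, used=()):
    """Recursively build a decomposition tree over (k+1)-base windows."""
    # Each node stores the counts of the final base over its set of windows.
    node = {"counts": Counter(w[-1] for w in windows),
            "position": None, "children": {}}
    k = len(windows[0]) - 1
    candidates = [j for j in range(k) if j not in used]
    if len(used) >= max_depth or len(windows) < min_windows or not candidates:
        return node
    # Choose the context position with maximum mutual information
    # with the final base.
    best = max(candidates, key=lambda j: mutual_information(
        [(w[j], w[-1]) for w in windows]))
    node["position"] = best
    # Partition the windows by the nucleotide at the chosen position.
    for base in "ACGT":
        subset = [w for w in windows if w[best] == base]
        if subset:
            node["children"][base] = build_icm(
                subset, max_depth, min_windows, used + (best,))
    return node
```

The counts stored at the root correspond to the 0th-order distribution, and the counts at any other node give the distribution of the final base conditional on the bases fixed along the path from the root.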
Note that the IMM used in GLIMMER 1.0 is a special case of this ICM, namely the case where the base chosen at each level of the tree is the last available base in the context window. Thus, when the nearest positions to base $b_{k+1}$ provide the strongest evidence for its value, the ICM automatically chooses them and the result is identical to the IMM. But when other bases provide stronger evidence, as is often the case, the ICM will choose them instead.
The interpolation mechanism used in the ICM is identical to that used in GLIMMER 1.0. It takes a weighted sum of two probability distributions, where the weights are determined by the number of training instances used to construct the distribution and its statistical significance as measured by a $\chi^2$ test. The only difference is that the ICM interpolation is naturally viewed as interpolating between the distributions at a parent and child node in the tree, while the IMM interpolation is always between distributions obtained using different numbers of bases at the end of the context window.
The interpolated context model presented here is essentially a probabilistic decision tree, i.e. a sparse probability distribution expressed as a decision tree. The tree construction is identical to constructing classification trees using information gain as the splitting criteria (8). Classification trees associate a class label with each leaf node of the tree. The labels in our case are the four nucleotide values, and our interpolated context model determines a probability distribution for the base to be predicted given the context in which it occurs. Probabilistic decision trees have been designed for other applications (9-11). In computational biology probabilistic decision trees have been used for modeling splice site junctions (12) and exon modeling (13).
Resolving overlapping genes
In developing GLIMMER 2.0, a conscious effort was made to reduce the number of false negative gene predictions at the expense of a slight increase in the number of false positive predictions. Upon close examination of GLIMMER 1.0's output, we learned that occasionally a gene was discarded because its start codon was positioned too far in the 5′ direction, resulting in substantial overlap with another gene. GLIMMER 2.0 solves this problem by incorporating additional rules to resolve such overlaps.
In GLIMMER 1.0, when two potential genes A and B overlap, the overlap region is scored. If A is longer than B, and if A scores higher on the overlap region, and if moving B's start site will not resolve the overlap, then B is rejected.
In GLIMMER 2.0, when potential genes A and B overlap, the overlap region is scored just as in GLIMMER 1.0. The system then attempts to move the locations of the start codons much more aggressively, as follows. Suppose gene A scores higher on the overlap region. Four different orientations are then considered:
[Diagram: first orientation of overlapping genes A and B]
In this case, postponing the start site of either A or B does not remove the overlap. If A is significantly longer than B (as determined by a program parameter), then B is rejected. Otherwise, both A and B are called genes, with an annotation that there was a doubtful overlap.
[Diagram: second orientation of overlapping genes A and B]
Only moving the start of B can resolve the overlap. If it can be moved, then it is. If not, and if B is significantly shorter than A, then B is rejected. Otherwise, both are listed as genes, with a note indicating the overlap. Moving a start codon works as follows: the system shortens the predicted gene by shifting the start location to the next available start codon. If this does not resolve the overlap, it moves the start codon again. This process continues as long as the resulting gene is longer than the minimum gene length (an easily adjustable parameter).
[Diagram: third orientation of overlapping genes A and B]
Only moving the start of A can resolve the overlap. Since A scores higher, we only try to move it if the overlap is a relatively small fraction of A's length. If adjusting A is not successful, B is rejected.
[Diagram: fourth orientation of overlapping genes A and B]
Both starts can move. We first move the start of B until the overlap region scores higher for B. Then we move the start of A until it scores higher. Then B again, and so on, until either the overlap is eliminated or no further moves can be made.
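The start-moving step described above (shift to the next available start codon, repeat while the gene stays longer than the minimum length) might be sketched as follows for a forward-strand gene. The coordinate convention (0-based start, exclusive stop), the parameter names and the set of start codons are assumptions of this example rather than details given in the text.

```python
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed set of start codons

def move_start(seq, start, stop, overlap_end, min_gene_len):
    """Shift a forward-strand gene's start downstream, codon by codon.

    Returns the first in-frame start codon position at or beyond
    overlap_end that leaves the gene longer than min_gene_len,
    or None if no such position exists.
    """
    pos = start + 3
    while stop - pos > min_gene_len:
        if seq[pos:pos + 3] in START_CODONS and pos >= overlap_end:
            return pos
        pos += 3
    return None
```

When `None` is returned, the overlap cannot be resolved by moving this start, and the length-based rejection rules for the relevant orientation apply instead.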
An additional step is taken by GLIMMER 2.0 to help find genes that previously were missed because the score from the independent probability model was too high. The independent probability model is used by both versions of the system to compete against the IMMs used to score all six reading frames; its purpose is to serve as a model of non-coding DNA. In order to be called a gene, an ORF must score higher than the independent model as well as the other five reading frames. Genes that were missed due to high scores from this independent model will fall in between the genes predicted by GLIMMER 1.0. For a target ORF in such regions, GLIMMER 2.0 considers the scores on subsequences of that ORF as compared to other overlapping ORFs. If these subsequences receive sufficiently high scores, and if the ORF scores relatively high in relation to the independent model (even though it did not exceed the normal score threshold to be called a gene), then it is added to the list of prospective genes.
The process of evaluating overlaps in GLIMMER 2.0 is performed in an iterative fashion in order to avoid rejecting genes unnecessarily. For example, in the case where ORF A causes ORF B to be rejected, and B in turn causes C to be rejected, we wish to reject only B and not both B and C. Thus, we perform the rejection phase in multiple stages, first discarding B and then checking again for overlaps.
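One way to realize such a staged rejection is sketched below. The predicate `rejects(a, b)` is a hypothetical stand-in for the overlap rules, and the specific staging rule used here (discard an ORF only when it is rejected by an ORF that is not itself currently marked for rejection) is an assumption, not the documented GLIMMER procedure.

```python
def resolve_rejections(orfs, rejects):
    """Return the set of ORF ids that survive staged rejection.

    orfs    -- iterable of hashable ORF identifiers
    rejects -- function (a, b) -> True if a causes b to be rejected
    """
    alive = set(orfs)
    while True:
        # ORFs that some surviving ORF would reject in this stage.
        doomed = {b for b in alive
                  if any(rejects(a, b) for a in alive if a != b)}
        # Discard only those rejected by an ORF that is not itself doomed.
        confirmed = {b for b in doomed
                     if any(rejects(a, b) for a in alive - doomed)}
        if not confirmed:
            return alive
        alive -= confirmed
```

With A rejecting B and B rejecting C, the first stage discards only B, and the second stage finds no remaining conflict, so A and C are both kept. Note that in this sketch two ORFs that reject each other are both retained.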
COMPUTATIONAL METHODS
We analyzed 10 completed microbial genomes: Haemophilus influenzae (14), Mycoplasma genitalium (15), Methanococcus jannaschii (16), Helicobacter pylori (17), Escherichia coli (18), Bacillus subtilis (19), Archaeoglobus fulgidus (20), B.burgdorferi (2), T.pallidum (3) and T.maritima (5). On each of the genomes, we ran both GLIMMER 1.0 and GLIMMER 2.0. All parameters were the defaults, although adjusting these default settings will improve performance on selected genomes. The training data was identical in every case in order to ensure a fair comparison.
The method of training was as follows: using only the genome itself as input, we extracted all ORFs longer than 500 bp from each genome. From these long ORFs, only those that did not overlap other long ORFs were retained; this produces a set of ORFs that are highly likely to be coding. (The programs to perform this extraction are included in the GLIMMER package; total runtime is <1 min on a standard desktop PC.) For all genomes in this study, this set contains more than enough data to train the system accurately.
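The selection of the training set can be sketched as a simple filter over ORF coordinates. This is not the extraction program shipped with GLIMMER; the function name, the half-open coordinate convention and the quadratic overlap check are choices made for this example, and finding the ORFs themselves in all six reading frames is assumed to have been done already.

```python
def select_training_orfs(orfs, min_len=500):
    """Keep ORFs longer than min_len that overlap no other such ORF.

    orfs -- list of (start, end) pairs in half-open coordinates
    """
    long_orfs = sorted(o for o in orfs if o[1] - o[0] > min_len)
    keep = []
    for i, (s, e) in enumerate(long_orfs):
        # Two half-open intervals overlap when each starts before
        # the other ends.
        overlaps = any(s < e2 and s2 < e
                       for j, (s2, e2) in enumerate(long_orfs) if j != i)
        if not overlaps:
            keep.append((s, e))
    return keep
```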
Next, the IMM training was conducted using the original GLIMMER 1.0 program and the new, tree-structured ICMs for GLIMMER 2.0. These models were then used to identify genes in the complete genome. For all genomes, ranging in size from 0.5 to 4.7 Mb, training GLIMMER 1.0 or GLIMMER 2.0 takes <1 min on a Pentium 400 PC running the Linux operating system. The gene finding step takes an additional 1 min or less.
The results of the comparison are summarized in Tables 1-4. In all 10 genomes, there are only 12 confirmed annotated genes that GLIMMER 1.0 found that GLIMMER 2.0 did not. In all these results, we have not discounted gene predictions that fall into known ribosomal RNA or tRNA regions. Since such regions are easy to identify independently of GLIMMER, this step should be a routine part of any annotation process.
Table 1. A comparison of the number of genes correctly found by GLIMMER 1.0 and GLIMMER 2.0 for 10 complete genomes
A second set of experiments was designed to find the true accuracy of GLIMMER. In the original study (1), GLIMMER 1.0's gene calls were compared to the published annotation for several completed genomes. The results of this study showed that GLIMMER 1.0 was able to find 97-98% of annotated genes fully automatically, using neither database searches nor human intervention; however, published annotation is not 100% accurate. Therefore the question remains open as to how accurate these predictions really are. This second experiment is an attempt to answer that question more precisely.
In order to measure accuracy more precisely, we extracted a subset of genes from the published annotation for each genome. These subsets include only those genes that have significant homology to known proteins, as indicated in the published annotation. Many of these genes have a functional assignment, but some are homologous to other genes of unknown function (these are sometimes annotated as 'conserved hypothetical' proteins). We included the latter in the experiment because the existence of homology itself is very strong evidence that the sequence encodes a protein. Except for the use of only a subset of annotated genes, all other details of the experiments were the same as for Table 1. The results of this second comparison are summarized in Table 2.
Table 2. The number of genes with database matches found by GLIMMER 1.0 and GLIMMER 2.0 for 10 complete genomes
Database matches include genes that match genes with unknown function, known as 'conserved hypotheticals', as well as genes whose function is known. (Thanks to Alain Viari for testing GLIMMER on B.subtilis. The 1249 genes listed in the third column for B.subtilis were selected according to an even stricter criterion than having a database match; these are the genes that already had been documented in the literature prior to the completion of the B.subtilis genome project.)
The results make it clear that GLIMMER is more accurate on genes confirmed by sequence homology than it is on the remaining genes. For GLIMMER 1.0, sensitivity ranges from 98.4 to 99.7%, with an average of 99.1%. For GLIMMER 2.0, the range is 98.6-99.8%, with an average of 99.3%. In contrast, GLIMMER 1.0's average accuracy on the complete set of annotated genes for all 10 genomes is 98.1%, and GLIMMER 2.0's average on those genes is 98.6%.
Table 3 contains a summary of how the 'confirmed' (or conserved) genes differ from the hypothetical genes in the 10 genomes used in this study. On average, the hypothetical genes are considerably shorter and have ~2% lower GC-content. These data are consistent with the hypothesis that these hypothetical genes contain a significant number of non-coding regions that were mistakenly annotated as coding. (For example, the presence of stop codons alone lowers the average GC-content of non-coding regions.) Most hypothetical gene annotations are based primarily on the predictions of computational systems. The fact that GLIMMER is more accurate on conserved genes suggests that the hypothetical predicted genes missed by GLIMMER are the result of simple disagreement between two computational gene finders.
Table 3. Differences between the length and GC-content of genes that are conserved in other organisms versus 'hypothetical' genes
The disproportionately small number of conserved genes for B.subtilis reflects the fact that this set includes only those genes that were identified experimentally prior to the completion of the genome sequence.
In each of the 10 genomes, GLIMMER 2.0 found more conserved genes than GLIMMER 1.0. Usually the number was very small, only 1-5 genes for eight of the genomes. However, the set of conserved genes found by GLIMMER 2.0 was not a strict superset of those found by GLIMMER 1.0. We intersected the two sets and compared them in order to identify which genes were found by both systems and which were found exclusively by one or the other. These results are shown in Table 4. As the table shows, for each genome there are 0-4 genes found by GLIMMER 1.0 and missed by GLIMMER 2.0. There are three genomes, M.genitalium, M.jannaschii and A.fulgidus, in which all conserved genes found by GLIMMER 1.0 are found also by GLIMMER 2.0. Typically, genes found by GLIMMER 1.0 but not found by GLIMMER 2.0 are relatively short and score just below the minimum scoring threshold. For example, in B.burgdorferi the gene found by GLIMMER 1.0 and not by GLIMMER 2.0 is a 74-amino-acid ribosomal protein S14 (BB0491). The GLIMMER 2.0 score for this gene was 88, just below the default threshold value of 90. Such genes could be included in GLIMMER 2.0's predictions with suitable parameter adjustments, although at a cost of additional false-positive predictions.
Table 4. Numbers of genes confirmed by database matches found exclusively by GLIMMER 1.0, by GLIMMER 2.0, and by both systems
The columns labeled 'Additional' show how many additional genes are uniquely predicted by each of the two systems respectively. Thus for H.influenzae, GLIMMER 1.0 predicts 49 genes that GLIMMER 2.0 does not, one of which has database homology. Likewise, GLIMMER 2.0 predicts 62 genes that GLIMMER 1.0 does not, two of which have database matches. They agree on 1494 (out of 1501) gene predictions with database homology.
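As a worked example, the H.influenzae figures quoted above can be turned into per-system sensitivities on the genes with database matches. The calculation assumes that each system's total is the shared count plus its own exclusive confirmed genes; the variable names are illustrative.

```python
def sensitivity(found, total):
    """Percentage of the annotated genes in a set that were found."""
    return 100.0 * found / total

both, only_v1, only_v2, total = 1494, 1, 2, 1501

glimmer1 = sensitivity(both + only_v1, total)  # 1495 of 1501
glimmer2 = sensitivity(both + only_v2, total)  # 1496 of 1501
print(f"GLIMMER 1.0: {glimmer1:.2f}%  GLIMMER 2.0: {glimmer2:.2f}%")
```

Both values fall inside the sensitivity ranges reported for the 10 genomes.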
In order to demonstrate that GLIMMER 2.0 has a higher sensitivity than alternative gene-finding methods, we analyzed a recently sequenced genome, Mycobacterium tuberculosis strain H37Rv (21), for which GLIMMER 2.0 was not among the computational methods used for annotation. Table 5 summarizes the genes that were found by GLIMMER 2.0 but missed in the original annotation, and that have detectable homology to a coding region from another organism. For each of the 13 genes identified, the table lists the function and identifier of the best hit found by a BLAST search. Eleven of the genes occur in intergenic regions in the published annotation of the complete genome, and the remaining two (those whose closest homologs are P17996 and Q02541) have relatively small overlaps with coding sequences annotated as hypothetical. GLIMMER 1.0 finds 11 of these 13 genes, missing those homologous to P17996 and Q02541.
Table 5. Genes in M.tuberculosis found automatically by GLIMMER 2.0 with homology to protein sequences from other organisms
All but two (homologous to P15026 and Q02541) of the listed genes are intergenic with respect to the currently published annotation for M.tuberculosis. The first three columns list the location of the predicted start and stop codons and the length in base pairs; if Start > Stop then the coding sequence is on the reverse strand. The last three columns give the GenBank accession number, the function of the top hit found by BLAST (23), and the E-value given by BLAST for that hit. (The E-value is the number of homologous sequences expected by chance.)
It is worth noting too that the false-positive rate appears to be higher for GLIMMER 2.0, as reflected in the fact that the number of additional genes (not confirmed by database matches) predicted by GLIMMER 2.0 is higher in nine of the 10 genomes. Because of its revised rules to resolve overlapping ORFs, GLIMMER 2.0 generally makes more gene predictions than GLIMMER 1.0 when all parameters are set identically as in the above-described results. To verify that the additional annotated matches found by GLIMMER 2.0 are not attributable merely to the greater number of predictions, we compared the two systems with GLIMMER 1.0's parameters set so that the total additional gene predictions for all 10 genomes matched GLIMMER 2.0. Specifically, we raised the overlap-length parameter, which is the maximum number of DNA bases by which two ORFs can overlap and both still be predicted as genes. The results are shown in Table 6. With this adjustment GLIMMER 2.0 still finds 99 more annotated genes than GLIMMER 1.0, indicating that its predictions are in fact more accurate than GLIMMER 1.0. The parameters of either system can be adjusted to reduce the number of additional genes, at the cost of missing some true genes.
Table 6. GLIMMER 1.0 accuracy versus GLIMMER 2.0 accuracy with overlap-length parameter of GLIMMER 1.0 raised to 51
The value 51 was chosen to make the total number of additional genes found by GLIMMER 1.0 as close as possible to the corresponding number for GLIMMER 2.0. GLIMMER 2.0 still finds significantly more annotated genes than GLIMMER 1.0.
CONCLUSION
In this paper we have described several technical improvements made in the GLIMMER 2.0 gene-finding system and argued that the system is more accurate than previously recognized. GLIMMER 2.0 also can be an effective gene finder for eukaryotic genomes, especially those with a high gene density as is found in some parasites. For example, it is being used as the main gene finder for the parasite Trypanosoma brucei, the agent that causes African sleeping sickness, which currently is being sequenced at The Institute for Genomic Research. This parasite has few or no introns and a gene density estimated at 50%. The IMM scoring method in GLIMMER 1.0 has also been used to create a eukaryotic gene finder, GLIMMERM, that has been quite successful in finding genes in the genome of Plasmodium falciparum, the malaria parasite (22).
ACKNOWLEDGEMENTS
A.L.D. was supported by NSF Grant IIS-9820497. S.K. was supported by NSF Grant KDI-9980088. O.W. was supported by the Department of Energy Grant No. DE-FC02-95ER61962.A003. S.L.S. was supported by NSF Grant IIS-9902923 and NIH Grants R01 LM06845-01 and K01-HG00022-1. S.L.S. and A.L.D. were supported by NSF Grant IRI-9530462.
REFERENCES
*To whom correspondence should be addressed at: Department of Computer Science, Loyola College in Maryland, Baltimore, MD 21210, USA. Tel: +1 410 617 2740; Fax: +1 410 617 2157; Email: delcher{at}cs.loyola.edu
Copyright © Oxford University Press, 1999.
||||
![]() |
M. T. G. Holden, H. M. B. Seth-Smith, L. C. Crossman, M. Sebaihia, S. D. Bentley, A. M. Cerdeno-Tarraga, N. R. Thomson, N. Bason, M. A. Quail, S. Sharp, et al. The Genome of Burkholderia cenocepacia J2315, an Epidemic Pathogen of Cystic Fibrosis Patients J. Bacteriol., January 1, 2009; 191(1): 261 - 277. [Abstract] [Full Text] [PDF] |
||||
![]() |
V. Kunin, A. Copeland, A. Lapidus, K. Mavromatis, and P. Hugenholtz A Bioinformatician's Guide to Metagenomics Microbiol. Mol. Biol. Rev., December 1, 2008; 72(4): 557 - 578. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Shen, Y. Jiang, Z. Zhou, J. Zhang, Y. Yu, and L. Li Complete nucleotide sequence of pKP96, a 67 850 bp multiresistance plasmid encoding qnrA1, aac(6')-Ib-cr and blaCTX-M-24 from Klebsiella pneumoniae J. Antimicrob. Chemother., December 1, 2008; 62(6): 1252 - 1256. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Oshima, H. Toh, Y. Ogura, H. Sasamoto, H. Morita, S.-H. Park, T. Ooka, S. Iyoda, T. D. Taylor, T. Hayashi, et al. Complete Genome Sequence and Comparative Analysis of the Wild-type Commensal Escherichia coli Strain SE11 Isolated from a Healthy Adult DNA Res, December 1, 2008; 15(6): 375 - 386. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Noguchi, T. Taniguchi, and T. Itoh MetaGeneAnnotator: Detecting Species-Specific Patterns of Ribosomal Binding Site for Precise Gene Prediction in Anonymous Prokaryotic and Phage Genomes DNA Res, December 1, 2008; 15(6): 387 - 396. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. J. G. van de Werken, M. R. A. Verhaart, A. L. VanFossen, K. Willquist, D. L. Lewis, J. D. Nichols, H. P. Goorissen, E. F. Mongodin, K. E. Nelson, E. W. J. van Niel, et al. Hydrogenomics of the Extremely Thermophilic Bacterium Caldicellulosiruptor saccharolyticus Appl. Envir. Microbiol., November 1, 2008; 74(21): 6720 - 6729. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. E. Mattes, A. K. Alexander, P. M. Richardson, A. C. Munk, C. S. Han, P. Stothard, and N. V. Coleman The Genome of Polaromonas sp. Strain JS666: Insights into the Evolution of a Hydrocarbon- and Xenobiotic-Degrading Bacterium, and Features of Relevance to Biotechnology Appl. Envir. Microbiol., October 15, 2008; 74(20): 6405 - 6416. [Abstract] [Full Text] [PDF] |
||||
![]() |
E. A. Welsh, M. Liberton, J. Stockel, T. Loh, T. Elvitigala, C. Wang, A. Wollam, R. S. Fulton, S. W. Clifton, J. M. Jacobs, et al. The genome of Cyanothece 51142, a unicellular diazotrophic cyanobacterium important in the marine nitrogen cycle PNAS, September 30, 2008; 105(39): 15094 - 15099. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Dybvig, C. Zuhua, P. Lao, D. S. Jordan, C. T. French, A.-H. T. Tu, and A. E. Loraine Genome of Mycoplasma arthritidis Infect. Immun., September 1, 2008; 76(9): 4000 - 4008. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. T. Chung, J. S. Yoo, H. B. Oh, Y. S. Lee, S. H. Cha, S. J. Kim, and C. K. Yoo Complete Genome Sequence of Neisseria gonorrhoeae NCCP11945 J. Bacteriol., September 1, 2008; 190(17): 6035 - 6036. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Letek, A. A. Ocampo-Sosa, M. Sanders, U. Fogarty, T. Buckley, D. P. Leadon, P. Gonzalez, M. Scortti, W. G. Meijer, J. Parkhill, et al. Evolution of the Rhodococcus equi vap Pathogenicity Island Seen through Comparison of Host-Associated vapA and vapB Virulence Plasmids J. Bacteriol., September 1, 2008; 190(17): 5797 - 5805. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. Paul, S. Bridges, S. C. Burgess, Y. Dandass, and M. L. Lawrence Genome Sequence of the Chemolithoautotrophic Bacterium Oligotropha carboxidovorans OM5T J. Bacteriol., August 1, 2008; 190(15): 5531 - 5532. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. J. van Zyl, S. M. Deane, L.-A. Louw, and D. E. Rawlings Presence of a Family of Plasmids (29 to 65 Kilobases) with a 26-Kilobase Common Region in Different Strains of the Sulfur-Oxidizing Bacterium Acidithiobacillus caldus Appl. Envir. Microbiol., July 15, 2008; 74(14): 4300 - 4308. [Abstract] [Full Text] [PDF] |
||||
![]() |
J.-F. Dubern, E. R. Coppoolse, W. J. Stiekema, and G. V. Bloemberg Genetic and functional characterization of the gene cluster directing the biosynthesis of putisolvin I and II in Pseudomonas putida strain PCL1445 Microbiology, July 1, 2008; 154(7): 2070 - 2083. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Iacono, L. Villa, D. Fortini, R. Bordoni, F. Imperi, R. J. P. Bonnal, T. Sicheritz-Ponten, G. De Bellis, P. Visca, A. Cassone, et al. Whole-Genome Pyrosequencing of an Epidemic Multidrug-Resistant Acinetobacter baumannii Strain Belonging to the European Clone II Group Antimicrob. Agents Chemother., July 1, 2008; 52(7): 2616 - 2625. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Takarada, M. Sekine, H. Kosugi, Y. Matsuo, T. Fujisawa, S. Omata, E. Kishi, A. Shimizu, N. Tsukatani, S. Tanikawa, et al. Complete Genome Sequence of the Soil Actinomycete Kocuria rhizophila J. Bacteriol., June 15, 2008; 190(12): 4139 - 4146. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Higashide, M. Kuroda, C. T. N. Omura, M. Kumano, S. Ohkawa, S. Ichimura, and T. Ohta Methicillin-Resistant Staphylococcus saprophyticus Isolates Carrying Staphylococcal Cassette Chromosome mec Have Emerged in Urogenital Tract Infections Antimicrob. Agents Chemother., June 1, 2008; 52(6): 2061 - 2068. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. T. T. Tran-Nguyen, M. Kube, B. Schneider, R. Reinhardt, and K. S. Gibb Comparative Genome Analysis of "Candidatus Phytoplasma australiense" (Subgroup tuf-Australia I; rp-A) and "Ca. Phytoplasma asteris" Strains OY-M and AY-WB J. Bacteriol., June 1, 2008; 190(11): 3979 - 3991. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Morita, H. Toh, S. Fukuda, H. Horikawa, K. Oshima, T. Suzuki, M. Murakami, S. Hisamatsu, Y. Kato, T. Takizawa, et al. Comparative Genome Analysis of Lactobacillus reuteri and Lactobacillus fermentum Reveal a Genomic Island for Reuterin and Cobalamin Production DNA Res, June 1, 2008; 15(3): 151 - 161. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. P. Stinear, T. Seemann, P. F. Harrison, G. A. Jenkin, J. K. Davies, P. D.R. Johnson, Z. Abdellah, C. Arrowsmith, T. Chillingworth, C. Churcher, et al. Insights from the complete genome sequence of Mycobacterium marinum on the evolution of Mycobacterium tuberculosis Genome Res., May 1, 2008; 18(5): 729 - 741. [Abstract] [Full Text] [PDF] |
||||
![]() |
I. Anderson, J. Rodriguez, D. Susanti, I. Porat, C. Reich, L. E. Ulrich, J. G. Elkins, K. Mavromatis, A. Lykidis, E. Kim, et al. Genome Sequence of Thermofilum pendens Reveals an Exceptional Loss of Biosynthetic Pathways without Genome Reduction J. Bacteriol., April 15, 2008; 190(8): 2957 - 2965. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. M. Caffrey, H. S. Park, J. Been, P. Gordon, C. W. Sensen, and G. Voordouw Gene Expression by the Sulfate-Reducing Bacterium Desulfovibrio vulgaris Hildenborough Grown on an Iron Electrode under Cathodic Protection Conditions Appl. Envir. Microbiol., April 15, 2008; 74(8): 2404 - 2413. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Lanzen and T. Oinn The Taverna Interaction Service: enabling manual interaction in workflows Bioinformatics, April 15, 2008; 24(8): 1118 - 1120. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. D. Bentley, C. Corton, S. E. Brown, A. Barron, L. Clark, J. Doggett, B. Harris, D. Ormond, M. A. Quail, G. May, et al. Genome of the Actinomycete Plant Pathogen Clavibacter michiganensis subsp. sepedonicus Suggests Recent Niche Adaptation J. Bacteriol., March 15, 2008; 190(6): 2150 - 2160. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Ansong, S. O. Purvine, J. N. Adkins, M. S. Lipton, and R. D. Smith Proteogenomics: needs and roles to be filled by proteomics in genome annotation Brief Funct Genomic Proteomic, March 10, 2008; (2008) eln010v1. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Yoshida, K. Nagasaki, Y. Takashima, Y. Shirai, Y. Tomaru, Y. Takao, S. Sakamoto, S. Hiroishi, and H. Ogata Ma-LMM01 Infecting Toxic Microcystis aeruginosa Illuminates Diverse Cyanophage Genome Strategies J. Bacteriol., March 1, 2008; 190(5): 1762 - 1772. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Kosaka, S. Kato, T. Shimoyama, S. Ishii, T. Abe, and K. Watanabe The genome of Pelotomaculum thermopropionicum reveals niche-associated evolution in anaerobic microbiota Genome Res., March 1, 2008; 18(3): 442 - 448. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Boyer, J. Haurat, S. Samain, B. Segurens, F. Gavory, V. Gonzalez, P. Mavingui, R. Rohr, R. Bally, and F. Wisniewski-Dye Bacteriophage Prevalence in the Genus Azospirillum and Analysis of the First Genome Sequence of an Azospirillum brasilense Integrative Phage Appl. Envir. Microbiol., February 1, 2008; 74(3): 861 - 874. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. J. Siezen, M. J. C. Starrenburg, J. Boekhorst, B. Renckens, D. Molenaar, and J. E. T. van Hylckama Vlieg Genome-Scale Genotype-Phenotype Matching of Two Lactococcus lactis Isolates from Plants Identifies Mechanisms of Adaptation to the Plant Niche Appl. Envir. Microbiol., January 15, 2008; 74(2): 424 - 436. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Kaneko, N. Nakajima, S. Okamoto, I. Suzuki, Y. Tanabe, M. Tamaoki, Y. Nakamura, F. Kasai, A. Watanabe, K. Kawashima, et al. Complete Genomic Structure of the Bloom-forming Toxic Cyanobacterium Microcystis aeruginosa NIES-843 DNA Res, January 11, 2008; (2008) dsm026v1. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Meisinger-Henschel, M. Schmidt, S. Lukassen, B. Linke, L. Krause, S. Konietzny, A. Goesmann, P. Howley, P. Chaplin, M. Suter, et al. Genomic sequence of chorioallantois vaccinia virus Ankara, the ancestor of modified vaccinia virus Ankara J. Gen. Virol., December 1, 2007; 88(12): 3249 - 3259. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Chi, L. Valenzuela, S. Beard, A. J. Mackey, J. Shabanowitz, D. F. Hunt, and C. A. Jerez Periplasmic Proteins of the Extremophile Acidithiobacillus ferrooxidans: A High Throughput Proteomics Analysis Mol. Cell. Proteomics, December 1, 2007; 6(12): 2239 - 2251. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Zuber, C. Ngom-Bru, C. Barretto, A. Bruttin, H. Brussow, and E. Denou Genome Analysis of Phage JS98 Defines a Fourth Major Subgroup of T4-Like Phages in Escherichia coli J. Bacteriol., November 15, 2007; 189(22): 8206 - 8214. [Abstract] [Full Text] [PDF] |
||||
![]() |
O. I. Rzhepishevska, J. Valdes, L. Marcinkeviciene, C. A. Gallardo, R. Meskys, V. Bonnefoy, D. S. Holmes, and M. Dopson Regulation of a Novel Acidithiobacillus caldus Gene Cluster Involved in Metabolism of Reduced Inorganic Sulfur Compounds Appl. Envir. Microbiol., November 15, 2007; 73(22): 7367 - 7372. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Golebiewski, I. Kern-Zdanowicz, M. Zienkiewicz, M. Adamczyk, J. Zylinska, A. Baraniak, M. Gniadkowski, J. Bardowski, and P. Ceglowski Complete Nucleotide Sequence of the pCTX-M3 Plasmid and Its Involvement in Spread of the Extended-Spectrum {beta}-Lactamase Gene blaCTX-M-3 Antimicrob. Agents Chemother., November 1, 2007; 51(11): 3789 - 3795. [Abstract] [Full Text] [PDF] |
||||
![]() |
W. Wang, L. Reitzer, D. A. Rasko, M. M. Pearson, R. J. Blick, C. Laurence, and E. J. Hansen Metabolic Analysis of Moraxella catarrhalis and the Effect of Selected In Vitro Growth Conditions on Global Gene Expression Infect. Immun., October 1, 2007; 75(10): 4959 - 4971. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Saeys, I. Inza, and P. Larranaga A review of feature selection techniques in bioinformatics Bioinformatics, October 1, 2007; 23(19): 2507 - 2517. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Ogata and J.-M. Claverie Unique genes in giant viruses: Regular substitution pattern and anomalously short size Genome Res., September 1, 2007; 17(9): 1353 - 1361. [Abstract] [Full Text] [PDF] |
||||
![]() |
W. Wei, J. H. McCusker, R. W. Hyman, T. Jones, Y. Ning, Z. Cao, Z. Gu, D. Bruno, M. Miranda, M. Nguyen, et al. Genome sequencing and comparative analysis of Saccharomyces cerevisiae strain YJM789 PNAS, July 31, 2007; 104(31): 12825 - 12830. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Richter, M. Kube, D. A. Bazylinski, T. Lombardot, F. O. Glockner, R. Reinhardt, and D. Schuler Comparative Genome Analysis of Four Magnetotactic Bacteria Reveals a Complex Set of Group-Specific Genes Implicated in Magnetosome Biomineralization and Function J. Bacteriol., July 1, 2007; 189(13): 4899 - 4910. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Saeys, T. Abeel, S. Degroeve, and Y. Van de Peer Translation initiation site prediction on a genomic scale: beauty in simplicity Bioinformatics, July 1, 2007; 23(13): i418 - i423. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. W. Udwary, L. Zeigler, R. N. Asolkar, V. Singan, A. Lapidus, W. Fenical, P. R. Jensen, and B. S. Moore Genome sequencing reveals complex secondary metabolome in the marine actinomycete Salinispora tropica PNAS, June 19, 2007; 104(25): 10376 - 10381. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Geslin, M. Gaillard, D. Flament, K. Rouault, M. Le Romancer, D. Prieur, and G. Erauso Analysis of the First Genome of a Hyperthermophilic Marine Virus-Like Particle, PAV1, Isolated from Pyrococcus abyssi J. Bacteriol., June 15, 2007; 189(12): 4510 - 4519. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Asgari, J. Davis, D. Wood, P. Wilson, and A. McGrath Sequence and organization of the Heliothis virescens ascovirus genome J. Gen. Virol., April 1, 2007; 88(4): 1120 - 1132. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Yukawa, C. A. Omumasaba, H. Nonaka, P. Kos, N. Okai, N. Suzuki, M. Suda, Y. Tsuge, J. Watanabe, Y. Ikeda, et al. Comparative analysis of the Corynebacterium glutamicum group and complete genome sequence of strain R Microbiology, April 1, 2007; 153(4): 1042 - 1058. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. Feng, W. Wang, J. Cheng, Y. Ren, G. Zhao, C. Gao, Y. Tang, X. Liu, W. Han, X. Peng, et al. Genome and proteome of long-chain alkane degrading Geobacillus thermodenitrificans NG80-2 isolated from a deep-subsurface oil reservoir PNAS, March 27, 2007; 104(13): 5602 - 5607. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. M. Faruque, V. C. Tam, N. Chowdhury, P. Diraphat, M. Dziejman, J. F. Heidelberg, J. D. Clemens, J. J. Mekalanos, and G. B. Nair Genomic analysis of the Mozambique strain of Vibrio cholerae O1 reveals the origin of El Tor strains carrying classical CTX prophage PNAS, March 20, 2007; 104(12): 5151 - 5156. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. L. Delcher, K. A. Bratke, E. C. Powers, and S. L. Salzberg Identifying bacterial genes and endosymbiont DNA with Glimmer Bioinformatics, March 15, 2007; 23(6): 673 - 679. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Severin, E. Nickbarg, J. Wooters, S. A. Quazi, Y. V. Matsuka, E. Murphy, I. K. Moutsatsos, R. J. Zagursky, and S. B. Olmsted Proteomic Analysis and Identification of Streptococcus pyogenes Surface-Associated Proteins J. Bacteriol., March 1, 2007; 189(5): 1514 - 1522. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. F. Challacombe, A. J. Duncan, T. S. Brettin, D. Bruce, O. Chertkov, J. C. Detter, C. S. Han, M. Misra, P. Richardson, R. Tapia, et al. Complete Genome Sequence of Haemophilus somnus (Histophilus somni) Strain 129Pt and Comparison to Haemophilus ducreyi 35000HP and Haemophilus influenzae Rd J. Bacteriol., March 1, 2007; 189(5): 1890 - 1898. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. G. Smith, T. A. Gianoulis, S. Pukatzki, J. J. Mekalanos, L. N. Ornston, M. Gerstein, and M. Snyder New insights into Acinetobacter baumannii pathogenesis revealed by high-density pyrosequencing and transposon mutagenesis Genes & Dev., March 1, 2007; 21(5): 601 - 614. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. M. Fuchs, S. Spring, H. Teeling, C. Quast, J. Wulf, M. Schattenhofer, S. Yan, S. Ferriera, J. Johnson, F. O. Glockner, et al. From the Cover: Characterization of a marine gammaproteobacterium capable of aerobic anoxygenic photosynthesis PNAS, February 20, 2007; 104(8): 2891 - 2896. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Sletvold, P. J. Johnsen, G. S. Simonsen, B. Aasnaes, A. Sundsfjord, and K. M. Nielsen Comparative DNA Analysis of Two vanA Plasmids from Enterococcus faecium Strains Isolated from Poultry and a Poultry Farmer in Norway Antimicrob. Agents Chemother., February 1, 2007; 51(2): 736 - 739. [Abstract] [Full Text] [PDF] |
||||
![]() |
N. H. Bergman, K. D. Passalacqua, P. C. Hanna, and Z. S. Qin Operon Prediction for Sequenced Bacterial Genomes without Experimental Information Appl. Envir. Microbiol., February 1, 2007; 73(3): 846 - 854. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. Krause, A. C. McHardy, T. W. Nattkemper, A. Puhler, J. Stoye, and F. Meyer GISMO--gene identification using a support vector machine for ORF classification Nucleic Acids Res., January 28, 2007; 35(2): 540 - 549. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Sugawara, T. Abe, T. Gojobori, and Y. Tateno DDBJ working on evaluation and classification of bacterial genes in INSDC Nucleic Acids Res., January 12, 2007; 35(suppl_1): D13 - D15. [Abstract] [Full Text] [PDF] |
||||
![]() |
E. E. Snyder, N. Kampanya, J. Lu, E. K. Nordberg, H. R. Karur, M. Shukla, J. Soneja, Y. Tian, T. Xue, H. Yoo, et al. PATRIC: The VBI PathoSystems Resource Integration Center Nucleic Acids Res., January 12, 2007; 35(suppl_1): D401 - D406. [Abstract] [Full Text] [PDF] |
||||
![]() |
F. M. McCarthy, S. M. Bridges, N. Wang, G. B. Magee, W. P. Williams, D. S. Luthe, and S. C. Burgess AgBase: a unified resource for functional analysis in agriculture Nucleic Acids Res., January 12, 2007; 35(suppl_1): D599 - D603. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. A. Lanie, W.-L. Ng, K. M. Kazmierczak, T. M. Andrzejewski, T. M. Davidsen, K. J. Wayne, H. Tettelin, J. I. Glass, and M. E. Winkler Genome Sequence of Avery's Virulent Serotype 2 Strain D39 of Streptococcus pneumoniae and Comparison with That of Unencapsulated Laboratory Strain R6 J. Bacteriol., January 1, 2007; 189(1): 38 - 51. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. A. Kotze, I. M. Tuffin, S. M. Deane, and D. E. Rawlings Cloning and characterization of the chromosomal arsenic resistance genes from Acidithiobacillus caldus and enhanced arsenic resistance on conjugal transfer of ars genes located on transposon TnAtcArs Microbiology, December 1, 2006; 152(12): 3551 - 3560. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Brotcke, D. S. Weiss, C. C. Kim, P. Chain, S. Malfatti, E. Garcia, and D. M. Monack Identification of MglA-Regulated Genes Reveals Novel Virulence Factors in Francisella tularensis Infect. Immun., December 1, 2006; 74(12): 6642 - 6655. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. Rohmer, M. Brittnacher, K. Svensson, D. Buckley, E. Haugen, Y. Zhou, J. Chang, R. Levy, H. Hayden, M. Forsman, et al. Potential Source of Francisella tularensis Live Vaccine Strain Attenuation Determined by Genome Comparison Infect. Immun., December 1, 2006; 74(12): 6895 - 6906. [Abstract] [Full Text] [PDF] |
||||
![]() |
F. Choulet, B. Aigle, A. Gallois, S. Mangenot, C. Gerbaud, C. Truong, F.-X. Francou, C. Fourrier, M. Guerineau, B. Decaris, et al. Evolution of the Terminal Regions of the Streptomyces Linear Chromosome Mol. Biol. Evol., December 1, 2006; 23(12): 2361 - 2369. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||





























