Published online 19 March 2004

Nucleic Acids Research, 2004, Vol. 32, No. 5 1792-1797
Oxford University Press

MUSCLE: multiple sequence alignment with high accuracy and high throughput

Robert C. Edgar*

195 Roque Moraes Drive, Mill Valley, CA 94941, USA

*Email: bob@drive5.com

Received January 19, 2004; Revised January 30, 2004; Accepted February 24, 2004


    ABSTRACT
We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.


    INTRODUCTION
Multiple alignments of protein sequences are important in many applications, including phylogenetic tree estimation, structure prediction and critical residue identification. The most natural formulation of the computational problem is to define a model of sequence evolution that assigns probabilities to elementary sequence edits and seeks a most probable directed graph in which edges represent edits and terminal nodes are the observed sequences. No tractable method for finding such a graph is known. A heuristic alternative is to seek a multiple alignment that optimizes the sum of pairs (SP) score, i.e. the sum of pairwise alignment scores. Optimizing the SP score is NP complete (1) and can be achieved by dynamic programming with time and space complexity $O(L^N)$ in the sequence length $L$ and number of sequences $N$ (2). A more popular strategy is the progressive method (3,4), which first estimates a tree and then constructs a pairwise alignment of the subtrees found at each internal node. A subtree is represented by its profile, a multiple alignment treated as a sequence by regarding each column as an alignable symbol. A variant on this strategy is used by T-Coffee (5), which aligns profiles by optimizing a score derived from local and global alignments of all pairs of input sequences. Misalignments by progressive methods are sometimes readily apparent (Fig. 1), motivating further processing (refinement). For a recent review of multiple alignment methods, see Notredame (6). Here we describe MUSCLE (multiple sequence comparison by log-expectation), a new computer program for multiple protein sequence alignment.



Figure 1. Motifs misaligned by a progressive method. A set of 41 sequences containing SH2 domains (44) were aligned by the progressive method T-Coffee (above), and by MUSCLE (below). The N-terminal region of a subset of five sequences is shown. The highlighted columns (upper case) are conserved within this family but are misaligned by T-Coffee. It should be noted that T-Coffee aligns these motifs correctly when given these five sequences alone; the problem arises in the context of the other sequences. Complete alignments are available at http://www.drive5.com/muscle.

 

    MUSCLE algorithm
Here we give an overview of the algorithm; a more detailed discussion is given in Edgar (submitted). Following guide tree construction, the fundamental step is pairwise profile alignment, which is used first for progressive alignment and then for refinement. This is similar to the strategies used by PRRP (7) and MAFFT (8).

Distance measures and guide tree estimation
MUSCLE uses two distance measures for a pair of sequences: a kmer distance (for an unaligned pair) and the Kimura distance (for an aligned pair). A kmer is a contiguous subsequence of length k, also known as a word or k-tuple. Related sequences tend to have more kmers in common than expected by chance. The kmer distance is derived from the fraction of kmers in common in a compressed alphabet, which we have previously shown to correlate well with fractional identity (9). This measure does not require an alignment, giving a significant speed advantage. Given an aligned pair of sequences, we compute the pairwise identity and convert to an additive distance estimate, applying the Kimura correction for multiple substitutions at a single site (10). Distance matrices are clustered using UPGMA (11), which we find to give slightly improved results over neighbor-joining (12), despite the expectation that neighbor-joining will give a more reliable estimate of the evolutionary tree. This can be explained by assuming that in progressive alignment, the best accuracy is obtained at each node by aligning the two profiles that have fewest differences, even if they are not evolutionary neighbors.
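As a rough illustration of the kmer distance described above, the following Python sketch counts kmers in a compressed alphabet and converts the shared fraction into a distance. The residue grouping in `GROUPS`, the choice `k=3`, and the normalization by the shorter sequence are assumptions made for this example; the text does not specify them, and they should not be read as the values used by MUSCLE.

```python
from collections import Counter

# Hypothetical compressed alphabet: residues in the same group share one symbol.
# This grouping is an illustrative assumption, not the one used by MUSCLE.
GROUPS = ["AGPST", "C", "DENQ", "FWY", "HKR", "ILMV"]
COMPRESS = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}


def compress(seq):
    """Map each residue to its group symbol; unknown residues map to 'X'."""
    return "".join(COMPRESS.get(aa, "X") for aa in seq.upper())


def kmer_counts(seq, k):
    """Count every contiguous subsequence of length k."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(seq_a, seq_b, k=3):
    """Return 1 - F, where F is the fraction of kmers the two sequences share."""
    a, b = compress(seq_a), compress(seq_b)
    shorter = min(len(a), len(b))
    if shorter < k:
        raise ValueError("sequences must be at least k residues long")
    counts_a, counts_b = kmer_counts(a, k), kmer_counts(b, k)
    # Counter & Counter keeps the minimum count of each kmer present in both.
    shared = sum((counts_a & counts_b).values())
    return 1.0 - shared / (shorter - k + 1)
```

Because no alignment is computed, the cost per pair is linear in the sequence lengths. In this sketch, substitutions within a group (for example I to V) do not change the distance, which is the intended effect of compressing the alphabet.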

Profile alignment
In order to apply pairwise alignment to profiles, a scoring function must be defined on an aligned pair of profile positions, i.e. a pair of multiple alignment columns [see, for example, Edgar and Sjolander (13)]. Let $i$ and $j$ be amino acid types, $p_i$ the background probability of $i$, $p_{ij}$ the joint probability of $i$ and $j$ being aligned to each other, $f^x_i$ the observed frequency of $i$ in column $x$ of the first profile, and $f^x_G$ the observed frequency of gaps in that column (similarly for column $y$ in the second profile). The estimated probability $\alpha^x_i$ of observing amino acid $i$ in position $x$ can be derived from $f^x$, typically by adding heuristic pseudo-counts or by using Bayesian methods such as Dirichlet mixture priors (14). MUSCLE uses a new profile function we call the log-expectation (LE) score:

\[ LE^{xy} = (1 - f^x_G)(1 - f^y_G) \log \sum_i \sum_j f^x_i f^y_j \frac{p_{ij}}{p_i p_j} \tag{1} \]

This is a modified version of the log-average function (15):

\[ LA^{xy} = \log \sum_i \sum_j \alpha^x_i \alpha^y_j \frac{p_{ij}}{p_i p_j} \tag{2} \]

MUSCLE uses probabilities $p_i$ and $p_{ij}$ derived from the 240 PAM VTML matrix (16). Frequencies $f_i$ are normalized to sum to 1 when indels are present (otherwise the logarithm becomes increasingly negative with increasing numbers of gaps even when aligning conserved or similar residues). The factor $(1 - f_G)$ is the occupancy of a column, introduced to encourage more highly occupied columns to align. Position-specific gap penalties are used, employing heuristics similar to those found in MAFFT and LAGAN (17).
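To make the log-expectation score concrete, here is a minimal Python sketch that evaluates equation 1 for one pair of columns. The two-letter alphabet and the values in `TOY_BG` and `TOY_JOINT` are invented for illustration only; they stand in for the background and joint probabilities that MUSCLE derives from the VTML matrix. Residue frequencies are assumed to be already normalized to sum to 1 over the non-gap residues.

```python
import math

# Toy background and joint probabilities for a two-letter alphabet.
# These numbers are assumptions for illustration, not VTML-derived values.
TOY_BG = {"A": 0.5, "B": 0.5}
TOY_JOINT = {("A", "A"): 0.4, ("B", "B"): 0.4, ("A", "B"): 0.1, ("B", "A"): 0.1}


def le_score(fx, fy, fx_gap, fy_gap, p_joint=TOY_JOINT, p_bg=TOY_BG):
    """Log-expectation score for a pair of profile columns x and y.

    fx, fy: dicts mapping residue -> observed frequency (normalized to sum to 1).
    fx_gap, fy_gap: observed gap frequency in each column.
    """
    expectation = 0.0
    for i, fi in fx.items():
        for j, fj in fy.items():
            expectation += fi * fj * p_joint[(i, j)] / (p_bg[i] * p_bg[j])
    # The occupancy factor down-weights columns that are mostly gaps.
    occupancy = (1.0 - fx_gap) * (1.0 - fy_gap)
    return occupancy * math.log(expectation)
```

With these toy values, two fully occupied columns containing only A score log(0.4/0.25) = log 1.6, which is positive, while a column of A against a column of B scores log 0.4, which is negative. Setting `fx_gap=0.5` halves the magnitude of either score.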

Algorithm
The high-level flow is depicted in Figure 2.



Figure 2. This diagram summarizes the flow of the MUSCLE algorithm. There are three main stages: Stage 1 (draft progressive), Stage 2 (improved progressive) and Stage 3 (refinement). A multiple alignment is available at the completion of each stage, at which point the algorithm may terminate.

 

Stage 1, Draft progressive. The goal of the first stage is to produce a multiple alignment, emphasizing speed over accuracy.

1.1 The kmer distance is computed for each pair of input sequences, giving distance matrix D1.

1.2 Matrix D1 is clustered by UPGMA, producing binary tree TREE1.

1.3 A progressive alignment is constructed by following the branching order of TREE1. At each leaf, a profile is constructed from an input sequence. Nodes in the tree are visited in postfix order (children before their parent). At each internal node, a pairwise alignment is constructed of the two child profiles, giving a new profile which is assigned to that node. This produces a multiple alignment of all input sequences, MSA1, at the root.
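Steps 1.2 and 1.3 can be sketched in Python as follows. This is a plain average-linkage UPGMA on a small distance matrix, followed by a traversal that visits children before their parent, which is the order in which the profile alignments would be performed. The tree representation (nested tuples of leaf indices) and the function names are choices made for this example, and the profile alignment itself is left out.

```python
def upgma(dist):
    """Cluster a symmetric distance matrix (list of lists) by UPGMA.

    Returns a binary tree as nested tuples; a leaf is a sequence index.
    """
    n = len(dist)
    clusters = {i: (i, 1) for i in range(n)}  # id -> (subtree, leaf count)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of clusters, with a < b
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Distance to the merged cluster is the size-weighted average.
        for c in clusters:
            dac = d.pop((min(a, c), max(a, c)))
            dbc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        del d[(a, b)]
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


def internal_nodes_children_first(tree):
    """Yield internal nodes so that both children precede their parent."""
    if isinstance(tree, tuple):
        left, right = tree
        yield from internal_nodes_children_first(left)
        yield from internal_nodes_children_first(right)
        yield tree


D1 = [[0, 2, 6, 6],
      [2, 0, 6, 6],
      [6, 6, 0, 4],
      [6, 6, 4, 0]]
TREE1 = upgma(D1)
MERGE_ORDER = list(internal_nodes_children_first(TREE1))
```

For the matrix `D1`, sequences 0 and 1 are joined first, then 2 and 3, and the root joins the two pairs. Each entry of `MERGE_ORDER` corresponds to one pairwise profile alignment in the progressive stage.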

Stage 2, Improved progressive. The main source of error in the draft progressive stage is the approximate kmer distance measure, which results in a suboptimal tree. MUSCLE therefore re-estimates the tree using the Kimura distance, which is more accurate but requires an alignment.

2.1 The Kimura distance for each pair of input sequences is computed from MSA1, giving distance matrix D2.

2.2 Matrix D2 is clustered by UPGMA, producing binary tree TREE2.

2.3 A progressive alignment is produced following TREE2 (similar to 1.3), producing multiple alignment MSA2. This is optimized by computing alignments only for subtrees whose branching orders changed relative to TREE1.
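Step 2.1 can be illustrated with a short Python sketch that computes pairwise identity from the rows of an existing alignment and applies a Kimura-style correction. The specific formula d = -ln(1 - D - D^2/5), where D is the fraction of differing sites, is the commonly quoted form of Kimura's protein distance and is assumed here; the text only says that the Kimura correction is applied. Counting identity over columns where neither row has a gap is likewise an assumption of this example.

```python
import math


def fractional_identity(row_a, row_b, gap="-"):
    """Fraction of identical residues over columns where neither row has a gap."""
    pairs = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("rows share no aligned residues")
    return sum(a == b for a, b in pairs) / len(pairs)


def kimura_distance(identity):
    """Assumed Kimura correction: d = -ln(1 - D - D^2/5) with D = 1 - identity."""
    d_obs = 1.0 - identity
    arg = 1.0 - d_obs - d_obs * d_obs / 5.0
    if arg <= 0.0:
        # The formula breaks down for very divergent pairs (D above about 0.85).
        raise ValueError("correction undefined at this level of divergence")
    return -math.log(arg)


def kimura_matrix(msa):
    """Pairwise distance matrix for an alignment given as equal-length strings."""
    n = len(msa)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            identity = fractional_identity(msa[i], msa[j])
            dist[i][j] = dist[j][i] = kimura_distance(identity)
    return dist
```

The corrected distance always exceeds the observed fraction of differences, reflecting multiple substitutions at a single site. A practical implementation needs some fallback for highly divergent pairs, where this sketch simply raises an error.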

Stage 3, Refinement.

3.1 An edge is chosen from TREE2 (edges are visited in order of decreasing distance from the root).

3.2 TREE2 is divided into two subtrees by deleting the edge. The profile of the multiple alignment in each subtree is computed.

3.3 A new multiple alignment is produced by re-aligning the two profiles.

3.4 If the SP score is improved, the new alignment is kept, otherwise it is discarded.

Steps 3.1–3.4 are repeated until convergence or until a user-defined limit is reached. This is a variant of tree-dependent restricted partitioning (18).
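The control flow of steps 3.1 to 3.4 can be sketched in Python as below. The tree is a nested tuple of leaf identifiers, edge depth (number of edges from the root) stands in for "distance from the root", and the SP score uses a toy match/mismatch/gap scheme; all three are simplifying assumptions. The profile re-alignment of step 3.3 is not implemented: `realign` is a hypothetical caller-supplied function that receives the current alignment and the two leaf sets and returns a candidate alignment.

```python
from itertools import combinations


def leaves(tree):
    """Leaves under a node; an internal node is a (left, right) tuple."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def edge_bipartitions(tree):
    """Yield (inside, outside) leaf sets for every edge, deepest edges first."""
    all_leaves = frozenset(leaves(tree))
    found = []

    def walk(node, depth):
        if isinstance(node, tuple):
            for child in node:
                found.append((depth + 1, frozenset(leaves(child))))
                walk(child, depth + 1)

    walk(tree, 0)
    found.sort(key=lambda item: -item[0])  # decreasing depth below the root
    for _, inside in found:
        yield inside, all_leaves - inside


def sp_score(rows, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score over all row pairs, using a toy scoring scheme."""
    total = 0
    for row_a, row_b in combinations(rows, 2):
        for a, b in zip(row_a, row_b):
            if a == "-" and b == "-":
                continue
            if a == "-" or b == "-":
                total += gap
            else:
                total += match if a == b else mismatch
    return total


def refine(msa, tree, realign, max_passes=10):
    """Accept a re-aligned candidate only if it improves the SP score.

    msa: dict mapping leaf -> aligned row.
    realign(msa, inside, outside): hypothetical function returning a new dict.
    """
    best = sp_score(list(msa.values()))
    for _ in range(max_passes):
        improved = False
        for inside, outside in edge_bipartitions(tree):
            candidate = realign(msa, inside, outside)
            score = sp_score(list(candidate.values()))
            if score > best:
                msa, best, improved = candidate, score, True
        if not improved:
            break  # a full pass without improvement: treat as converged
    return msa, best
```

Because a candidate is kept only when the score strictly increases, the SP score never decreases during refinement, and `max_passes` plays the role of the user-defined limit. Note that the two edges at the root define the same bipartition, so this sketch visits that split twice.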

Complete multiple alignments are available at steps 1.3, 2.3 and 3.4, at which points the algorithm may be terminated. We refer to the first two stages alone as MUSCLE-p, which produces MSA2. MUSCLE-p has time complexity $O(N^2 L + N L^2)$ and space complexity $O(N^2 + NL + L^2)$. Refinement adds an $O(N^3 L)$ term to the time complexity.


    Assessment
We assessed the performance of MUSCLE on four sets of reference alignments: BAliBASE (19,20), SABmark (21), SMART (22–24) and a new benchmark, PREFAB. We compared MUSCLE with four other methods: CLUSTALW (25), probably the most widely used program at the time of writing; T-Coffee, which has the best BAliBASE score reported to date; and two MAFFT scripts: FFTNS1, the fastest previously published method known to the author (in which diagonal finding by fast Fourier transform is enabled and a progressive alignment constructed), and NWNSI, the slowest but most accurate of the MAFFT methods (in which fast Fourier transform is disabled and refinement is enabled). Tested versions were MUSCLE 3.2, CLUSTALW 1.82, T-Coffee 1.37 and MAFFT 3.82. We also evaluated MUSCLE-p, in which the refinement stage is omitted. We also tried Align-m 1.0 (21), but found in many cases that the program either aborted or was impractically slow on the larger alignments found in SMART and PREFAB.

BAliBASE. We used version 2 of the BAliBASE benchmark, reference sets Ref 1–Ref 5. Other reference sets contain repeats, inversions and transmembrane helices, for which none of the tested algorithms is designed.

SABmark. We used version 1.63 of the SABmark reference alignments, which consists of two subsets: Superfamily and Twilight. All sequences have known structure. The Twilight set contains 1994 domains from the Astral database (26) with pairwise sequence similarity e-values <=1, divided into 236 folds according to the SCOP classification (27). The Superfamily set contains sequences of pairwise identity <=50%, divided into 462 SCOP superfamilies. Each pair of structures was aligned with two structural aligners: SOFI (28) and CE (29), producing a sequence alignment from the consensus in which only high-confidence regions are retained. Input sets range from three to 25 sequences, with an average of eight and an average sequence length of 179.

SMART. SMART contains multiple alignments refined by experts, focusing primarily on signaling domains. While structures were considered where known, sequence methods were also used to aid construction of the database, so SMART is not suitable as a definitive benchmark. However, conventional wisdom [e.g. Fischer et al. (30)] holds that machine-assisted experts can produce superior alignments to automated methods, so performance on this set is of interest for comparison. We used a version of SMART downloaded in July 2000, before the first version of MUSCLE was made available, eliminating the possibility that MUSCLE was used to aid construction. We discarded alignments of more than 100 sequences in order to make the test tractable for T-Coffee, leaving 267 alignments averaging 31 sequences of length 175.

PREFAB. The methods used to create databases such as BAliBASE and SMART are time-consuming and demand significant expertise, making a fully automated protocol desirable. Perhaps the most obvious approach is to generate sequence alignments from automated alignments of multiple structures, but this is fraught with difficulties; see for example Eidhammer et al. (31). With this in mind, we constructed a new test set, PREFAB (protein reference alignment benchmark) which exploits methodology (21,32,33), test data (13,34,35) and statistical methods (19) that have previously been applied to alignment accuracy assessment. The protocol is as follows. Two proteins are aligned by a structural method that does not incorporate sequence similarity. Each sequence is used to query a database, from which high-scoring hits are collected. The queries and their hits are combined and aligned by a multiple sequence method. Accuracy is assessed on the original pair alone, by comparison with their structural alignment. Three test sets selected from the FSSP database (36) were used as described in Sadreyev and Grishin (34) (data kindly provided by Ruslan Sadreyev), and Edgar and Sjolander (13,35), which we call SG, PP1 and PP2, respectively. These three sets vary mainly in their selection criteria. PP1 and PP2 contain pairs with sequence identity <=30%. PP1 was designed to select pairs that have high structural similarity, requiring a z-score of >=15 and a root mean square deviation (r.m.s.d.) of <=2.5 Å. PP2 selected more diverged pairs with a z-score of >=8 and <=12, and an r.m.s.d. of <=3.5 Å. SG contains pairs sampled from three ranges of sequence identity: 0–15, 15–30 and 30–97%, with no z-score or r.m.s.d. limits. We re-aligned each pair of structures using the CE aligner (29), and retained only those pairs for which FSSP and CE agreed on 50 or more positions. This was designed to minimize questionable and ambiguous structural alignments as done in SABmark and MaxBench (33). 
We used the full-chain sequence of each structure to make a PSI-BLAST (37,38) search of the NCBI non-redundant protein sequence database (39), keeping locally aligned regions of hits with e-values below 0.01. Hits were filtered to 80% maximum identity (including the query), and 24 selected at random. Finally, each pair of structures and their remaining hits were combined to make sets of <=50 sequences. The limit of 50 was arbitrarily chosen to make the test tractable on a desktop computer for some of the more resource-intensive methods, in particular T-Coffee (which needed 10 CPU days, as noted in Table 4). The final set, PREFAB version 3.0, has 1932 alignments averaging 49 sequences of length 240, of which 178 positions in the structure pair are found in the consensus of FSSP and CE.


Table 4. Q scores and times on PREFAB
 
Accuracy measurement
We used three accuracy measures: Q, TC and APDB. Q (quality) is the number of correctly aligned residue pairs divided by the number of residue pairs in the reference alignment. This has previously been termed the developer score (32) and SPS (40). TC (total column score) is the number of correctly aligned columns divided by the number of columns in the reference alignment; this is Thompson et al.’s CS and is equivalent to Q in the case of two sequences (as in PREFAB). APDB (41) is derived from structures alone; no reference alignment of the sequences or structures is needed. For BAliBASE, we use Q and TC, measured only on core blocks as annotated in the database. For PREFAB, we use Q, including only those positions on which CE and FSSP agree, and also APDB. For SMART, we use Q and TC computed for all columns. For SABmark, we average the Q score over each pair of sequences. TC score is not applicable to SABmark as the reference alignments are pairwise.
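The Q and TC definitions above can be sketched in Python as follows. Alignments are given as lists of equal-length gapped strings with rows in the same order in the test and reference alignments. One assumption is made for TC: only reference columns containing at least two residues are counted, which is what makes TC coincide with Q for a pair of sequences. Restriction to core blocks or to consensus positions, as used for BAliBASE and PREFAB, is not modeled.

```python
from itertools import combinations


def columns(msa, gap="-"):
    """Turn rows into columns of residue indices (None where a row has a gap)."""
    counters = [0] * len(msa)
    cols = []
    for chars in zip(*msa):
        col = []
        for row, ch in enumerate(chars):
            if ch == gap:
                col.append(None)
            else:
                col.append(counters[row])
                counters[row] += 1
        cols.append(tuple(col))
    return cols


def residue_pairs(cols):
    """Set of aligned residue pairs ((row_a, pos_a), (row_b, pos_b))."""
    pairs = set()
    for col in cols:
        present = [(row, pos) for row, pos in enumerate(col) if pos is not None]
        pairs.update(combinations(present, 2))
    return pairs


def q_score(test, reference):
    """Correctly aligned residue pairs divided by pairs in the reference."""
    ref_pairs = residue_pairs(columns(reference))
    test_pairs = residue_pairs(columns(test))
    return len(ref_pairs & test_pairs) / len(ref_pairs)


def tc_score(test, reference):
    """Correctly aligned columns divided by (counted) reference columns."""
    test_cols = set(columns(test))
    ref_cols = [c for c in columns(reference)
                if sum(pos is not None for pos in c) >= 2]
    return sum(c in test_cols for c in ref_cols) / len(ref_cols)
```

For example, with reference rows `ACD`, `ACD`, `A-D` and test rows `ACD`, `ACD`, `AD-`, five of the seven reference pairs are reproduced, but only one of the three reference columns is reproduced exactly. TC is therefore the stricter measure when there are more than two sequences.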

Statistical analysis
Following Thompson et al. (19), statistical significance is measured by a Friedman rank test (42), which is more conservative than the Wilcoxon test that has also been used for alignment accuracy discrimination (5,7,8) as fewer assumptions are made about the population distribution. In particular, the Wilcoxon test assumes a symmetrical difference between two methods, but in practice we sometimes observe a significant skew. PREFAB and SABmark use automated structure alignment methods, which sometimes produce questionable results. Many low-quality regions are eliminated by taking the consensus between two independent aligners, but some may remain. In PREFAB, assessment of a multiple alignment is made on a single pair of sequences, which may be more or less accurately aligned than the average over all pairs. In SABmark, the upper bound on Q is less than 1 to a varying degree because the pairwise reference alignments may not be mutually consistent. These effects can be viewed as introducing noise into the experiment, and a single accuracy measurement may be subject to error. However, as the structural aligners do not use primary sequence, these errors are unbiased with respect to sequence methods. A difference in accuracy between two sequence alignment methods can therefore be established by the Friedman test, and the measured difference in average accuracy will be approximately correct when measured over a sufficient number of samples.
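For readers unfamiliar with the Friedman rank test, the Python sketch below computes its statistic from a table of per-alignment accuracy scores. It uses the textbook form of the statistic with average ranks for ties and no tie correction, which is an assumption of this example and may differ in detail from the procedure used for the reported P-values. The P-value helper covers only the two-method case, where the statistic is compared with a chi-square distribution with one degree of freedom.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for i in order[start:end + 1]:
            ranks[i] = mean_rank
        start = end + 1
    return ranks


def friedman_statistic(scores):
    """Friedman chi-square; scores[t][m] is the score of method m on test case t."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        # Methods are ranked within each test case, never across cases.
        for m, rank in enumerate(average_ranks(row)):
            rank_sums[m] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


def p_value_two_methods(statistic):
    """Chi-square upper tail for 1 degree of freedom (valid for k = 2 only)."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Made-up scores: method 0 beats method 1 on each of four alignments.
EXAMPLE = [[0.9, 0.8], [0.7, 0.6], [0.95, 0.5], [0.4, 0.3]]
CHI2 = friedman_statistic(EXAMPLE)
```

Only the within-case ranks enter the statistic, so the size of each difference is ignored. This is why no assumption about the shape of the score distribution is needed, and also why many test cases are required before a small average difference becomes significant.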


    RESULTS
Quality scores and CPU times are summarized in Tables 1–7; rankings and statistical significance on PREFAB and BAliBASE for all pairs of methods are given in Table 8. On all test sets and quality measures, MUSCLE achieves the highest ranking (in some cases jointly with other methods due to lack of statistical significance), and MUSCLE-p is statistically indistinguishable from T-Coffee and NWNSI. MUSCLE achieves the highest BAliBASE score reported to date, but the improvement of 1.6% in Q and 2.2% in TC over T-Coffee has low significance (P = 0.15). A similar result is found on SABmark, where MUSCLE achieves a 1.5% improvement over T-Coffee in Q with P = 0.14. The Q score on PREFAB is best able to distinguish between methods, giving statistically significant rankings to MUSCLE > MUSCLE-p, MUSCLE > T-Coffee, MUSCLE > NWNSI and MUSCLE-p > NWNSI. SMART also ranks MUSCLE highest. SMART cannot be considered definitive due to the use of sequence methods in construction of the database, although any bias from this source is likely to favor methods that were available to the SMART developers (i.e. to be against MUSCLE). The SMART results could be interpreted as suggesting that MUSCLE alignments are more consistent with refinements made by human experts. The APDB score appears to be relatively insensitive, showing no significant improvement due to the refinement stage of MUSCLE (similarly for MAFFT; not shown), and is not able to distinguish between the four highest scoring methods. We speculate that the scatter observed in the correlation between APDB and more conventional measures such as TC (40) injects sufficient noise to obscure meaningful differences in accuracy that can be resolved using Q. The low rank of Align-m on SABmark differs from results quoted by Van Walle et al. (21), who assessed pairwise alignments produced by an intermediate step in the algorithm, whereas we used the final multiple alignment.


Table 1. BAliBASE scores and times
 

Table 2. BAliBASE Q scores on subsets
 

Table 3. BAliBASE TC scores on subsets
 

Table 5. APDB scores on PREFAB
 

Table 6. Q scores and CPU times on SABmark
 

Table 7. Q and TC scores on SMART
 

Table 8. Ranks and statistical significance on BAliBASE and PREFAB
 
Resource requirements for large numbers of sequences
To investigate resource requirements for increasing number of sequences $N$, we used the Rose sequence generator (43) (complete results not shown). In agreement with other studies [e.g. Katoh et al. (8)], we found that T-Coffee was unable to align more than approximately $10^2$ sequences of typical length on a current desktop computer. CLUSTALW was able to align a few hundred sequences, with a practical limit around $N = 10^3$ where CPU time begins to scale approximately as $N^4$. The largest set had 5000 sequences of average length 350. MUSCLE-p completed this test in 7 min, compared with 10 min for FFTNS1; we estimate that CLUSTALW would need approximately 1 year.


    DISCUSSION
We have described a new multiple sequence alignment algorithm, MUSCLE, and presented evidence that it creates alignments with average accuracy comparable with or superior to the best current methods. It should be emphasized that performance differences between the better methods emerge only when averaged over a large number of test cases, even when alignments are considered trustworthy. For example, on BAliBASE, the lowest scoring of the tested methods (FFTNS1) achieved a higher Q than the highest scoring (MUSCLE) in 21 out of 141 alignments and tied in 19 more; compared with T-Coffee, MUSCLE scored higher or tied in 95 cases, but lower in 24. This suggests the use of multiple algorithms and careful inspection of the results. MUSCLE is comparable in speed with CLUSTALW, completing a test set (PREFAB) averaging 49 sequences of length 240 in about half the time. The progressive method MUSCLE-p, which has average accuracy statistically indistinguishable from T-Coffee and the most accurate MAFFT script, is the fastest algorithm known to the author for large numbers of sequences, able to align 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE software, source code and test data are freely available at: http://www.drive5.com/muscle.


    REFERENCES

  1. Wang,L. and Jiang,T. (1994) On the complexity of multiple sequence alignment. J. Comput. Biol., 1, 337–348.

  2. Waterman,M.S., Smith,T.F. and Beyer,W.A. (1976) Some biological sequence metrics. Adv. Math., 20, 367–387.

  3. Hogeweg,P. and Hesper,B. (1984) The alignment of sets of sequences and the construction of phyletic trees: an integrated method. J. Mol. Evol., 20, 175–186.

  4. Feng,D.F. and Doolittle,R.F. (1987) Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol., 25, 351–360.

  5. Notredame,C., Higgins,D.G. and Heringa,J. (2000) T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol., 302, 205–217.

  6. Notredame,C. (2002) Recent progress in multiple sequence alignment: a survey. Pharmacogenomics, 3, 131–144.

  7. Gotoh,O. (1996) Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments. J. Mol. Biol., 264, 823–838.

  8. Katoh,K., Misawa,K., Kuma,K. and Miyata,T. (2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res., 30, 3059–3066.

  9. Edgar,R.C. (2004) Local homology recognition and distance measures in linear time using compressed amino acid alphabets. Nucleic Acids Res., 32, 380–385.

  10. Kimura,M. (1983) The Neutral Theory of Molecular Evolution. Cambridge University Press.

  11. Sneath,P.H.A. and Sokal,R.R. (1973) Numerical Taxonomy. Freeman, San Francisco.

  12. Saitou,N. and Nei,M. (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol., 4, 406–425.

  13. Edgar,R.C. and Sjolander,K. (2004) A comparison of scoring functions for protein sequence profile alignment. Bioinformatics, DOI: 10.1093/bioinformatics/bth090.

  14. Sjolander,K., Karplus,K., Brown,M., Hughey,R., Krogh,A., Mian,I.S. and Haussler,D. (1996) Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology. CABIOS, 12, 327–345.

  15. von Ohsen,N. and Zimmer,R. (2001) Improving profile–profile alignment via log average scoring. In Gascuel,O. and Moret,B.M.E. (eds), Algorithms in Bioinformatics, First International Workshop, WABI 2001. Springer-Verlag, Berlin, Germany, pp. 11–26.

  16. Muller,T., Spang,R. and Vingron,M. (2002) Estimating amino acid substitution models: a comparison of Dayhoff’s estimator, the resolvent approach and a maximum likelihood method. Mol. Biol. Evol., 19, 8–13.

  17. Brudno,M., Do,C.B., Cooper,G.M., Kim,M.F., Davydov,E., Green,E.D., Sidow,A. and Batzoglou,S. (2003) LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA. Genome Res., 13, 721–731.

  18. Hirosawa,M., Totoki,Y., Hoshida,M. and Ishikawa,M. (1995) Comprehensive study on iterative algorithms of multiple sequence alignment. CABIOS, 11, 13–18.

  19. Thompson,J.D., Plewniak,F. and Poch,O. (1999a) BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics, 15, 87–88.

  20. Bahr,A., Thompson,J.D., Thierry,J.C. and Poch,O. (2001) BAliBASE (Benchmark Alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations. Nucleic Acids Res., 29, 323–326.

  21. Van Walle,I., Lasters,I. and Wyns,L. (2004) Align-m—a new algorithm for multiple alignment of highly divergent sequences. Bioinformatics, DOI: 10.1093/bioinformatics/bth116.

  22. Schultz,J., Milpetz,F., Bork,P. and Ponting,C.P. (1998) SMART, a simple modular architecture research tool: identification of signaling domains. Proc. Natl Acad. Sci. USA, 95, 5857–5864.

  23. Schultz,J., Copley,R.R., Doerks,T., Ponting,C.P. and Bork,P. (2000) SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res., 28, 231–234.

  24. Ponting,C.P., Schultz,J., Milpetz,F. and Bork,P. (1999) SMART: identification and annotation of domains from signalling and extracellular protein sequences. Nucleic Acids Res., 27, 229–232.

  25. Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22, 4673–4680.

  26. Brenner,S.E., Koehl,P. and Levitt,M. (2000) The ASTRAL compendium for protein structure and sequence analysis. Nucleic Acids Res., 28, 254–256.

  27. Murzin,A.G., Brenner,S.E., Hubbard,T. and Chothia,C. (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol., 247, 536–540.

  28. Boutonnet,N.S., Rooman,M.J., Ochagavia,M.E., Richelle,J. and Wodak,S.J. (1995) Optimal protein structure alignments by multiple linkage clustering: application to distantly related proteins. Protein Eng., 8, 647–662.

  29. Shindyalov,I.N. and Bourne,P.E. (1998) Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng., 11, 739–747.

  30. Fischer,D., Barret,C., Bryson,K., Elofsson,A., Godzik,A., Jones,D., Karplus,K.J., Kelley,L.A., MacCallum,R.M., Pawlowski,K., Rost,B., Rychlewski,L. and Sternberg,M. (1999) CAFASP-1: critical assessment of fully automated structure prediction methods. Proteins, Suppl. 3, 209–217.

  31. Eidhammer,I., Jonassen,I. and Taylor,W.R. (2000) Structure comparison and structure patterns. J. Comput. Biol., 7, 685–716.

  32. Sauder,J.M., Arthur,J.W. and Dunbrack,R.L.,Jr (2000) Large-scale comparison of protein sequence alignment algorithms with structure alignments. Proteins, 40, 6–22.

  33. Leplae,R. and Hubbard,T.J. (2002) MaxBench: evaluation of sequence and structure comparison methods. Bioinformatics, 18, 494–495.

  34. Sadreyev,R. and Grishin,N. (2003) COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance. J. Mol. Biol., 326, 317–336.

  35. Edgar,R.C. and Sjolander,K. (2004) COACH: profile-profile alignment of protein families using hidden Markov models. Bioinformatics, DOI: 10.1093/bioinformatics/bth091.

  36. Holm,L. and Sander,C. (1998) Touring protein fold space with Dali/FSSP. Nucleic Acids Res., 26, 316–319.

  37. Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.

  38. Schaffer,A.A., Aravind,L., Madden,T.L., Shavirin,S., Spouge,J.L., Wolf,Y.I., Koonin,E.V. and Altschul,S.F. (2001) Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res., 29, 2994–3005.

  39. Pruitt,K.D., Tatusova,T. and Maglott,D.R. (2003) NCBI Reference Sequence project: update and current status. Nucleic Acids Res., 31, 34–37.

  40. Thompson,J.D., Plewniak,F. and Poch,O. (1999b) A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res., 27, 2682–2690.

  41. O’Sullivan,O., Zehnder,M., Higgins,D., Bucher,P., Grosdidier,A. and Notredame,C. (2003) APDB: a novel measure for benchmarking sequence alignment methods without reference alignments. Bioinformatics, 19 Suppl. 1, I215–I221.

  42. Friedman,M. (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Am. Stat. Assoc., 32, 675–701.

  43. Stoye,J., Evers,D. and Meyer,F. (1998) Rose: generating sequence families. Bioinformatics, 14, 157–163.

  44. Sjolander,K. (1998) Phylogenetic inference in protein superfamilies: analysis of SH2 domains. Proc. Int. Conf. Intell. Syst. Mol. Biol., 6, 165–174.




