Talk:Ancestral reconstruction

From PLoSWiki

Response to Review

General Responses: We are extremely grateful for the input provided by the reviewers and editors of our topic page and appreciate the time it takes to do reviews. Both reviewers provided extremely useful comments that led to additions and improvements; the result is, we think, a much improved page.

We have rewritten and revised our page substantially in line with the comments and suggestions provided.

Below is an itemized response to the comments provided by the topic page reviewers. Our responses are in bold text set off by *** and revised portions of the topic page text are in blue text, enclosed in "".


Reviewer 1: Ross Mounce

This is the version of the page I reviewed:

This page describes ancestral reconstruction - quite a broad topic. I congratulate the ambition of the authors on attempting to tackle such a wide-ranging subject. What is there, is good. This will form part of a valuable and significant addition to the English language Wikipedia. However, I am concerned with one or two otherwise missing things from this article and most of the following review concerns these.

My suggestions for changes/additions:


Whilst I see this as major in importance - it does not require much writing to fix; worry not.

I would suggest that the authors broaden their definition of ancestral reconstruction, slightly. I see ancestral (state) reconstruction first and foremost as a method. Taking this view, it’s very important to point out that this method can be usefully applied outside of the biological domain, even if most of its usage is inside the biological domain. In linguistics they definitely use ancestral reconstruction methods to infer otherwise unknown ‘proto-words’, grammar and language structure. Likewise I see no reason why ancestral reconstruction might not be usefully applied to astrocladistics or to find the most-likely ancestral storyline features of Little Red Riding Hood (

This would just require the Introduction and History section to be changed. Perhaps one linguistic example would be good for the applications section too.

***A great suggestion. Accordingly, we have amended the topic page introduction and history section to reflect this, and have also included an example from linguistics along with new references to non-biological aspects and uses of ancestral reconstruction.

“Non-biological applications include the reconstruction of the vocabulary or phonemes of ancient languages[1], and cultural characteristics of ancient societies such as oral traditions[2] or marriage practices[3].”

“Linguistic Evolution

Reconstructions of the words and phonemes of ancient proto-languages such as Proto-Indo-European have been performed based on the observed analogues in present-day languages. Typically these analyses are carried out manually using the "comparative method"[67]. First, words from different languages with a common etymology (cognates) are identified in the contemporary languages under study, analogous to the identification of orthologous biological sequences. Second, correspondences between individual sounds in the cognates are identified, a step similar to biological sequence alignment although performed manually. Finally, likely ancestral sounds are hypothesised by manual inspection and various heuristics (such as the fact that most languages have both nasal and non-nasal vowels)[67].”


I would like to see the Applications -> Character Evolution section re-organized a little bit. Perhaps also an explanation of what characters can and can’t accurately be reconstructed?

To demonstrate the variety of ‘characters’ (sensu lato) that can be reconstructed I think it would be good to have subheadings within this category to reflect the true variety of ancestral reconstruction applications that are out there:

  • Linguistic Evolution

an example of the reconstruction of grammar or vocabulary e.g. Dunn et al (2005) Structural Phylogenetics and the Reconstruction of Ancient Language History. Science.

  • Behaviour and Life History Evolution

<example> Diet reconstruction in Galapagos finches

  • Morphological Character Evolution

<example> Mammalian body mass (?)

  • ‘Molecular’ Evolution

<multiple examples> just re-write the examples you already have, emphasising the variety of applications: proteins, viruses, DNA sequences…

All in all I think this would be a much clearer, more logical structure for the article. As it stands ‘Diet reconstruction in Galapagos finches’, ‘Mammalian body mass’ and the rag-bag of examples in ‘Correlated character evolution’ strike me as examples that are ‘textbook’ but they do not fully-demonstrate the breadth of applications (particularly those outside of biology!).

***We agree and have followed the suggestions of the reviewer to restructure this section to better reflect the bigger picture categorization of uses and examples of ancestral state reconstruction and as previously mentioned (above) also added an additional example from linguistics. Please see: for the revised and restructured section.


What about Minimum Evolution (Rzhetsky and Nei, 1992)? I see this as distinct from MP, ML, and Bayesian methods. Phyrex, according to its website, purports to use ‘a Minimum Evolution algorithm’ to reconstruct ancestral sequences, although I see no direct citation to Rzhetsky and Nei (1992) in the paper describing Phyrex (Rossnes et al 2005, BMC Bioinformatics).

***Minimum evolution as defined by the authors of Phyrex ( is an implementation of the algorithm “Fitch Parsimony”, which is already covered in detail in the Maximum Parsimony section of the topic page.
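For readers unfamiliar with Fitch parsimony, the following is a minimal sketch of its first (tips-to-root) pass for a single character on a rooted binary tree. It is only an illustration and is not taken from Phyrex or from the topic page; the tree, the tip states, and the names `fitch_down`, `tip_states` and `sets` are all hypothetical. At each internal node the candidate state set is the intersection of the children's sets if that intersection is non-empty, and otherwise their union, in which case one state change is counted.

```python
# Illustrative sketch of the Fitch parsimony down pass (hypothetical example data).
# A tree is either a tip name (str) or a pair (left_subtree, right_subtree).

def fitch_down(tree, tip_states, sets, path="root"):
    """Post-order pass: store a candidate state set per node, return (set, cost)."""
    if isinstance(tree, str):
        state_set = {tip_states[tree]}
        sets[path] = state_set
        return state_set, 0
    left, left_cost = fitch_down(tree[0], tip_states, sets, path + ".L")
    right, right_cost = fitch_down(tree[1], tip_states, sets, path + ".R")
    common = left & right
    if common:
        state_set, extra = common, 0        # children agree: no change needed
    else:
        state_set, extra = left | right, 1  # children disagree: count one change
    sets[path] = state_set
    return state_set, left_cost + right_cost + extra


# Hypothetical five-tip tree and nucleotide states at a single site.
tree = ((("a", "b"), "c"), ("d", "e"))
tip_states = {"a": "A", "b": "G", "c": "A", "d": "G", "e": "G"}

sets = {}
root_set, cost = fitch_down(tree, tip_states, sets)
print("minimum number of changes:", cost)
print("candidate states at the root:", sorted(root_set))
```

A full reconstruction would add a second (root-to-tips) pass that picks one state per node from these candidate sets; that pass is omitted here for brevity.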


Mentioning Phyrex brings me to my next point. IMO the list of software is rather incomplete. I would like to see these added:

VIP from a methods point of view makes a really interesting contribution to the field of spatial reconstruction - unlike other programs (e.g. DIVA or DEC/Lagrange) it does not need arbitrary predefined areas e.g. N. America / S. America.

***We thank the reviewer for helping us make the list more complete, and we have updated the software list following all of the reviewers' suggestions. Please see: for the very much updated table of software.

I would also like to see a column about license information for each of the programs. This is standard practice for Wikipedia, I believe. It is also of practical research use - I want to know which of the programs are openly licensed so I can adapt and extend the code. Some of these programs are not provided under an OKD-compliant ‘open’ licence (, so one does not have permission to extend upon the original e.g. Phyrex & PAML. Rmounce (talk) 10:13, 26 May 2014 (PDT)

***This is a superb idea and each program now has an entry for licence type. Please see: to view the additional column. We have also added an additional section entitled “Package descriptions” to provide some needed synthesis concerning the software packages performing ancestral reconstruction and their implementations of various reconstruction methods as follows:

“Package descriptions

Molecular evolution

The majority of these software packages are designed for analyzing genetic sequence data. For example, PAML [68] is a collection of programs for the phylogenetic analysis of DNA and protein sequence alignments by maximum likelihood. Ancestral reconstruction can be performed using the codeml program. In addition, LAZARUS is a collection of Python scripts that wrap the ancestral reconstruction functions of PAML for batch processing and greater ease-of-use. HyPhy, Mesquite, and MEGA are also software packages for the phylogenetic analysis of sequence data, but are designed to be more modular and customizable. HyPhy [69] implements a joint maximum likelihood method of ancestral sequence reconstruction [4] that can be readily adapted to reconstructing a more generalized range of discrete ancestral character states such as geographic locations by specifying a customized model in its batch language. Mesquite [70] provides ancestral state reconstruction methods for both discrete and continuous characters using both maximum parsimony and maximum likelihood methods. It also provides several visualization tools for interpreting the results of ancestral reconstruction. MEGA [71] is also a modular system but places greater emphasis on ease-of-use than customization of analyses. As of version 5, MEGA allows the user to reconstruct ancestral states using maximum parsimony, maximum likelihood, and empirical Bayes methods [71].

The Bayesian analysis of genetic sequences may confer greater robustness to model misspecification. MrBayes [72] allows inference of ancestral states at ancestral nodes using the full hierarchical Bayesian approach. The PREQUEL program distributed in the PHAST package [73] performs comparative evolutionary genomics using ancestral sequence reconstruction. SIMMAP [74] stochastically maps mutations on phylogenies. BayesTraits [26] performs analyses of discrete or continuous characters in a Bayesian framework to evaluate models of evolution, reconstruct ancestral states, and detect correlated evolution between pairs of traits.

Other character types

Other software packages are more oriented towards the analysis of qualitative and quantitative traits (phenotypes). For example, the ape package [75] in the statistical computing environment R also provides methods for ancestral state reconstruction for both discrete and continuous characters through the ace function, including maximum likelihood. (Note that ace performs reconstruction by computing scaled conditional likelihoods instead of the marginal or joint likelihoods used by other maximum likelihood-based methods for ancestral reconstruction, which may adversely affect the accuracy of reconstruction at nodes other than the root [1].) Phyrex implements a maximum parsimony-based algorithm to reconstruct ancestral gene expression profiles, in addition to a maximum likelihood method for reconstructing ancestral genetic sequences (by wrapping around the baseml function in PAML) [76].

Several software packages also reconstruct phylogeography. BEAST (Bayesian Evolutionary Analysis by Sampling Trees [77]) provides tools for reconstructing ancestral geographic locations from observed sequences annotated with location data using Bayesian MCMC sampling methods. Diversitree [78] is an R package providing methods for ancestral state reconstruction under Mk2 (a continuous time Markov model of binary character evolution [79]) and BiSSE (Binary State Speciation and Extinction) models. Lagrange performs analyses on reconstruction of geographic range evolution on phylogenetic trees [64]. Phylomapper [65] is a statistical framework for estimating historical patterns of gene flow and ancestral geographic locations. RASP [80] infers ancestral states using statistical dispersal-vicariance analysis, Lagrange, Bayes-Lagrange, BayArea and BBM methods. VIP [81] infers historical biogeography by examining disjunct geographic distributions.

Genome rearrangements provide valuable information in comparative genomics between species. ANGES [82] compares extant related genomes through ancestral reconstruction of genetic markers. BADGER [83] uses a Bayesian approach to examining the history of gene rearrangement. Count [84] reconstructs the evolution of the size of gene families. EREM [85] analyses the gain and loss of genetic features encoded by binary characters. PARANA [86] performs parsimony-based inference of ancestral biological networks that represent gene loss and duplication.

Web applications

Finally, there are several web-server based applications that allow investigators to use maximum likelihood methods for ancestral reconstruction of different character types without having to install any software. For example, Ancestors [87] is a web-server for ancestral genome reconstruction by the identification and arrangement of syntenic regions. FastML [88] is a web-server for probabilistic reconstruction of ancestral sequences by maximum likelihood that uses a gap character model for reconstructing indel variation. MLGO [89] is a web-server for maximum likelihood gene order analysis.”

Reviewer 2: Graham Slater

I reviewed this version:

Joy and colleagues provide a well-written and highly informative description of the motivation behind and methods for reconstructing ancestral character states. The page has a particular focus on discrete states, which is not surprising given their interests, but also touches on continuous and geographic characters. I think this page will be a useful resource and commend them on their efforts. I have a few minor comments but note one particular area that is not discussed.


The authors discuss in the opening paragraph how our ability to accurately reconstruct ancestral states decreases with increasing evolutionary distance between node and tip. For a while, folks working in comparative methods have recognized that incorporating non-contemporaneous information in the form of fossil data or time series data (for example, past population samples in the case of viral data) can improve ancestral reconstruction, particularly where deviations from time homogeneous processes occur. Oakley and Cunningham (2000) first showed that including a fossil taxon in a tree positively affected ancestral state inference -- in their case, accuracy of ancestral estimates remained poor but a trend from large to small plaque diameters that was not detected from extant taxa became apparent. Finarelli and Flynn (2006) and Albert et al (2009) have since shown that ancestral size estimates for mammalian carnivores and fishes are too large when fossil data are not integrated. Slater et al (2012) showed that using fossil-derived informative priors on node states improves ancestral state inference and model selection relative to analyses of extant taxa only.

***We thank the reviewer for pointing out this oversight. To address it, we have included a section discussing calibration and its effects on the inference of ancestral states, reproduced below:

“Calibration

Ancestral reconstruction can be informed by the observed states in historical samples of known age, such as fossils or archival specimens. Since the accuracy of ancestral reconstruction generally decays with increasing time, the use of such specimens provides data that are closer to the ancestors being reconstructed and will most likely improve the analysis, especially when rates of character change vary through time. This concept has been validated by an experimental evolutionary study in which replicate populations of bacteriophage T7 were propagated to generate an artificial phylogeny[35]. In revisiting these experimental data, Oakley and Cunningham [36] found that maximum parsimony methods were unable to accurately reconstruct the known ancestral state of a continuous character (plaque size); these results were verified by computer simulation. This failure of ancestral reconstruction was attributed to a directional bias in the evolution of plaque size (from large to small plaque diameters) that required the inclusion of "fossilized" samples to address. Studies of both mammalian carnivores[37] and fishes[38] have demonstrated that without incorporating fossil data, the reconstructed estimates of ancestral body sizes are unrealistically large. Moreover, Slater et al. [39] showed using caniform carnivorans that incorporating fossil data into prior distributions improved both the Bayesian inference of ancestral states and evolutionary model selection, relative to analyses using only contemporaneous data.”

Minor comments

Methods and Algorithms

Parsimony is an important exception to this paradigm, since the underlying model has no free parameters

Really Parsimony itself isn’t a model, is it? It can be formalized as a model – for example, the No Common Mechanism model of Tuffley and Steel but this has m(2n-3) parameters, where m is the number of characters and n is the number of taxa (so that 2n-3 is the number of branches in an unrooted binary tree). Parsimony is really the exception because it works outside the model-based framework, maybe?

*** We thank the reviewer for pointing out this distinction; we have edited the noted passage to clarify this point:

“Parsimony is an important exception to this paradigm: though it has been shown that there are mathematical models for which it is the maximum likelihood estimator [1], at its core it is simply based on the heuristic that changes in character state are rare, without attempting to quantify that rarity.”


although rates may vary over time, the model assumes that it is uniform over the duration of a given branch

I was slightly confused by this line – do the authors mean that rates can vary among time intervals laid down on the tree? Here and further down, the authors discuss how rates cannot vary over branches so the first statement is slightly confusing.

In fact, it should theoretically be possible to allow transition rates to vary over time, and by definition branches, by jointly estimating a branch length scalar (either a purely statistical parameter such as Pagel’s lambda, kappa or delta, or a more model-based parameter, such as a time-dependent exponential rate scalar). It is true to say that such a transformation assumes time-varying rates overall but rates are by definition varying along branches. If the authors are drawing a distinction between time varying rates and rates that ONLY vary along branches (or alternatively how we might attempt to estimate branch specific changes in rate) then I agree this is not done currently and would require integrating sampling from the fossil record (for traits), or population level sampling (for population level genetic studies, either from long term sampling in the case of disease studies or ancient DNA in the case of population genetic studies).

*** Excellent points. The phrasing has been amended to be more precise about the distinction, as well as to acknowledge the possibility of rates that do vary over time:

“In brief, the evolution of a genetic sequence is modelled by a time-reversible continuous time Markov process. In the simplest of these, all characters undergo independent state transitions (such as nucleotide substitutions) at a constant rate over time. This basic model is frequently extended to allow different rates on each branch of the tree. In reality, mutation rates may also vary over time (due, for example, to environmental changes); this can be modelled by allowing the rate parameters to evolve along the tree, at the expense of having an increased number of parameters.”
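As an illustration of the simplest case described in the revised text, the sketch below computes transition probabilities for a continuous-time Markov process on nucleotides and then applies a different rate on each branch. The Jukes-Cantor (JC69) model is used here purely as an assumed example because it has a closed form; the topic page text does not single out this model, and the function name, branch names, rates and durations are hypothetical.

```python
import math

NUCLEOTIDES = "ACGT"


def jc69_transition_matrix(rate, t):
    """Transition probabilities under JC69 (an assumed example model).

    rate: expected substitutions per site per unit time
    t:    duration of the branch
    Returns P with P[i][j] = Pr(state j at time t | state i at time 0).
    """
    decay = math.exp(-4.0 * rate * t / 3.0)
    same = 0.25 + 0.75 * decay   # probability of ending in the starting state
    diff = 0.25 - 0.25 * decay   # probability of ending in each other state
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


# Hypothetical branch-specific (rate, duration) pairs: the basic model extended
# so that each branch of the tree has its own constant rate.
branches = {"slow_branch": (0.1, 2.0), "fast_branch": (0.5, 2.0)}
p_by_branch = {
    name: jc69_transition_matrix(rate, t) for name, (rate, t) in branches.items()
}

for name, p in p_by_branch.items():
    print(name, "Pr(A stays A) =", round(p[0][0], 4))
```

Because the JC69 matrix is symmetric with a uniform stationary distribution, the process is time-reversible, which matches the assumption stated in the passage. Letting the rate itself change along the tree would require additional parameters, as the text notes.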

Continuous Traits

nodes alpha and beta

It may be preferable to use alternative symbols for node names as alpha is a parameter of the OU model and Beta is often used in place of sigma sq to denote rate in BM. Similarly, please use alpha, rather than theta for the OU model's rubber band parameter, as theta is typically used to denote the optimal trait value.

*** Good point – the states are now referred to as U and V, while the rubber band parameter is now denoted by alpha. The changes are viewable at:


Two notes (different from Ross's) with regards the software listed.

First, although there is definitely a clear theme in this piece that focuses on discrete states, continuous data are discussed but there is no column in the table for continuous trait evolution. This would be worth adding for completeness. Ancestral states can be estimated for continuous traits in BEAST2, APE and geiger (assuming Brownian motion), Diversitree (assuming BM or OU) and phytools (assuming BM, OU or a trended random walk). APE, geiger and phytools can all in theory also accommodate time-varying rates by applying an appropriate branch length transformation.

***Following the reviewer's suggestion, we have added an additional column to show discrete vs. continuous character states. Please see: to view the additional column.

Second, the discrete trait reconstructions implemented in ape are actually conditional scaled likelihoods for the descendant subtree, not marginal or joint estimates of the ancestral states. Therefore only the reconstructions at the root node are appropriate estimates of ancestral states. One could re-root the tree at each node to obtain the joint or marginal estimates over the tree, assuming that the evolutionary model used is symmetric. True marginal or joint estimates can be obtained from phytools, diversitree and, I think, geiger. There has been a discussion of this topic on the R-sig-phylo list (

***To address this we have added a caveat note within the new section “Package descriptions” as follows:

“For example, the ape package [75] in the statistical computing environment R also provides methods for ancestral state reconstruction for both discrete and continuous characters through the ace function, including maximum likelihood. (Note that ace performs reconstruction by computing scaled conditional likelihoods instead of the marginal or joint likelihoods used by other maximum likelihood-based methods for ancestral reconstruction, which may adversely affect the accuracy of reconstruction at nodes other than the root [1].)”
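The distinction drawn in this caveat can be seen on a toy example. The sketch below is not ape's code; it is a hand-rolled illustration under assumed settings: a symmetric two-state Markov model, a hypothetical three-tip tree ((A, B)N, C) with all branch lengths equal, and made-up tip states. The scaled conditional likelihood at the internal node N uses only the tips below N, whereas the marginal reconstruction at N also uses the information from tip C through the root.

```python
import math


def p_matrix(q, t):
    """Transition probabilities for a symmetric two-state model with rate q."""
    decay = math.exp(-2.0 * q * t)
    same, diff = 0.5 + 0.5 * decay, 0.5 - 0.5 * decay
    return [[same, diff], [diff, same]]


def normalise(values):
    total = sum(values)
    return [v / total for v in values]


# Assumed rate and branch length (identical on every branch for simplicity).
P = p_matrix(q=0.5, t=1.0)

# Hypothetical tip states on the tree ((A, B)N, C).
tip_a, tip_b, tip_c = 0, 1, 1
states = (0, 1)

# Conditional likelihood at N: uses only the descendant subtree (tips A and B).
cond_n = [P[s][tip_a] * P[s][tip_b] for s in states]
scaled_conditional_n = normalise(cond_n)

# Information reaching N from the rest of the tree: sum over the root state r,
# with a uniform prior at the root, of Pr(r -> s along N's branch) * Pr(r -> C).
root_prior = [0.5, 0.5]
from_above = [
    sum(root_prior[r] * P[r][s] * P[r][tip_c] for r in states) for s in states
]

# Marginal reconstruction at N combines both sources of information.
marginal_n = normalise([cond_n[s] * from_above[s] for s in states])

print("scaled conditional at N:", scaled_conditional_n)
print("marginal at N:          ", marginal_n)
```

With these assumed inputs the two tips below N disagree, so the scaled conditional likelihoods at N are uninformative, while the marginal estimate leans towards the state observed at tip C. At the root the two quantities coincide, which is why only the root-node values are appropriate estimates in the conditional approach.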


Oakley TH and Cunningham CW (2000) Evolution 54: 397-405.

Finarelli JA and Flynn JJ (2006) Syst Biol 55: 301-313.

Albert JS et al. (2009) Acta Zool 90: 357-384.

Slater GJ, Harmon LJ and Alfaro ME (2012) Evolution 66: 3931-3944.