Simple Diversification Rate Estimation

Comparing different constant-rate models of lineage diversification

Sebastian Höhna and Tracy Heath

Last modified on December 13, 2018

Estimating Constant Speciation & Extinction Rates


This tutorial describes how to specify basic branching-process models in RevBayes; two variants of the constant-rate birth-death process (Yule 1925; Kendall 1948; Thompson 1975; Nee et al. 1994; Rannala and Yang 1996; Yang and Rannala 1997; Höhna 2015). The probabilistic graphical model is given for each component of this tutorial. After each model is specified, you will estimate speciation and extinction rates using Markov chain Monte Carlo (MCMC). Finally, you will estimate the marginal likelihood of the model and evaluate the relative support using Bayes factors.

You should read first the Introduction to Diversification Rate Estimation tutorial, which explains the theory and gives some general overview of diversification rate estimation.

Pure-Birth (Yule) Model


Before evaluating the relative support for different models, we must first specify them in Rev. In this section, we will walk through specifying a pure-birth process model and estimating the marginal likelihood. The section about the birth-death process will be less detailed because it will build up on this section.

The simplest branching model is the pure-birth process described by Yule (1925). Under this model, we assume at any instant in time, every lineage has the same speciation rate, $\lambda$. In its simplest form, the speciation rate remains constant over time. As a result, the waiting time between speciation events is exponential, where the rate of the exponential distribution is the product of the number of extant lineages ($n$) at that time and the speciation rate: $n\lambda$ (Yule 1925; Aldous 2001; Höhna 2014). The pure-birth branching model does not allow for lineage extinction (i.e., the extinction rate $\mu=0$). However, the model depends on a second parameter, $\rho$, which is the probability of sampling a species in the present time. It also depends on the time of the start of the process, whether that is the origin time or root age. Therefore, the probabilistic graphical model of the pure-birth process is quite simple, where the observed time tree topology and node ages are conditional on the speciation rate, sampling probability, and root age ().

The graphical model representation of the pure-birth (Yule) process. For more information about graphical model representations see Höhna et al. (2014).

We can add hierarchical structure to this model and account for uncertainty in the value of the speciation rate by placing a hyperprior on $\lambda$ (). The graphical models in and demonstrate the simplicity of the Yule model. Ultimately, the pure birth model is just a special case of the birth-death process, where the extinction rate (typically denoted $\mu$) is a constant node with the value 0.

The graphical model representation of the pure-birth (Yule) process, where the speciation rate is treated as a random variable drawn from a lognormal distribution.

For this exercise, we will specify a Yule model, such that the speciation rate is a stochastic node, drawn from a lognormal distribution as in . In a Bayesian framework, we are interested in estimating the posterior probability of $\lambda$ given that we observe a time tree.

In this example, we have a phylogeny of 233 primates. We are treating the time tree $\Psi$ as an observation, thus clamping the model with an observed value. The time tree we are conditioning the process on is taken from the analysis by Magnuson-Ford and Otto (2012). Furthermore, there are approximately 367 described primates species, so we will fix the parameter $\rho$ to $233/367$.

⇨ The full Yule-model specification is in the file called Yule.Rev.

Read the tree


Begin by reading in the observed tree.

T <- readTrees("data/primates_tree.nex")[1]

From this tree, we can get some helpful variables:

taxa <- T.taxa()

Additionally, we can initialize a variable for our vector of moves and monitors:

moves    = VectorMoves()
monitors = VectorMonitors()

Specifying the model


Birth rate


The model we are specifying only has three nodes (). We can specify the birth rate $\lambda$, the mean and standard deviation of the lognormal hyperprior on $\lambda$, and the conditional dependency of the two parameters all in one line of Rev code.

birth_rate_mean <- ln( ln(367/2) / T.rootAge() )
birth_rate_sd <- 0.587405
birth_rate ~ dnLognormal(mean=birth_rate_mean,sd=birth_rate_sd)

Here, the stochastic node called birth_rate represents the speciation rate $\lambda$. birth_rate_mean and birth_rate_sd are the prior mean and prior standard deviation, respectively. We chose the prior mean so that it is centered around observed number of species (i.e., the expected number of species under a Yule process will thus be equal to the observed number of species) and a prior standard deviation of 0.587405 which creates a lognormal distribution with 95% prior probability spanning exactly one order of magnitude. If you want to represent more prior uncertainty by, e.g., allowing for two orders of magnitude in the 95% prior probability then you can simply multiply birth_rate_sd by a factor of 2.

To estimate the value of $\lambda$, we assign a proposal mechanism to operate on this node. In RevBayes these MCMC sampling algorithms are called moves. We need to create a vector of moves and we can do this by using vector indexing and our pre-initialized iterator mi. We will use a scaling move on $\lambda$ called mvScale.

moves.append( mvScale(birth_rate,lambda=1,tune=true,weight=3) )

Sampling probability


Our prior belief is that we have sampled 233 out of 367 living primate species. To account for this we can set the sampling parameter as a constant node with a value of 233/367

rho <- T.ntips()/367

Root age


Any stochastic branching process must be conditioned on a time that represents the start of the process. Typically, this parameter is the origin time and it is assumed that the process started with one lineage. Thus, the origin of a birth-death process is the node that is ancestral to the root node of the tree. For macroevolutionary data, particularly without any sampled fossils, it is difficult to use the origin time. To accommodate this, we can condition on the age of the root by assuming the process started with two lineages that both originate at the time of the root.

We can get the value for the root from the Magnuson-Ford and Otto (2012) tree.

root_time <- T.rootAge()

The time tree


Now we have all of the parameters we need to specify the full pure-birth model. We can initialize the stochastic node representing the time tree. Note that we set the mu parameter to the constant value 0.0.

timetree ~ dnBDP(lambda=birth_rate, mu=0.0, rho=rho, rootAge=root_time, samplingStrategy="uniform", condition="survival", taxa=taxa)

If you refer back to Equation \eqref{eq:bayes_thereom} and , the time tree $\Psi$ is the variable we observe, i.e., the data. We can set this in Rev by using the clamp() function.

timetree.clamp(T)

Here we are fixing the value of the time tree to our observed tree from Magnuson-Ford and Otto (2012).

Finally, we can create a workspace object of our whole model using the model() function. Workspace objects are initialized using the = operator. This distinguishes the objects used by the program to run the MCMC analysis from the distinct nodes of our graphical model. The model workspace objects makes it easy to work with the model in Rev and creates a wrapper around our model DAG. Because our model is a directed, acyclic graph (DAG), we only need to give the model wrapper function a single node and it does the work to find all the other nodes through their connections.

mymodel = model(birth_rate)

The model() function traverses all of the connections and finds all of the nodes we specified.

Running an MCMC analysis


Specifying Monitors


For our MCMC analysis, we need to set up a vector of monitors to record the states of our Markov chain. The monitor functions are all called mn\*, where \* is the wildcard representing the monitor type. First, we will initialize the model monitor using the mnModel function. This creates a new monitor variable that will output the states for all model parameters when passed into a MCMC function.

monitors.append( mnModel(filename="output/primates_Yule.log",printgen=10, separator = TAB) )

Additionally, create a screen monitor that will report the states of specified variables to the screen with mnScreen:

monitors.append( mnScreen(printgen=1000, birth_rate) )

Initializing and Running the MCMC Simulation


With a fully specified model, a set of monitors, and a set of moves, we can now set up the MCMC algorithm that will sample parameter values in proportion to their posterior probability. The mcmc() function will create our MCMC object:

mymcmc = mcmc(mymodel, monitors, moves)

We may wish to run the .burnin() member function, i.e., if we wish to pre-run the chain and discard the initial states. Recall that the .burnin() function specifies a completely separate preliminary MCMC analysis that is used to tune the scale of the moves to improve mixing of the MCMC analysis.

mymcmc.burnin(generations=10000,tuningInterval=200)

Now, run the MCMC:

mymcmc.run(generations=50000)

When the analysis is complete, you will have the monitored files in your output directory.

⇨ The Rev file for performing this analysis: mcmc_Yule.Rev

Exercise 1


Estimates of the posterior and prior distribution of the birth_rate visualized in Tracer. The prior (black curve) shows the lognormal distribution that we chose as the prior distribution.

Estimating the marginal likelihood of the model


With a fully specified model, we can set up the powerPosterior() analysis to create a file of ‘powers’ and likelihoods from which we can estimate the marginal likelihood using stepping-stone or path sampling. This method computes a vector of powers from a beta distribution, then executes an MCMC run for each power step while raising the likelihood to that power. In this implementation, the vector of powers starts with 1, sampling the likelihood close to the posterior and incrementally sampling closer and closer to the prior as the power decreases. For more information on marginal likelihood estimation please read the Bayesian Model Selection Tutorial

First, we create the variable containing the power posterior. This requires us to provide a model and vector of moves, as well as an output file name. The cats argument sets the number of power steps.

pow_p = powerPosterior(mymodel, moves, monitors, "output/Yule_powp.out", cats=127, sampleFreq=10)

We can start the power posterior by first burning in the chain and and discarding the first 10000 states.

pow_p.burnin(generations=10000,tuningInterval=200)

Now execute the run with the .run() function:

pow_p.run(generations=10000)

Once the power posteriors have been saved to file, create a stepping-stone sampler. This function can read any file of power posteriors and compute the marginal likelihood using stepping-stone sampling.

ss = steppingStoneSampler(file="output/Yule_powp.out", powerColumnName="power", likelihoodColumnName="likelihood")

Compute the marginal likelihood under stepping-stone sampling using the member function marginal() of the ss variable and record the value in .

write("Stepping stone marginal likelihood:\t", ss.marginal() )

Path sampling is an alternative to stepping-stone sampling and also takes the same power posteriors as input.

ps = pathSampler(file="output/Yule_powp.out", powerColumnName="power", likelihoodColumnName="likelihood")

Compute the marginal likelihood under stepping-stone sampling using the member function marginal() of the ps variable and record the value in .

write("Path-sampling marginal likelihood:\t", ps.marginal() )

⇨ The Rev file for performing this analysis: ml_Yule.Rev.

Exercise 2


Model Path-Sampling Stepping-Stone-Sampling
Marginal likelihood Yule ($M_0$)    
Marginal likelihood birth-death ($M_1$)    
Supported model?    
Marginal likelihoods of the Yule and Birth-death models.

Birth-Death Process


The pure-birth model does not account for extinction, thus it assumes that every lineage at the start of the process will have sampled descendants at time 0. This assumption is fairly unrealistic for most phylogenetic datasets on a macroevolutionary time scale since the fossil record provides evidence of extinct lineages. Kendall (1948) described a more general branching process model to account for lineage extinction called the birth-death process. Under this model, at any instant in time, every lineage has the same rate of speciation $\lambda$ and the same rate of extinction $\mu$. This is the constant-rate birth-death process, which considers the rates constant over time and over the tree (Nee et al. 1994; Höhna 2015).

Yang and Rannala (1997) derived the probability of time trees under an extension of the birth-death model that accounts for incomplete sampling of the tips () (see also Stadler (2009) and Höhna (2014)). Under this model, the parameter $\rho$ accounts for the probability of sampling in the present time, and because it is a probability, this parameter can only take values between 0 and 1.

The graphical model representation of the birth-death process with uniform sampling and conditioned on the root age.

In principle, we can specify a model with prior distributions on speciation and extinction rates directly. One possibility is to specify an exponential, lognormal, or gamma distribution as the prior on either rate parameter. However, it is more common to specify prior distributions on a transformation of the speciation and extinction rate because, for example, we want to enforce that the speciation rate is always larger than the extinction rate.

The graphical model representation of the birth-death process with uniform sampling parameterized using the diversification and turnover.

In the following subsections we will only provide the key command that are different for the constant-rate birth-death process. All other commands will be the same as in the previous exercise. You should copy the mcmc_Yule.Rev script and modify it accordingly. Don’t forget to rename the filenames of the monitors to avoid overwriting of your previous results!

Diversification and turnover


We have some good prior information about the magnitude of the diversification. The diversification rate represent the rate at which the species diversity increases. Thus, we just use the same prior for the diversification rate as we used before for the birth rate.

diversification_mean <- ln( ln(367.0/2.0) / T.rootAge() )
diversification_sd <- 0.587405
diversification ~ dnLognormal(mean=diversification_mean,sd=diversification_sd)
moves.append( mvScale(diversification,lambda=1.0,tune=true,weight=3.0) )

Unfortunately, we have less prior information about the turnover rate. The turnover rate is the rate at which one species is replaced by another species due to a birth plus death event. Hence, the turnover rate represent the longevity of a species. For simplicity we use the same prior on the turnover rate but with two orders of magnitude prior uncertainty.

turnover_mean <- ln( ln(367.0/2.0) / T.rootAge() )
turnover_sd <- 0.587405*2
turnover ~ dnLognormal(mean=turnover_mean,sd=turnover_sd)
moves.append( mvScale(turnover,lambda=1.0,tune=true,weight=3.0) )

Birth rate and death rate


The birth and death rates are both deterministic nodes. We compute them by simple parameter transformation. Note that the death rate is in fact equal to the turnover rate.

birth_rate := diversification + turnover
death_rate := turnover

All other parameters, such as the sampling probability and the root age are kept the same as in the analysis above.

The time tree


Initialize the stochastic node representing the time tree. The main difference now is that we provide a stochastic parameter for the extinction rate $\mu$.

timetree ~ dnBDP(lambda=birth_rate, mu=death_rate, rho=rho, rootAge=root_time, samplingStrategy="uniform", condition="survival", taxa=taxa)

⇨ The Rev file for performing this analysis: mcmc_BD.Rev

Exercise 3


  1. Aldous D.J. 2001. Stochastic models and descriptive statistics for phylogenetic trees, from Yule to today. Statistical Science.:23–34.
  2. Höhna S. 2014. Likelihood Inference of Non-Constant Diversification Rates with Incomplete Taxon Sampling. PLoS One. 9:e84184. 10.1371/journal.pone.0084184
  3. Höhna S. 2015. The time-dependent reconstructed evolutionary process with a key-role for mass-extinction events. Journal of Theoretical Biology. 380:321–331. http://dx.doi.org/10.1016/j.jtbi.2015.06.005
  4. Höhna S., Heath T.A., Boussau B., Landis M.J., Ronquist F., Huelsenbeck J.P. 2014. Probabilistic Graphical Model Representation in Phylogenetics. Systematic Biology. 63:753–771. 10.1093/sysbio/syu039
  5. Kendall D.G. 1948. On the Generalized "Birth-and-Death" Process. The Annals of Mathematical Statistics. 19:1–15. 10.1214/aoms/1177730285
  6. Magnuson-Ford K., Otto S.P. 2012. Linking the Investigations of Character Evolution and Species Diversification. The American Naturalist. 180:225–245. 10.1086/666649
  7. Nee S., May R.M., Harvey P.H. 1994. The Reconstructed Evolutionary Process. Philosophical Transactions: Biological Sciences. 344:305–311. 10.1098/rstb.1994.0068
  8. Rannala B., Yang Z. 1996. Probability distribution of molecular evolutionary trees: A new method of phylogenetic inference. Journal of Molecular Evolution. 43:304–311. 10.1007/BF02338839
  9. Stadler T. 2009. On incomplete sampling under birth-death models and connections to the sampling-based coalescent. Journal of Theoretical Biology. 261:58–66. 10.1016/j.jtbi.2009.07.018
  10. Thompson E.A. 1975. Human evolutionary trees. Cambridge University Press Cambridge.
  11. Yang Z., Rannala B. 1997. Bayesian phylogenetic inference using DNA sequences: a Markov Chain Monte Carlo Method. Molecular Biology and Evolution. 14:717–724. 10.1093/oxfordjournals.molbev.a025811
  12. Yule G.U. 1925. A mathematical theory of evolution, based on the conclusions of Dr. JC Willis, FRS. Philosophical Transactions of the Royal Society of London. Series B, Containing Papers of a Biological Character. 213:21–87.