Stepwise Bayesian inference of phylogeny

A full analysis pipeline for phylogenetic inference using RevBayes

Sebastian Höhna and Allison Hsiang

Last modified on January 18, 2021

Overview

This tutorial guides you through our stepwise Bayesian inference pipeline using RevBayes. The exercises are based on a dataset of North American fireflies (genus Photinus) for which we have molecular sequence data for extant species. The material in this tutorial is taken directly from three other tutorials that explore some of these topics in more detail.

In exercise 1, we’ll use the molecular sequence data to infer the relationships among living species. In exercise 2, we’ll transform the branch lengths from units of substitutions into units of time, assuming a relaxed clock model.

The data

Create a directory on your computer for this tutorial. In this directory, create a subdirectory called data, and download the data files that you can find on the left of this page.

In the data folder, you should now have seven files. Each file is a FASTA-formatted alignment of a different protein-coding gene.
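
As a quick sanity check after downloading, a short script along the following lines can confirm that the data folder holds the expected seven alignments. This is a minimal sketch and not part of the tutorial's own scripts: the function names are hypothetical, and the file extensions it looks for (.fas, .fasta, .fa) are an assumption, so adjust them to match the files you actually downloaded.

```python
from pathlib import Path

# Assumed extensions; change these to match the downloaded file names.
FASTA_EXTENSIONS = {".fas", ".fasta", ".fa"}


def count_sequences(fasta_path):
    """Count records in a FASTA file (each record starts with a '>' header line)."""
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def summarize_alignments(data_dir="data"):
    """Return {file name: number of sequences} for each FASTA file in data_dir."""
    data_path = Path(data_dir)
    if not data_path.is_dir():
        # No data folder yet, so there is nothing to summarize.
        return {}
    return {
        path.name: count_sequences(path)
        for path in sorted(data_path.iterdir())
        if path.suffix.lower() in FASTA_EXTENSIONS
    }
```

Calling `summarize_alignments("data")` from the tutorial directory returns one entry per alignment file, so the length of the result should be seven if the download is complete.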

Scripts

For more complex models and analyses, it’s useful to create separate Rev scripts that contain all the model parameters, moves, and functions for different model components (e.g. the substitution model and the clock model).

Create another subdirectory called scripts.
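
If you prefer to set up the folders programmatically, the layout described so far (a tutorial directory containing the data and scripts subdirectories) could be created as sketched below. The function name `create_tutorial_layout` is hypothetical and the root directory name is up to you; creating the folders by hand in a file browser or terminal works just as well.

```python
from pathlib import Path


def create_tutorial_layout(root):
    """Create root/data and root/scripts and return the two paths."""
    root_path = Path(root)
    subdirs = [root_path / name for name in ("data", "scripts")]
    for subdir in subdirs:
        # parents=True also creates the root directory if it is missing;
        # exist_ok=True makes it safe to run the function more than once.
        subdir.mkdir(parents=True, exist_ok=True)
    return subdirs
```

For example, `create_tutorial_layout("my_tutorial")` (a placeholder name) creates `my_tutorial/data` and `my_tutorial/scripts` relative to the current working directory.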

In this tutorial, you will work primarily in your text editor and create a set of modular files that can be easily managed and interchanged. Examples of all the commands used to perform each analysis are also provided at the top of this page under Scripts. However, try to write the complete scripts yourself from the beginning, so that you understand all the steps involved and the differences between setting up each analysis.

Exercises

Click on the first exercise to begin!

  1. Estimating unrooted gene tree(s)
  2. Rooting and time calibrating the gene tree(s)