Implementing a function
There are two main classes of functions in RevBayes: regular functions and member functions. For our example, we will be implement a regular function. First we need to add two files to the RevBayes source code, Func_hyperbolicCosineFunction.cpp
and Func_hyperbolicCosineFunction.h
. These will go within revbayes/src/revlanguage/functions/math
since we are adding a function and it is a mathematical function. Second, we need to register the function by modifying revbayes/src/revlanguage/workspace/RbRegister_Func.cpp
.
The file Func_hyperbolicCosine.h
should look like the following:
#ifndef Func_hyperbolicCosine_h
#define Func_hyperbolicCosine_h
#include "Real.h"
#include "RlTypedFunction.h"
#include <string>
namespace RevLanguage {
/**
* The RevLanguage wrapper of the hyperbolic Cosine function (sinh()).
*
* The RevLanguage wrapper of the hyperbolic function function connects
* the variables/parameters of the function and creates the internal HyperbolicCosineFunction object.
* Please read the HyperbolicCosineFunction.h for more info.
*
*
* @copyright Copyright 2024-
* @author The RevBayes Development Core Team (<your-name>)
* @since 2024-04-26, version 1.0
*
*/
class Func_hyperbolicCosine : public TypedFunction<Real> {
public:
// Basic utility functions
Func_hyperbolicCosine* clone(void) const; //!< Clone the object
static const std::string& getClassType(void); //!< Get Rev type
static const TypeSpec& getClassTypeSpec(void); //!< Get class type spec
std::string getFunctionName(void) const; //!< Get the primary name of the function in Rev
const TypeSpec& getTypeSpec(void) const; //!< Get the type spec of the instance
// Function functions you have to override
RevBayesCore::TypedFunction<double>* createFunction(void) const; //!< Create internal function object
const ArgumentRules& getArgumentRules(void) const; //!< Get argument rules
};
}
#endif /* Func_hyperbolicCosine_h */
And the Func_hyperbolicCosine.cpp
should look like this:
#include "Func_hyperbolicCosine.h"
#include "Probability.h"
#include "Real.h"
#include "RlDeterministicNode.h"
#include "TypedDagNode.h"
#include "GenericFunction.h"
using namespace RevLanguage;
double* hyperbolicCosine(double x)
{
double result = (exp(x) + exp(-x))/2;
return new double(result);
}
/**
* The clone function is a convenience function to create proper copies of inherited objected.
* E.g. a.clone() will create a clone of the correct type even if 'a' is of derived type 'b'.
*
* \return A new copy of the function.
*/
Func_hyperbolicCosine* Func_hyperbolicCosine::clone( void ) const
{
return new Func_hyperbolicCosine( *this );
}
RevBayesCore::TypedFunction<double>* Func_hyperbolicCosine::createFunction( void ) const
{
RevBayesCore::TypedDagNode<double>* x = static_cast<const Real&>( this->args[0].getVariable()->getRevObject() ).getDagNode();
return RevBayesCore::generic_function_ptr< double >( hyperbolicCosine, x );
}
/* Get argument rules */
const ArgumentRules& Func_hyperbolicCosine::getArgumentRules( void ) const
{
static ArgumentRules argumentRules = ArgumentRules();
static bool rules_set = false;
if ( !rules_set )
{
argumentRules.push_back( new ArgumentRule( "x", Real::getClassTypeSpec(), "The value.", ArgumentRule::BY_CONSTANT_REFERENCE, ArgumentRule::ANY ) );
rules_set = true;
}
return argumentRules;
}
const std::string& Func_hyperbolicCosine::getClassType(void)
{
static std::string rev_type = "Func_hyperbolicCosine";
return rev_type;
}
/* Get class type spec describing type of object */
const TypeSpec& Func_hyperbolicCosine::getClassTypeSpec(void)
{
static TypeSpec rev_type_spec = TypeSpec( getClassType(), new TypeSpec( Function::getClassTypeSpec() ) );
return rev_type_spec;
}
/**
* Get the primary Rev name for this function.
*/
std::string Func_hyperbolicCosine::getFunctionName( void ) const
{
// create a name variable that is the same for all instance of this class
std::string f_name = "cosh";
return f_name;
}
const TypeSpec& Func_hyperbolicCosine::getTypeSpec( void ) const
{
static TypeSpec type_spec = getClassTypeSpec();
return type_spec;
}
The function RevBayesCore::generic_function_ptr< ResultType >(CppFunction, arg1, ...)
in Func_hyperbolicCosine::createFunction( )
automatically constructs a RevBayesCore::Function
from the C++ function so that we do not have to.
If CppFunction
is has multiple type signatures, as is the case with functions like sqrt( )
, generic_function_ptr< >
will not work.
In such cases, we can solve this problem by writing a wrapper function with a single type signature.
For example,
// Specialize sqrt to only work on doubles
double mysqrt(double x)
{
return sqrt(x);
}
In order to make the new function available for use within the Rev language, we first need to register it. To do this,
src/revlanguage/workspace/
directory.RbRegister_Func.cpp
file in your editor.#include
commands for math functions.#include "Func_hyperbolicCosine.h"
in the correct alphabetical order for that group.addFunction( new Func_hyperbolicCosine() );
The above method should be prefered for implementing new functions in RevBayes unless the function needs to save intermediate results and use them during recalculation.
For example, if our function computes x*(y+z)
, then if only x
has changed, we could save the old value of y+z
and re-use it instead of computing y+z
from scratch. (In this case, y+z
is not expensive enough to be worth saving, but is just a simple illustration of an intermediate result.)
In such cases we will also need to write a RevBayesCore::Function
.
A RevBayesCore::Function
serves a different role than a RevLanguage::Function
.
A RevLanguage::Function
interprets the function arguments and connects them to the RevBayesCore::Function
, which performs the actual calculation and holds the result.
If we write our own RevBayesCore::Function
, then we can use the update()
method to intelligently recalculate the value,
and we can add data memembers to save intermediate results;
Here will illustrate how to write a simple RevBayesCore::Function
using the hyperbolic cosine example.
Our example does’t actually save any intermediate results, it just illustrates how a RevBayesCore::Function
works.
It is therefore equivalent to the RevBayesCore::Function
that is automatically generated by generic_function_ptr
.
First, we will write our new header file. Within our header file, we need to #include
a few other RevBayes header files, including TypedDagNode.h
since our typed function deals with nodes of DAGs. Note that the directory structure of core
is similar to that of the revlanguage
. The file src/core/functions/math/HyperbolicCosineFunction.h
will be:
#ifndef HyperbolicCosineFunction_h
#define HyperbolicCosineFunction_h
#include "TypedFunction.h"
#include "TypedDagNode.h"
#include <cmath>
namespace RevBayesCore {
/**
* \brief Hyperbolic Cosine of a real number.
*
* Compute the hyperbolic cosine of a real number x. (cosh(x) = (exp(x) + exp(-x))/2).
*
* \copyright (c) Copyright 2009-2018 (GPL version 3)
* \author <your-name>
* \since Version 1.0, 2015-01-31
*
*/
class HyperbolicCosineFunction : public TypedFunction<double> {
public:
HyperbolicCosineFunction(const TypedDagNode<double> *a);
HyperbolicCosineFunction* clone(void) const; //!< creates a clone
void update(void); //!< recomputes the value
protected:
void swapParameterInternal(const DagNode *oldP, const DagNode *newP); //!< Implementation of swapping parameters
private:
const TypedDagNode<double>* x;
};
}
#endif
The first part of this file should be the standard header that goes in all the files giving a brief description about what that file is as well as information about the copyright and the author of that file.
Next, after including the necessary header files, we have to ensure that our new function is included within the RevBayesCore
namespace.
Here we are implementing our hyperbolic cosine function as its own class that is derived from the typed function class. This class stores the hyperbolic cosine of a value that is held in a DAG node. We have also defined a clone method which can create a clone of our class, and an update method which will update the value of our Hyperbolic Cosine class whenever the value of the DAG node changes.
The file src/core/functions/math/HyperbolicCosineFunction.cpp
will look like:
#include "HyperbolicCosineFunction.h"
using namespace RevBayesCore;
HyperbolicCosineFunction::HyperbolicCosineFunction(const TypedDagNode<double> *a) : TypedFunction<double>( new double(0.0) ),
x( a )
{
addParameter( a );
}
HyperbolicCosineFunction* HyperbolicCosineFunction::clone( void ) const
{
return new HyperbolicCosineFunction(*this);
}
void HyperbolicCosineFunction::swapParameterInternal(const DagNode *oldP, const DagNode *newP)
{
if (oldP == x)
{
x = static_cast<const TypedDagNode<double>* >( newP );
}
}
void HyperbolicCosineFunction::update( void )
{
// get the current value of x
double xValue = x -> getValue();
// compute the function result
double result = (exp(xValue) + exp(-xValue))/2;
// update the stored value
*value = result;
}
createFunction
In order to use the new RevBayesCore::HyperbolicCosineFunction
class, we need to modify the RevLanguage::Func_hyperbolicCosine
class to use it instead of generic_function_ptr( )
. To do this,
src/revlanguage/functions/math/Func_hyperbolicCosine.cpp
in your editor.#include
statement to make the RevBayesCore::HyperbolicCosineFunction
class visibleFunc_hyperbolicCosine::createFunction()
as follows:...
#include "HyperbolicCosineFunction.h"
...
RevBayesCore::TypedFunction<double>* Func_hyperbolicCosine::createFunction( void ) const
{
RevBayesCore::TypedDagNode<double>* x = static_cast<const Real&>( this->args[0].getVariable()->getRevObject() ).getDagNode();
return new RevBayesCore::HyperbolicCosineFunction( x );
}
Implementing a distribution
Within the RevBayes core directory, there are subdirectories for different categories of distributions.
All mathematical distributions that have been implemented exist in core/distributions/math
.
Note that when implementing a new distribution, you will need to create .cpp
and .h
files in both the revlanguage directory and the core directory. (For a refresher on the difference between these two directories, refer to the Getting familiar with the code section of this Developer’s guide).
The overall naming format remains the same for every distribution in RevBayes. In the Beta Binomial Distribution example provided below, I specify what naming convention to use when creating each file.
It is often helpful to look at/borrow code from existing RevBayes distributions for general help on syntax and organization.
For the language side, one of the most important things is the create distribution function (it converts user-arguments into calculations). Also, the getParameterRules function is important (to get the degrees of freedom & other things). It is often helpful to look at the code of existing distributions for general help on syntax & organization.
Within every new distribution, you will need to include some functions. For example, each new distribution must have: the get class type, name, and help functions. You may not need to implement these from scratch (if they’re dictated by the parent class & are already present), but you will need to implement other functions within your distribution (e.g. cdf, rv, quantile).
Distributions have a prefix DN
(dag node), and all moves have a prefix MV
. RevBayes takes the name within & creates the DN
automatically, so be aware of this. For a refresher on DAG nodes, refer to Getting familiar with the code.
In the following steps, we’ll implement the Beta Binomial Distribution as an example, for syntax purposes.
Create new .cpp & .h files in revlanguage/distributions/math/
, named Dist_betabinomial.cpp
and Dist_betaBinomial.h
.
Note: all files in this directory will follow this naming format: Dist_<nameofdistribution>.[cpp|h]
.
To populate these files, look at existing examples of similar distributions for specific info on what to include & on proper syntax. For example, for the Beta Binomial distribution, I checked the existing Binomial Distribution code for guidance.
//add code here
Create new .cpp
& .h
files in core/distributions/math/
, named BetaBinomialDistribution.cpp
and BetaBinomialDistribution.h
.
Note: This is the object oriented wrapper code that references the functions hard-coded in the next step. All files in this directory will follow this naming format: <NameOfDistribution>Distribution.[cpp|h]
.
Create new .cpp
and .h
files in core/math/Distributions/
, named DistributionBetaBinomial.cpp
and DistributionBetaBinomial.h
.
These are the raw procedural functions in the RevBayes namespace (e.g. pdf, cdf, quantile); they are not derived functions. RbStatistics
is a namespace. To populate these files, look at existing examples of similar distributions to get an idea of what functions to include, what variables are needed, and the proper syntax.
Note: This is the most time-consuming step in the entire process of implementing a new distribution. All files in this directory will follow this naming format: Distribution<NameOfDistribution>.[cpp|h]
Navigate to revlanguage/workspace/RbRegister_Dist.cpp
Every new implementation must be registered in RevBayes. All register files are located in the revlanguage/workspace
directory, and there are different files for the different types of implementations (RbRegister_Func.cpp
is for registering functions; RbRegister_Move
is for registering moves; etc.). We are implementing a distribution, so we will edit the RbRegister_Dist.cpp
file.
You need to have an include statement at the top of the RbRegister
script, to effectively add your distribution to the RevBayes language. You also need to add code to the bottom of this file, and give it a type and a new constructor. Generally, you can look within the file for an idea of proper syntax to use.
For the Beta Binomial distribution, we navigate to the section in the file with the header ‘Distributions’ and then look for the sub-header dealing with ‘math distributions’. Then, add the following line of code:
#include "Dist_betaBinomial.h"
This step registers the header file for the beta binomial distribution, effectively adding it to RevBayes.
Next, navigate to the section of the file that initializes the global workspace. This section defines the workspace class, which houses info on all distributions. Then, add the following line of code:
AddDistribution< Natural >( new Dist_betaBinomial());
This adds the distribution to the workspace. Without this step, the beta binomial distribution will not be added to the revlanguage
.
Note: Depending on the kind of distribution, you may need to change Natural
to a different type (e.g. Probability
, Real
, RealPos
, etc.).
After registering your distribution, you are ready to test your code.
Before pushing your changes, you should ensure your code is working properly.
There are multiple ways to do this. As a best practice, you should first compile it to ensure there are no errors. Once it compiles with no problems, you can test in various ways (e.g. run each individual function within the new Beta Binomial distribution in R, then run the Binomial distribution with a Beta prior in Rev and see if the output matches). For more information, see the developer tutorials on validation and testing. (TODO)
After ensuring your code runs properly, you are ready to add it to the git repo. We recommend reading through the RevBayes Git Workflow section of the Developer’s guide before pushing.
Implementing a Metropolis-Hastings Move
The steps to implementing a new move vary slightly, depending on the move’s type (e.g., Metropolis-Hastings versus Gibbs). For the purpose of this guide, we will focus on a Metropolis-Hastings move.
In general, the fastest and easiest way to get help is to find the most similar move already implemented in RevBayes and use it as a guide. Remember that, as with implementing a new distribution or function, you’ll need to add relevant code to both the core of RevBayes and the language. Also remember that you’ll need to work out the math appropriate for your move (e.g., the Hastings ratio) ahead of time.
Orienting within the repository - For the core, navigate in the repository to src/core/moves
. For a Metropolis-Hastings move, we’ll then go into the proposal
directory. In this directory, you can find several templates for generic proposal classes, as well as subdirectories containing moves for specific parameter types. To keep things easy, we’ll focus on a single scalar parameter, so we’ll navigate one step further into the scalar
directory. For the language, navigate to src/revlanguage/moves
. For this example, as we did in the core, we’ll focus on a move for a scalar parameter, so we’ll then open scalar
. To register our new move after it’s implemented, we’ll also need to update the file src/revlanguage/workspace/RbRegister_Move.cpp
.
Creating new files for the core - As an example, we’ll implement a new move that draws a random value from a Gamma distribution and proposes a new scalar by multiplying the current value by the draw from the Gamma. This move will be called a “Gamma Scaling move”. Since this move is similar to an existing scaling move, we can start by copying the file ScaleProposal.h
and naming the new copy GammaScaleProposal.h
. As a reminder, we’re working in the directory src/core/moves/proposal/scalar/
.
Once the new header file is created and named, we can update the content to match our new move. The simplest changes involve renaming things to match the new move (e.g., updating the preprocessor macro from ScaleProposal_H
to GammaScaleProposal_H
, or changing the name of object references and constructor from ScaleProposal
to GammaScaleProposal
). The comments at the top of the header file that describe how the move works should also be updated, but these changes will obviously be specific to the move being implemented.
Next, we’ll need to create a new .cpp file containing the implementation of our new move. As with the header, it’s easiest to copy and rename an existing file, so we’ll use ScaleProposal.cpp
as our template, copy it, and rename to GammaScaleProposal.cpp
. As with the header file, most of the necessary changes involve updating the names of variables and function names. If the move requires access to other math functions, additional header files may need to be included at the top. Explore src/core/math
as needed to find the necessary functions or distributions. In this case, we will need access to methods associated with the Gamma distribution, so we will add #include "DistributionGamma.h"
. For our example, the number and type of variables used by our move are the same as our template based on the Scale move, so we don’t need to modify the constructor or variable initialization, other than updating the constructor name. Similarly for this example, we don’t need to alter the code in the ::cleanProposal
, ::clone
, ::prepareProposal
, ::printParameterSummary
, ::undoProposal
, ::swapNodeInternal
, and ::tune
methods as these are common to our template and new moves (and will be identical to many of the scalar moves), but we do need to update the class names associated with the methods (i.e., ScaleMove::
-> GammaScaleMove::
). For the ::getProposalName
method, we need to update the string in the method that provides a descriptive name for the move - name = "Gamma Scaling"
. The bulk of the necessary changes for the new move will come in the ::propose
method and the help description above the method. For this example, the new ::propose
method looks like this:
/*
* Perform the proposal.
*
* A gamma scaling proposal draws a random number from a gamma distribution u ~ Gamma(lambda,1) and scales the current vale by u
* lambda is the tuning parameter of the proposal which influences the size of the proposals by changing the shape of the Gamma.
*
* \return The hastings ratio.
*/
double GammaScaleProposal::propose( double &val )
{
// Get random number generator
RandomNumberGenerator* rng = GLOBAL_RNG;
// copy value
storedValue = val;
// Generate new value (no reflection, so we simply abort later if we propose value here outside of support)
double u = RbStatistics::Gamma::rv(lambda,1,*rng);
val *= u;
// compute the Hastings ratio
double ln_hastings_ratio = 0;
try
{
// compute the Hastings ratio
double forward = RbStatistics::Gamma::lnPdf(lambda, 1, u);
double backward = RbStatistics::Gamma::lnPdf(lambda, 1, (1.0/u));
ln_hastings_ratio = backward - forward - log(u); // The -log(u) term is the Jacobian
}
catch (RbException e)
{
ln_hastings_ratio = RbConstants::Double::neginf;
}
return ln_hastings_ratio;
}
In this case, the Hastings ratio involves the probability density of the forward move (the scaling factor u), the density of the corresponding backward move (the scaling factor $\frac{1}{u}$), and the solution to the Jacobian ($-\frac{1}{u}$).
Creating new files for the rev language - After implementing the detailed machinery to perform the new move in the RevBayes core, you need to modify a few files associated with the Rev language to make it available to users. As a reminder, the first set of these files is found in src/revlanguage/moves
. For our example, we will be focusing specifically on a move for a scalar value, so navigate to the scalar
subdirectory. As with the files in the core, we will copy and rename a header (.h) and implementation (.cpp) file from the Scale move. In this case, our new header file will be called Move_GammaScale.h
and our new implementation file will be called Move_GammaScale.cpp
. In the header file, simply update the names of the preprocessor macro, the class, and the objects. Also remember to update the help comments near the top of the file.
In the new .cpp file, begin by updating the names of the header files included at the top. Note that we include, and will need to update, the names of both the header for the core GammaScaleProposal.h
and the workspace Move_GammaScale.h
. Most of the rest of the changes in this file involve updating the names of classes and objects associated with this move, but we also need to update the string specifying the type in the ::getClassType
method and specifying the constructor name in the ::getMoveName
method. Pay special attention to the rules specified in ::getParameterRules
and make sure they satisfy the constraints required by the new move. Update these rules as needed, using existing rules from other moves as templates.
For our particular example, we also need to make one additional change. Because we’ve only updated the ScaleProposal
, and not the ScaleProposalContinuous
, we need to remove the part of the code in this move that could call ScaleProposalContinuous
. The ::constructInternalObject
method now looks like this (compare to the corresponding method from Move_Scale.cpp
):
void Move_GammaScale::constructInternalObject( void )
{
// we free the memory first
delete value;
RevBayesCore::Proposal *p = NULL;
// now allocate a new sliding move
double d = static_cast<const RealPos &>( lambda->getRevObject() ).getValue();
double w = static_cast<const RealPos &>( weight->getRevObject() ).getValue();
double r = static_cast<const Probability &>( tuneTarget->getRevObject() ).getValue();
RevBayesCore::TypedDagNode<double>* tmp = static_cast<const RealPos &>( x->getRevObject() ).getDagNode();
RevBayesCore::StochasticNode<double> *n = dynamic_cast<RevBayesCore::StochasticNode<double> *>( tmp );
p = new RevBayesCore::GammaScaleProposal(n, d, r);
bool t = static_cast<const RlBoolean &>( tune->getRevObject() ).getValue();
value = new RevBayesCore::MetropolisHastingsMove(p, w, t);
}
Updating the Rev language register - The last step in implementing our new move is to make sure that it’s registered in the Rev language. To do this, we will need to update the file RbRegister_Move.cpp
in src/revlanguage/workspace
. In this file, we’ll need to include the header for our new move #include "Move_GammaScale.h"
in the section corresponding to moves on real values. We’ll also need to add the constructor to the workspace by updating the ::initializeMoveGlobalWorkspace
method to include addTypeWithConstructor( new Move_GammaScale() );
, again in the section corresponding to moves on real values.
Testing the performance of the new move - If properly implemented, a new move can be validated by running an MCMC analysis where the clamped data are ignored and one tries to sample only the prior. This can be done in RevBayes by calling model.ignoreAllData()
to mark the data as ignored. The recommended strategy is to implement the simplest possible model that uses a variable of the type appropriate for the new move, and assigning that variable an easily validated prior (e.g., a uniform). Run the analysis with only the new move operating on that variable and then plot the variable’s marginal distribution to make sure it matches the prior.