Implementing functions, distributions, and moves

Last modified on August 15, 2018

Implementing a function

There are two main classes of functions in RevBayes: member functions and typed functions. Member functions are functions used inside of deterministic nodes and allow access to member methods of a member object. Typed functions are either values within directed acyclic graph (DAG) nodes (i.e. random variables of some distribution), or are associated with a deterministic node. All deterministic nodes hold a function, the value of these deterministic nodes are returned by a call to that function. This has the advantage of simply modifying the value instead of creating a new object.

For our example implementation we will be implementing a typed function. We will begin with a simple example of implementing a mathematical function, the hyperbolic cosine function. First we need to add two files to the RevBayes source code, a HyperbolicCosineFunction.cpp and a HyperbolicCosineFunction.h. These will go within revbayes/src/core/functions/math since we are adding a function and it is a mathematical function.

First, we will write our header file. Within our header file, we need to include a few other RevBayes header files, including ContinousFunction.h since hyperbolic cosine is a continuous function, and TypedDagNode.h since our typed function deals with nodes of DAGs.

#ifndef HyperbolicCosineFunction_h 
#define HyperbolicCosineFunction_h

#include "ContinuousFunction.h"
#include "TypedDagNode.h"
#include <cmath>

namespace RevBayesCore {
    /**
     * \brief Hyperbolic Cosine of a real number.
     *
     * Compute the hyperbolic cosine of a real number x. (cosh(x) = (exp(x) + exp(-x))/2).
     *
     * \copyright (c) Copyright 2009-2018 (GPL version 3)
     * \author <your-name>
     * \since Version 1.0, 2015-01-31
     *
     */
    class HyperbolicCosineFunction : public ContinuousFunction {

        public:
                                          HyperbolicCosineFunction(const TypedDagNode<double> *a); 

            HyperbolicCosineFunction*     clone(void) const; //!< creates a clone
            void                          update(void);      //!< recomputes the value

        protected:
            void                          swapParameterInternal(const DagNode *oldP, const DagNode *newP); //!< Implementation of swapping parameters

        private:
            const TypedDagNode<double>* x;

    };
}


#endif

The first part of this file should be the standard header that goes in all the files giving a brief description about what that file is as well as information about the copyright and the author of that file. Next, after including the necessary header files, we have to ensure that our new function is included within the RevBayesCore namespace. Here we are implementing our hyperbolic cosine function as its own class that is derived from the continuous function class that is derived from the typed function class. This class stores the hyperbolic cosine of a value that is held in a DAG node. We have also defined a clone method which can create a clone of our class, and an update method which will update the value of our Hyperbolic Cosine class whenever the value of the DAG node changes. Now we will move on to the HyperbolicCosineFunction.cpp file.

#include "HyperbolicCosineFunction.h"

using namespace RevBayesCore;

HyperbolicCosineFunction::HyperbolicCosineFunction(const TypedDagNode<double> *a) : ContinuousFunction( new double(0.0) ),
x( a )
{
    addParameter( a );
}


HyperbolicCosineFunction* HyperbolicCosineFunction::clone( void ) const
{
    return new HyperbolicCosineFunction(*this);
}


void HyperbolicCosineFunction::swapParameterInternal(const DagNode *oldP, const DagNode *newP)
{
    
    if (oldP == x)
    {
        x = static_cast<const TypedDagNode<double>* >( newP );
    }
    
}

void HyperbolicCosineFunction::update( void )
{
    
    // recompute and set the internal value
    double myValue = x -> getValue();
    double x1 = exp(myValue) + exp(-myValue);
    *value = 0.5 * x1;

}

Now all we need to do is add the hyperbolic cosine function to the revlanguage side of things so that when we are using Rev we can use our function. First we need to add Func_hyperbolicCosine.cpp and Func_hyperbolicCosine.h to /src/revlanguage/functions/math/. Note that the directory structure of revlanguage is similar to that of the core. The Revlanguage side serves as a wrapper of the function that we just wrote in the core.

The Func_hyperbolicCosine.h should look like the following:


#ifndef Func_hyperbolicCosine_h
#define Func_hyperbolicCosine_h

#include "Real.h"
#include "RlTypedFunction.h"

#include <string>

namespace RevLanguage {
    
    /**
     * The RevLanguage wrapper of the hyperbolic Cosine function (sinh()).
     *
     * The RevLanguage wrapper of the hyperbolic function function connects
     * the variables/parameters of the function and creates the internal HyperbolicCosineFunction object.
     * Please read the HyperbolicCosineFunction.h for more info.
     *
     *
     * @copyright Copyright 2009-
     * @author The RevBayes Development Core Team (<your-name>)
     * @since 2016-09-26, version 1.0
     *
     */
    class Func_hyperbolicCosine :  public TypedFunction<Real> {
        
    public:
        Func_hyperbolicCosine( void );
        
        // Basic utility functions
        Func_hyperbolicCosine*                         clone(void) const;                                          //!< Clone the object
        static const std::string&                       getClassType(void);                                         //!< Get Rev type
        static const TypeSpec&                          getClassTypeSpec(void);                                     //!< Get class type spec
        std::string                                     getFunctionName(void) const;                                //!< Get the primary name of the function in Rev
        const TypeSpec&                                 getTypeSpec(void) const;                                    //!< Get the type spec of the instance
        
        // Function functions you have to override
        RevBayesCore::TypedFunction<double>*            createFunction(void) const;                                 //!< Create internal function object
        const ArgumentRules&                            getArgumentRules(void) const;                               //!< Get argument rules
        
    };
    
}


#endif /* Func_hyperbolicCosine_h */

And the Func_hyperbolicCosine.cpp should look like this:

#include "Func_hyperbolicCosine.h"
#include "HyperbolicCosineFunction.h"
#include "Probability.h"
#include "Real.h"
#include "RlDeterministicNode.h"
#include "TypedDagNode.h"

using namespace RevLanguage;

/** default constructor */
Func_hyperbolicCosine::Func_hyperbolicCosine( void ) : TypedFunction<Real>( )
{
    
}


/**
 * The clone function is a convenience function to create proper copies of inherited objected.
 * E.g. a.clone() will create a clone of the correct type even if 'a' is of derived type 'b'.
 *
 * \return A new copy of the process.
 */
Func_hyperbolicCosine* Func_hyperbolicCosine::clone( void ) const
{
    
    return new Func_hyperbolicCosine( *this );
}


RevBayesCore::TypedFunction<double>* Func_hyperbolicCosine::createFunction( void ) const
{
    
    RevBayesCore::TypedDagNode<double>* x = static_cast<const Real&>( this->args[0].getVariable()->getRevObject() ).getDagNode();
    RevBayesCore::HyperbolicCosineFunction* f = new RevBayesCore::HyperbolicCosineFunction( x );
    
    return f;
}


/* Get argument rules */
const ArgumentRules& Func_hyperbolicCosine::getArgumentRules( void ) const
{
    
    static ArgumentRules argumentRules = ArgumentRules();
    static bool          rules_set = false;
    
    if ( !rules_set )
    {
        
        argumentRules.push_back( new ArgumentRule( "x", Real::getClassTypeSpec(), "The value.", ArgumentRule::BY_CONSTANT_REFERENCE, ArgumentRule::ANY ) );
        
        rules_set = true;
    }
    
    return argumentRules;
}


const std::string& Func_hyperbolicCosine::getClassType(void)
{
    
    static std::string rev_type = "Func_hyperbolicCosine";
    
    return rev_type;
}

/* Get class type spec describing type of object */
const TypeSpec& Func_hyperbolicCosine::getClassTypeSpec(void)
{
    
    static TypeSpec rev_type_spec = TypeSpec( getClassType(), new TypeSpec( Function::getClassTypeSpec() ) );
    
    return rev_type_spec;
}


/**
 * Get the primary Rev name for this function.
 */
std::string Func_hyperbolicCosine::getFunctionName( void ) const
{
    // create a name variable that is the same for all instance of this class
    std::string f_name = "cosh";
    
    return f_name;
}


const TypeSpec& Func_hyperbolicCosine::getTypeSpec( void ) const
{
    
    static TypeSpec type_spec = getClassTypeSpec();
    
    return type_spec;
}

Finally, we need to include the hyperbolic cosine function in the RbRegister_Func.cpp file located in the /revlanguage/workspace/ directory. To do this go to the RbRegister_Func.cpp file and locate the math functions and type #include Func_hyperbolicCosine.h in the correct alphabetical order for that group. Now scroll down in that file until you find the math functions and add the line addFunction( new Func_hyperbolicCosine());

Implementing a distribution

General info before getting started

For the language side, one of the most important things is the create distribution function (it converts user-arguments into calculations). Also, the getParameterRules function is important (to get the degrees of freedom & other things). It is often helpful to look at the code of existing distributions for general help on syntax & organization.

In the following steps, we’ll implement the Beta Binomial Distribution as an example, for syntax purposes.

Steps

  1. Create new .cpp & .h files in revlanguage/distributions/math/, named Dist_betabinomial.cpp and Dist_betaBinomial.h. Note: all files in this directory will follow this naming format: Dist_<nameofdistribution>.[cpp|h].

    To populate these files, look at existing examples of similar distributions for specific info on what to include & on proper syntax. For example, for the Beta Binomial distribution, I checked the existing Binomial Distribution code for guidance.

    //add code here
    
  2. Test
    1. Create new .cpp & .h files in core/distributions/math/, named BetaBinomialDistribution.cpp and BetaBinomialDistribution.h.

      Note: This is the object oriented wrapper code that references the functions hard-coded in the next step. All files in this directory will follow this naming format: <NameOfDistribution>Distribution.[cpp|h].

    2. Create new .cpp and .h files in core/math/Distributions/, named DistributionBetaBinomial.cpp and DistributionBetaBinomial.h.

      These are the raw procedural functions in the RevBayes namespace (e.g. pdf, cdf, quantile); they are not derived functions. RbStatistics is a namespace. To populate these files, look at existing examples of similar distributions to get an idea of what functions to include, what variables are needed, and the proper syntax.

      Note: This is the most time-consuming step in the entire process of implementing a new distribution. All files in this directory will follow this naming format: Distribution<NameOfDistribution>.[cpp|h]

  3. Navigate to revlanguage/workspace/RbRegister_Dist.cpp

    Every new implementation must be registered in RevBayes. All register files are located in the revlanguage/workspace directory, and there are different files for the different types of implementations (RbRegister_Func.cpp is for registering functions; RbRegister_Move is for registering moves; etc.). We are implementing a distribution, so we will edit the RbRegister_Dist.cpp file.

  4. You need to have an include statement at the top of the RbRegister script, to effectively add your distribution to the RevBayes language. You also need to add code to the bottom of this file, and give it a type and a new constructor. Generally, you can look within the file for an idea of proper syntax to use.

    For the Beta Binomial distribution, we navigate to the section in the file with the header ‘Distributions’ and then look for the sub-header dealing with ‘math distributions’. Then, add the following line of code:

    #include "Dist_betaBinomial.h"
    

    This step registers the header file for the beta binomial distribution, effectively adding it to RevBayes.

    Next, navigate to the section of the file that initializes the global workspace. This section defines the workspace class, which houses info on all distributions. Then, add the following line of code:

    AddDistribution< Natural		>( new Dist_betaBinomial());
    

    This adds the distribution to the workspace. Without this step, the beta binomial distribution will not be added to the revlanguage.

    Note: Depending on the kind of distribution, you may need to change Natural to a different type (e.g. Probability, Real, RealPos, etc.).

    After registering your distribution, you are ready to test your code.

  5. Before pushing your changes, you should ensure your code is working properly.

    There are multiple ways to do this. As a best practice, you should first compile it to ensure there are no errors. Once it compiles with no problems, you can test in various ways (e.g. run each individual function within the new Beta Binomial distribution in R, then run the Binomial distribution with a Beta prior in Rev and see if the output matches). For more information, see the developer tutorials on validation and testing. (TODO)

    After ensuring your code runs properly, you are ready to add it to the git repo. We recommend reading through the Git Workflow Best Practices section of the Developer’s guide before pushing.

Implementing a Metropolis-Hastings Move

General info before getting started

The steps to implementing a new move vary slightly, depending on the move’s type (e.g., Metropolis-Hastings versus Gibbs). For the purpose of this guide, we will focus on a Metropolis-Hastings move.

In general, the fastest and easiest way to get help is to find the most similar move already implemented in RevBayes and use it as a guide. Remember that, as with implementing a new distribution or function, you’ll need to add relevant code to both the core of RevBayes and the language. Also remember that you’ll need to work out the math appropriate for your move (e.g., the Hastings ratio) ahead of time.

Steps

  1. Orienting within the repository - For the core, navigate in the repository to src/core/moves. For a Metropolis-Hastings move, we’ll then go into the proposal directory. In this directory, you can find several templates for generic proposal classes, as well as subdirectories containing moves for specific parameter types. To keep things easy, we’ll focus on a single scalar parameter, so we’ll navigate one step further into the scalar directory. For the language, navigate to src/revlanguage/moves. For this example, as we did in the core, we’ll focus on a move for a scalar parameter, so we’ll then open scalar. To register our new move after it’s implemented, we’ll also need to update the file src/revlanguage/workspace/RbRegister_Move.cpp.

  2. Creating new files for the core - As an example, we’ll implement a new move that draws a random value from a Gamma distribution and proposes a new scalar by multiplying the current value by the draw from the Gamma. This move will be called a “Gamma Scaling move”. Since this move is similar to an existing scaling move, we can start by copying the file ScaleProposal.h and naming the new copy GammaScaleProposal.h. As a reminder, we’re working in the directory src/core/moves/proposal/scalar/.

    Once the new header file is created and named, we can update the content to match our new move. The simplest changes involve renaming things to match the new move (e.g., updating the preprocessor macro from ScaleProposal_H to GammaScaleProposal_H, or changing the name of object references and constructor from ScaleProposal to GammaScaleProposal). The comments at the top of the header file that describe how the move works should also be updated, but these changes will obviously be specific to the move being implemented.

    Next, we’ll need to create a new .cpp file containing the implementation of our new move. As with the header, it’s easiest to copy and rename an existing file, so we’ll use ScaleProposal.cpp as our template, copy it, and rename to GammaScaleProposal.cpp. As with the header file, most of the necessary changes involve updating the names of variables and function names. If the move requires access to other math functions, additional header files may need to be included at the top. Explore src/core/math as needed to find the necessary functions or distributions. In this case, we will need access to methods associated with the Gamma distribution, so we will add #include "DistributionGamma.h". For our example, the number and type of variables used by our move are the same as our template based on the Scale move, so we don’t need to modify the constructor or variable initialization, other than updating the constructor name. Similarly for this example, we don’t need to alter the code in the ::cleanProposal, ::clone, ::prepareProposal, ::printParameterSummary, ::undoProposal, ::swapNodeInternal, and ::tune methods as these are common to our template and new moves (and will be identical to many of the scalar moves), but we do need to update the class names associated with the methods (i.e., ScaleMove:: -> GammaScaleMove::). For the ::getProposalName method, we need to update the string in the method that provides a descriptive name for the move - name = "Gamma Scaling". The bulk of the necessary changes for the new move will come in the ::propose method and the help description above the method. For this example, the new ::propose method looks like this:

    /*
     * Perform the proposal.
     *
     * A gamma scaling proposal draws a random number from a gamma distribution u ~ Gamma(lambda,1) and scales the current vale by u
     * lambda is the tuning parameter of the proposal which influences the size of the proposals by changing the shape of the Gamma.
     *
     * \return The hastings ratio.
     */
    double GammaScaleProposal::propose( double &val )
    {
        
        // Get random number generator
        RandomNumberGenerator* rng     = GLOBAL_RNG;
        
        // copy value
        storedValue = val;
        
        // Generate new value (no reflection, so we simply abort later if we propose value here outside of support)
        double u = RbStatistics::Gamma::rv(lambda,1,*rng);
        val *= u;
        
        // compute the Hastings ratio
        double ln_hastings_ratio = 0;
        try
        {
            // compute the Hastings ratio
            double forward = RbStatistics::Gamma::lnPdf(lambda, 1, u);
            double backward = RbStatistics::Gamma::lnPdf(lambda, 1, (1.0/u));
            
            ln_hastings_ratio = backward - forward - log(u);    // The -log(u) term is the Jacobian
        }
        catch (RbException e)
        {
            ln_hastings_ratio = RbConstants::Double::neginf;
        }
    
        return ln_hastings_ratio;
    }
    

    In this case, the Hastings ratio involves the probability density of the forward move (the scaling factor u), the density of the corresponding backward move (the scaling factor $\frac{1}{u}$), and the solution to the Jacobian ($-\frac{1}{u}$).

  3. Creating new files for the rev language - After implementing the detailed machinery to perform the new move in the RevBayes core, you need to modify a few files associated with the Rev language to make it available to users. As a reminder, the first set of these files is found in src/revlanguage/moves. For our example, we will be focusing specifically on a move for a scalar value, so navigate to the scalar subdirectory. As with the files in the core, we will copy and rename a header (.h) and implementation (.cpp) file from the Scale move. In this case, our new header file will be called Move_GammaScale.h and our new implementation file will be called Move_GammaScale.cpp. In the header file, simply update the names of the preprocessor macro, the class, and the objects. Also remember to update the help comments near the top of the file.

    In the new .cpp file, begin by updating the names of the header files included at the top. Note that we include, and will need to update, the names of both the header for the core GammaScaleProposal.h and the workspace Move_GammaScale.h. Most of the rest of the changes in this file involve updating the names of classes and objects associated with this move, but we also need to update the string specifying the type in the ::getClassType method and specifying the constructor name in the ::getMoveName method. Pay special attention to the rules specified in ::getParameterRules and make sure they satisfy the constraints required by the new move. Update these rules as needed, using existing rules from other moves as templates.

    For our particular example, we also need to make one additional change. Because we’ve only updated the ScaleProposal, and not the ScaleProposalContinuous, we need to remove the part of the code in this move that could call ScaleProposalContinuous. The ::constructInternalObject method now looks like this (compare to the corresponding method from Move_Scale.cpp):

    void Move_GammaScale::constructInternalObject( void )
    {
        // we free the memory first
        delete value;
        
        RevBayesCore::Proposal *p = NULL;
        
        // now allocate a new sliding move
        double d = static_cast<const RealPos &>( lambda->getRevObject() ).getValue();
        double w = static_cast<const RealPos &>( weight->getRevObject() ).getValue();
        double r = static_cast<const Probability &>( tuneTarget->getRevObject() ).getValue();
        RevBayesCore::TypedDagNode<double>* tmp = static_cast<const RealPos &>( x->getRevObject() ).getDagNode();
        RevBayesCore::StochasticNode<double> *n = dynamic_cast<RevBayesCore::StochasticNode<double> *>( tmp );
        p = new RevBayesCore::GammaScaleProposal(n, d, r);
        bool t = static_cast<const RlBoolean &>( tune->getRevObject() ).getValue();
        
        value = new RevBayesCore::MetropolisHastingsMove(p, w, t);
        
    }
    
  4. Updating the Rev language register - The last step in implementing our new move is to make sure that it’s registered in the Rev language. To do this, we will need to update the file RbRegister_Move.cpp in src/revlanguage/workspace. In this file, we’ll need to include the header for our new move #include "Move_GammaScale.h" in the section corresponding to moves on real values. We’ll also need to add the constructor to the workspace by updating the ::initializeMoveGlobalWorkspace method to include addTypeWithConstructor( new Move_GammaScale() );, again in the section corresponding to moves on real values.

  5. Testing the performance of the new move - If properly implemented, a new move can be validated by running an MCMC analysis where the data are ignored and one tries to sample only the prior. This can be done in RevBayes by adding the underPrior=TRUE option as an argument to the .run() method of an mcmc object. The recommended strategy is to implement the simplest possible model that uses a variable of the type appropriate for the new move, and assigning that variable an easily validated prior (e.g., a uniform). Run the analysis with only the new move operating on that variable and then plot the variable’s marginal distribution to make sure it matches the prior.

Implementing a help entry

Adding basic help sections

For each new function, distrubution or move you add in RevBayes you should always create a corresponding help entry.

As an example we will look at the documentation associated with the normal distributon function. The functions used to generate the help entry are in the Dist_norm.cpp and Dist_norm.h files in the directory /revlanguage/distributions/math.

The basic help fields and assocaited functions are listed in the table below and are output to screen in the same order.

Help section Association RevBayes function
Title getHelpTitle
Aliases getDistributionFunctionAliases
  getFunctionNameAliases
  getMoveAliases
Description getHelpDescription
Constructor (autogenerated)
Return Type (autogenerated; only Functions, Distributions)
Details getHelpDetails
Example getHelpExample
Methods (autogenerated)
Authors getHelpAuthor
References getHelpReferences
See Also getHelpSeeAlso
Help fields and associatd functions

Note the Constructor, Return Type and Methods fields are generated automatically. All the other fields must be populated manually.

The help functions are protected functions included in the header file Dist_norm.h.

protected:
    
  std::vector<std::string>                        getHelpAuthor(void) const;  //!< Get the author(s) of this function
  std::string                                     getHelpDescription(void) const;  //!< Get the description for this function
  std::string                                     getHelpDetails(void) const; //!< Get the more detailed description of the function
  std::string                                     getHelpExample(void) const; //!< Get an executable and instructive example
  std::vector<RevBayesCore::RbHelpReference>      getHelpReferences(void) const; //!< Get some references/citations for this function
  std::vector<std::string>                        getHelpSeeAlso(void) const; //!< Get suggested other functions
  std::string                                     getHelpTitle(void) const; //!< Get the title of this help entry

The information for the helps fields should be filled out in the source file Dist_norm.cpp.

/**
 * Get the author(s) of this function so they can receive credit (and blame) for it.
 */
std::vector<std::string> Dist_norm::getHelpAuthor(void) const
{
    // create a vector of authors for this function
    std::vector<std::string> authors;
    authors.push_back( "Sebastian Hoehna" );

    return authors;
}


/**
 * Get the (brief) description for this function
 */
std::string Dist_norm::getHelpDescription(void) const
{
    // return the description
    return "Normal (gaussian) distribution with mean equal to ‘mean’ and standard deviation equal to ‘sd’.";
}


/**
 * Get the more detailed description of the function
 */
std::string Dist_norm::getHelpDetails(void) const
{
  // create a details variable
    std::string details;
    details += "The normal distribution has density:\n\n";
    details += "f(x) = 1/(sqrt(2 pi) sigma) e^-((x - mu)^2/(2 sigma^2))\n\n";
    details += "where mu is the mean of the distribution and sigma the standard deviation.";

    return details;
}


/**
 * Get an executable and instructive example.
 * These example should help the users to show how this function works but
 * are also used to test if this function still works.
 */
std::string Dist_norm::getHelpExample(void) const
{
    // create an example as a single string variable.
    std::string example = "";

    example += "# we simulate some observations\n";
    example += "x <- rnorm(n=10,mean=5,sd=10)\n";
    example += "# let's see what the minimum is (you could do the max too)\n";
    example += "min(x)\n";
    example += "# let's also see what the mean and the variance are\n";
    example += "mean(x)\n";
    example += "var(x)\n";
    example += "sd(x)\n";

    return example;
}


/**
 * Get some references/citations for this function
 *
 */
std::vector<RevBayesCore::RbHelpReference> Dist_norm::getHelpReferences(void) const
{
    // create an entry for each reference
    std::vector<RevBayesCore::RbHelpReference> references;


    return references;
}


/**
 * Get the names of similar and suggested other functions
 */
std::vector<std::string> Dist_norm::getHelpSeeAlso(void) const
{
    // create an entry for each suggested function
    std::vector<std::string> see_also;
    see_also.push_back( "dnLognormal" );


    return see_also;
}


/**
 * Get the title of this help entry
 */
std::string Dist_norm::getHelpTitle(void) const
{
    // create a title variable
    std::string title = "Normal Distribution";

    return title;
}

Once you recompile RevBayes, be careful to double check the screen output!