steppingStoneSampler - Stepping-stone marginal likelihood estimation
Applies the stepping-stone technique (Fan et al. 2011; Xie et al. 2011) to
estimate the marginal likelihood of a model from a series of unnormalized
"power posterior" densities in which the likelihood term is raised to a power
B between 0 and 1.
steppingStoneSampler(
String filename,
String powerColumnName,
String likelihoodColumnName,
String separator)
| filename : |
String (pass by value) |
| The name of the file where the likelhood samples are stored. |
| powerColumnName : |
String (pass by value) |
| The name of the column of the powers. |
| likelihoodColumnName : |
String (pass by value) |
| The name of the column of the likelihood samples. |
| separator : |
String (pass by value) |
| The field separator character. Values on each line of the file are separated by this character. If sep = "" the separator is 'white space', that is one or more spaces, tabs, newlines or carriage returns. |
For K power posteriors starting with the posterior proper (B_1 = 1) and ending
with the prior (B_K = 0), the stepping-stone marginal likelihood estimate r_SS
is equal to:
r_SS = \prod_{k=1}^K r_{SS,k}
where
r_{SS,k} = (1/n) * L_{max,k}^(B_{k-1} - B_k) * \sum_{i=1}^n (L_{i,k - 1} / L_{max,k})^(B_{k-1} - B_k)
where n denotes the number of MCMC samples from each power posterior,
L_{max,k} denotes the highest likelihood sampled for the k-th power posterior,
and L_{k,i} denotes the likelihood of the i-th MCMC sample from the k-th power
posterior (Xie et al. 2011: 153). If we instead assume that L_{max,k} and
L_{i,k} are given on the log scale, and that r_SS is to be expressed on the
log scale as well, this yields:
r_SS = \sum_{k=1}^K L_{max,k} * (B_k - B_{k-1}) + log( \sum_{i=1}^n exp( (L_{i,k-1} - L_{max,k}) * (B_{k-1} - B_k) ) / n )
which is how the method is implemented in RevBayes. The estimate is returned by
calling the `.marginal()` method on the sampler object. We can also compute the
standard error of this estimate in two different ways using the `.stdError()`
method. By default, the method uses the delta approximation suggested by Xie et
al. (2011: 153):
SE = sqrt( \sum_{k=1}^K 1/r_{SS,k}^2 * Var( r_{SS,k} )/N_{eff,k} )
where N_{eff,k} is the effective sample size for the k-th power posterior. The
approximation may be unreliable if any individual summand exceeds ~0.1, which
may occur when the sample size for a given power posterior is too low. This is
common in analyses employing a large number of relatively short power posterior
runs. In these cases, it may be preferable to derive the standard error using
the stationary bootstrap method (Politis & Romano 1994), which divides the log
likelihood values from each power posterior into consecutive blocks of random
size (drawn from a geometric distribution) and resamples them with replacement.
The RevBayes implementation of the stationary block bootstrap method is based
on the mcmc3r R package (Álvarez-Carretero et al. 2022), and can be accessed by
calling `.stdError(bootstrap=TRUE)`.
# Create a simple model (unclamped)
a ~ dnExponential(1)
mymodel = model(a)
# Create a move vector and a monitor vector
moves[1] = mvScale(a, lambda = 1.0, weight = 1.0)
monitors[1] = mnFile(a, filename = "output/out.log")
# Run 64 power posterior analyses, each with 100 samples
pow_p = powerPosterior(mymodel, monitors, moves, "output/out.pp", cats=64, sampleFreq=1)
pow_p.run(generations=200, burninFraction=0.5)
# Create a stepping-stone sampler object from a file that RevBayes
# automatically generated by concatenating the likelihoods from
# individual power posteriors
ss = steppingStoneSampler(file="output/out.pp", powerColumnName="power",
likelihoodColumnName="likelihood")
# Compute the marginal likelihood
ss.marginal()
# Compute the standard error
ss.stdError() # delta approximation
ss.stdError(bootstrap=TRUE) # bootstrap (100 replicates)
ss.stdError(bootstrap=TRUE, replicates=50, printFiles=TRUE) # bootstrap (50 replicates, save output)