Rev Language Reference


dnPhyloCTMC - Distribution of a phylogenetic continuous-time Markov chain

Gives the probability distribution of the character state vectors at the leaves of a phylogenetic tree, given a phylogenetic continuous-time Markov chain model.

Usage

dnPhyloCTMC(Tree tree, RateGenerator Q, Simplex rootFrequencies, RealPos branchRates, Simplex siteMatrices, RealPos[] siteRates, Simplex siteRatesProbs, Probability pInv, Probability observationErrorProbability, Simplex observationErrorFrequencies, Natural nSites, String type, Bool treatAmbiguousAsGap, String coding, Bool storeInternalNodes, Bool gapMatchClamped)

Arguments

tree : Tree (pass by const reference)
The tree along which the process evolves.
Q : RateGenerator (pass by const reference)
The global, branch-specific or site-mixture rate matrices.
rootFrequencies : Simplex (pass by const reference)
The root-specific frequencies of the characters, if applicable.
Default : NULL
branchRates : RealPos (pass by const reference)
The global or branch-specific rate multipliers.
Default : 1
siteMatrices : Simplex (pass by const reference)
Simplex of site matrix mixture probabilities. Treats Q as a vector of site mixture categories instead of branch-specific matrices.
Default : NULL
siteRates : RealPos[] (pass by const reference)
The rate categories for the sites.
Default : NULL
siteRatesProbs : Simplex (pass by const reference)
The probability weights of rate categories for the sites.
Default : NULL
pInv : Probability (pass by const reference)
The probability of a site being invariant.
Default : NULL
observationErrorProbability : Probability (pass by const reference)
The observational error probability.
Default : NULL
observationErrorFrequencies : Simplex (pass by const reference)
The frequencies from which the recorded state is drawn when an observation error occurs.
Default : NULL
nSites : Natural (pass by value)
The number of sites, used for simulation.
Default : 0
type : String (pass by value)
The data type, used for simulation and initialization.
Default : DNA
Options : DNA|RNA|AA|Codon|Doublet|PoMo|Protein|Standard|NaturalNumbers|Binary|Restriction
treatAmbiguousAsGap : Bool (pass by value)
Should we treat ambiguous characters as gaps/missing?
Default : FALSE
coding : String (pass by value)
What character patterns have been sampled?
Default : all
Options : all|noabsencesites|nopresencesites|informative|variable|nosingletonpresence|nosingletonabsence|nosingletons
storeInternalNodes : Bool (pass by value)
Should we store internal node states in the character matrix?
Default : FALSE
gapMatchClamped : Bool (pass by value)
Should we set the simulated character to be gap or missing if the corresponding character in the clamped matrix is gap or missing?
Default : TRUE

Domain Type

Details

The parameters of a phylogenetic model (a tree topology with branch lengths, a substitution model that describes how observations evolve over the tree, etc.) collectively form a distribution called the _phylogenetic continuous-time Markov chain_.

The likelihood of observed character state vectors (specified by clamping the distribution to an `AbstractHomologousDiscreteCharacterData` object) is computed using Felsenstein's pruning algorithm, with partial likelihoods stored for each branch of the tree. It is automatically written to the `Likelihood` column of the `mnFile()` and `mnScreen()` monitors (this can be suppressed with `likelihood = FALSE`).

Optionally, an observation error model can be applied to account for scoring ambiguity (e.g., in morphological datasets). This model distinguishes between the true biological state and the recorded score. When `observationErrorProbability` (epsilon) is greater than 0, the tip likelihoods are initialized as a mixture: with probability (1 - epsilon), the score is accurate; with probability epsilon, the score is drawn from the distribution defined by `observationErrorFrequencies`.

For more details, see the tutorials on [graphical models](https://revbayes.github.io/tutorials/intro/graph_models) and on [specifying a phylogenetic continuous-time Markov chain model](https://revbayes.github.io/tutorials/ctmc/).
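
The observation error mixture can be written out explicitly. The following is a sketch derived only from the description in this section, not from the RevBayes source code; the symbols are notation introduced here for illustration. Let $s$ be the true state at a tip, $o$ the recorded score, $\epsilon$ the value of `observationErrorProbability`, and $\pi_o$ the entry of `observationErrorFrequencies` for state $o$. The tip likelihood for true state $s$ is then

$$
P(o \mid s) = (1 - \epsilon)\,\mathbb{1}[o = s] + \epsilon\,\pi_o
$$

For example, for a binary character with $\epsilon = 0.1$ and $\pi = (0.5, 0.5)$, a recorded score of 1 gives $P(1 \mid s = 1) = 0.9 + 0.05 = 0.95$ and $P(1 \mid s = 0) = 0.05$. With $\epsilon = 0$ the expression reduces to the usual indicator on the recorded state. How ambiguous or missing scores are treated is not covered by this sketch.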

Example

# Read character data from a file
chars <- readDiscreteCharacterData("myData.nex")
taxa = chars.taxa()

# Draw a tree with branch lengths
tree ~ dnUniformTopologyBranchLength( taxa, branchLengthDistribution=dnExp(10.0) )

# Define a rate matrix
q_matrix <- fnJC(4)

# Create stochastic node with the tip distribution given by `tree` and `q_matrix`
x ~ dnPhyloCTMC(tree = tree, Q = q_matrix)

# Clamp observed characters to the node
x.clamp(chars)

# Calculate the probability of the observed characters under the given distribution
x.lnProbability()

# Simulate characters
sim ~ dnPhyloCTMC(tree = tree, Q = q_matrix, nSites = 24)

# Print simulated characters to screen
sim.show()

# Write dataset to file
writeNexus("simulatedData.nex", sim)
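
# Sketch (reuses the `tree`, `q_matrix` and `chars` objects defined above):
# add among-site rate variation and invariant sites through the optional
# `siteRates` and `pInv` arguments. The variable names, the priors and the
# choice of four gamma categories are illustrative assumptions, not
# recommendations.
alpha ~ dnExp(1.0)
site_rates := fnDiscretizeGamma(alpha, alpha, 4)
p_inv ~ dnBeta(1.0, 1.0)

# Create a second stochastic node that uses the extended model
y ~ dnPhyloCTMC(tree = tree, Q = q_matrix, siteRates = site_rates, pInv = p_inv, type = "DNA")

# Clamp the observed characters and evaluate their probability
y.clamp(chars)
y.lnProbability()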