The MDP2
package for R solves Markov
decision processes (MDPs) with discrete time steps, states, and actions.
Traditional MDPs (Puterman 1994),
semi-Markov decision processes (semi-MDPs) (Tijms 2003),
and hierarchical MDPs (HMDPs) (Kristensen and Jørgensen 2000) can all be solved
under both finite and infinite time horizons.
Building and solving an MDP is done in two steps. First, the MDP is built and saved in a set of binary files. Next, the MDP is loaded into memory from the binary files and various algorithms are applied to the model.
The package implements well-known algorithms such as policy iteration
and value iteration under different criteria, e.g. average reward per
time unit and expected total discounted reward. The model is stored
using an underlying data structure based on the state-expanded
directed hypergraph of the MDP (Nielsen and
Kristensen 2006), implemented in C++ for fast
running times.
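The two-step workflow might look as follows; this is a minimal sketch assuming a set of binary model files with prefix `"mdp_"` has already been built (see the building vignette), and that `loadMDP`, `runPolicyIteAve`, and `getPolicy` are the loader, average-reward policy iteration solver, and policy accessor as described in the package documentation — check the vignettes for the exact signatures:

```r
library(MDP2)

# Step 2 of the workflow: load a previously built model from its
# binary files (prefix "mdp_" is an assumption for this sketch).
mdp <- loadMDP("mdp_")

# Solve under the average reward per time unit criterion using
# policy iteration; the weight labels are assumed to match those
# defined when the model was built.
runPolicyIteAve(mdp, wLbl = "Net reward", durLbl = "Duration")

# Inspect the resulting optimal policy.
getPolicy(mdp)
```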
To illustrate the package's capabilities, have a look at the vignettes:

- Building MDP models: `vignette("building")`
- Solving an infinite-horizon MDP (semi-MDP): `vignette("infinite-mdp")`
- Solving a finite-horizon MDP (semi-MDP): `vignette("finite-mdp")`
- Solving an infinite-horizon HMDP: `vignette("infinite-hmdp")`