hmm: implementation of viterbi algorithm (Durbin, 1998) Part 1

January 27, 2012
By

(This article was first published on mintgene » R, and kindly contributed to R-bloggers)

Example in the mentioned book goes as following – dishonest casino uses two types of dice. Fair one that has equal probability of landing on either side (1/6), and the loaded one with 50% chance for getting 6. Your task is to figure out which die has been used (states) just based on the sequence of the outcomes (symbols).

Notice that we convert all probabilities in log scale. Viterbi algorithm selects the most probable path that can have very low value by the end of the run. Thus, conversion to appropriate scale helps avoid calculation issues.

data - http://pastebin.com/NzBE4Fm1


# we'll represent loaded die as "L", and the fair one as "F"
states <- c("F", "L")

# following matrix defines the probability of switching the die
transition.matrix <- t(matrix(data = log2(c(0.95, 0.05, 0.1, 0.9)), nrow = 2, ncol = 2, dimnames = list(c("F", "L"), c("F", "L"))))

# emission probabilities tell you what is the change of landing on each side given that the particular die is selected
emission.matrix <- matrix(data = log2(c(rep(1/6, 6), c(rep(1/10, 5), 1/2))), nrow = 6, ncol = 2, dimnames = list(seq(1,6), c("F", "L")))

# initial probabilities define the chance of starting outcome (in our case we are equally likely to start with either states)
initial.matrix <- matrix(data = log2(c(0.5, 0.5)), nrow = 2, ncol = 1, dimnames = list(c("F", "L"), ""))

After we defined the model, we need to initialize two object that will keep the track of probability history and state path (Pi) during the recursion process.


prob.history  <- data.frame()
state.history <- data.frame()

# we start by calculating the probability of being in particular state given the first symbol and initial matrix
# notice a change in log space - every multiplication is converted to summation
prob.history  <- rbind(prob.history, unlist(lapply(states, function(k) {
    initial.matrix[k, 1] + emission.matrix[symbol.sequence[1], k]
})))

state.history <- rbind(state.history, states)

colnames(prob.history)  <- c("F", "L")
colnames(state.history) <- c("F", "L")

Second part coming soon! mintgene


To leave a comment for the author, please follow the link and comment on his blog: mintgene » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.