(This article was first published on

**Shifting sands**, and kindly contributed to R-bloggers)This is an example from Zucchini & MacDonald’s book on Hidden Markov Models for Time Series (exercise 1.3).

The data is annual counts of earthquakes of magnitude 7 or greater, which exhibits both overdispersion for a Poisson (where the mean should equal the variance) as well as serial dependence.

The aim is to fit a mixture of m independent Poisson distributions to this data, using the non-linear minimizer nlm to minimize the negative log-likelihood of the Poisson mixture.

Sounds easy right? I have not done much optimisation stuff so it took me longer than it probably should have. There’s little better for the ego then getting stuck on things by page 10.

Constrained to unconstrained

There are two sets of parameters, the lambda parameters to the Poisson distributions and the delta mixing distribution, the latter giving the proportions of each distribution to mix.

The Poisson parameters must be greater than zero and there are m of them. The mixing distribution values must all sum to one, and the first mixing value is implied by the m-1 subsequent values.

These are the constraints (lambda’s greater than zero, delta’s sum to one), and there are 2m – 1 values (m lambdas, m-1 deltas). We need to transform these to unconstrained values for use with nlm (becoming eta and tau respectively).

The formulas for transformation are reproduced from the book here

These transformed values are combined in to one parameter vector and passed to nlm. We also pass a function that calculates the negative log likelihood.

In the function, we first convert the parameters from the unconstrained “working” form to the constrained “natural” form using the inverse transform, and then use these values to do the likelihood calculation.

Outro

Now that I look over the code and write this out it does seem all fairly straightforward and relatively trivial, but it was new to me and took a while, so I figured other people might find the example useful as well.

The code has three examples for the cases m = 2, 3, and 4. Obviously feel free to mess around with the initial parameter estimates to see what comes up.

It does seem to match the fitted models from the book, which I have copied here:

For all the details, check out the book, chapter 1!

To

**leave a comment**for the author, please follow the link and comment on their blog:**Shifting sands**.R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...