## Fitting models to short time series

March 3, 2014
Following my post on fitting models to long time series, I thought I’d tackle the opposite problem, which is more common in business environments. I often get asked how few data points can be used to fit a time series model. As with almost all sample size questions, there is no easy answer. It depends on the number of model parameters...

## Useful Functions in R for Manipulating Text Data

Introduction: In my current job, I study HIV at the genetic and biochemical levels. Thus, I often work with data involving the sequences of nucleotides or amino acids of various patient samples of HIV, and this type of work involves a lot of manipulating text. (Strictly speaking, I analyze sequences of nucleotides from DNA that are reverse-transcribed from
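To give a flavour of the kind of text manipulation the excerpt refers to, here is a minimal sketch in base R. The sequence `seq_dna` is made up for illustration and is not from the original post; the functions used (`nchar`, `substr`, `strsplit`, `table`, `gsub`, `substring`) are standard base R, but whether the post covers these particular ones is an assumption.

```r
# Hypothetical nucleotide sequence, invented for this example
seq_dna <- "ATGGCGTACGTT"

n_bases <- nchar(seq_dna)               # number of characters in the string
first3  <- substr(seq_dna, 1, 3)        # extract characters 1 to 3
bases   <- strsplit(seq_dna, "")[[1]]   # split into single characters
counts  <- table(bases)                 # frequency of each base
rna     <- gsub("T", "U", seq_dna)      # replace every T with U

# Cut the string into consecutive, non-overlapping triplets
starts <- seq(1, n_bases - 2, by = 3)
codons <- substring(seq_dna, starts, starts + 2)

# Glue the pieces back together
rejoined <- paste(codons, collapse = "")
```

`substring` is vectorised over its start and stop arguments, which is why one call is enough to produce all triplets.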

## Job Trends in the Analytics Market: New, Improved, now Fortified with C, Java, MATLAB, Python, Julia and Many More!

February 24, 2014
I’m expanding the coverage of my article, The Popularity of Data Analysis Software. This is the first installment, which includes a new opening and a greatly expanded analysis of the analytics job market. Here it is, from the abstract onward …

## The forecast mean after back-transformation

February 24, 2014
Many functions in the forecast package for R will allow a Box-Cox transformation. The models are fitted to the transformed data and the forecasts and prediction intervals are back-transformed. This preserves the coverage of the prediction intervals, and the back-transformed point forecast can be considered the median of the forecast densities (assuming the forecast densities on the transformed scale...
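The median-versus-mean point can be checked with a small simulation. This is only a sketch under stated assumptions: it takes the log transformation (Box-Cox parameter 0) and assumes a normal forecast density on the transformed scale, with made-up values for `mu` and `sigma`. It uses base R only, not the forecast package.

```r
set.seed(1)

# Assumed forecast density on the log scale: Normal(mu, sigma^2).
# mu and sigma are illustrative values, not taken from the post.
mu    <- 2
sigma <- 0.8

z <- rnorm(1e5, mean = mu, sd = sigma)  # draws on the transformed scale
y <- exp(z)                             # back-transformed draws

point_forecast <- exp(mu)               # back-transformed point forecast
sim_median     <- median(y)             # close to point_forecast
sim_mean       <- mean(y)               # noticeably larger

# For the log case, the mean of the back-transformed density is known exactly
lognormal_mean <- exp(mu + sigma^2 / 2)
```

Because `exp` is monotonic, quantiles map straight through the back-transformation (which is why interval coverage is preserved), but the mean does not: under these assumptions it is pulled up by the right tail.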

## Brief introduction on Sweave and Knitr for reproducible research

February 24, 2014
A few weeks ago I gave a presentation on using Sweave and Knitr under the guise of promoting reproducible research. I humbly offer this presentation to the blog with full knowledge that there are already loads of tutorials available online. This presentation is specific and slightly biased towards Windows OS, so it probably has limited

## Forecasting within limits

February 21, 2014
It is common to want forecasts to be positive, or to require them to be within some specified range. Both of these situations are relatively easy to handle using transformations. Positive forecasts: To impose a positivity constraint, simply work on the log scale. With the forecast package in R, this can be handled by specifying the Box-Cox parameter...
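The log-scale idea can be sketched by hand in base R. This is not the forecast-package route the excerpt mentions; it is a minimal illustration that fits a model to `log(x)` and exponentiates the results. The series `x`, the ARIMA(0,1,1) model and the horizon of 10 are all assumptions chosen for the example.

```r
set.seed(42)

# Made-up strictly positive series for illustration
x <- ts(exp(cumsum(rnorm(60, sd = 0.3))))

# Fit on the log scale (model order is an arbitrary choice for the sketch)
fit  <- arima(log(x), order = c(0, 1, 1))
pred <- predict(fit, n.ahead = 10)

# Back-transform the point forecasts and approximate 95% limits
point <- exp(pred$pred)
lower <- exp(pred$pred - 1.96 * pred$se)
upper <- exp(pred$pred + 1.96 * pred$se)
```

Since `exp` of any real number is positive, the point forecasts and both interval limits are guaranteed to be above zero, however wide the interval on the log scale becomes.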

## Backcasting in R

February 19, 2014
Sometimes it is useful to “backcast” a time series, that is, to forecast in reverse time. Although there are no in-built R functions to do this, it is very easy to implement. Suppose x is our time series and we want to backcast for h periods. Here is some code that should work for most univariate time series. The example...
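The original code is not reproduced in this excerpt, so the following is only a sketch of the general idea under my own assumptions: reverse the series, forecast it with an ordinary model, then reverse the forecasts and attach time indices that precede the start of the data. The series `x`, the horizon `h` and the ARIMA(0,1,1) model are illustrative choices, and base R's `arima` and `predict` stand in for whatever the post actually uses.

```r
set.seed(123)

# Made-up monthly series starting January 2010
x <- ts(cumsum(rnorm(48)) + 50, start = c(2010, 1), frequency = 12)
h <- 6  # number of periods to backcast (assumed)

# 1. Reverse the series so that "the future" is really the past
rev_x <- ts(rev(x), frequency = frequency(x))

# 2. Forecast the reversed series (model order chosen arbitrarily)
fit <- arima(rev_x, order = c(0, 1, 1))
fc  <- predict(fit, n.ahead = h)

# 3. Reverse the forecasts and date them so they end one period before x starts
backcast <- ts(rev(fc$pred),
               end = tsp(x)[1] - 1 / frequency(x),
               frequency = frequency(x))

# 4. Optionally join backcasts and data into one series
extended <- ts(c(backcast, x),
               start = tsp(backcast)[1],
               frequency = frequency(x))
```

Prediction standard errors from `fc$se` can be reversed in the same way if intervals are needed for the backcast periods.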

## Voting Twice in France

February 19, 2014
On the Monkey Cage blog, Baptiste Coulmont (a.k.a. @coulmont) recently uploaded a post entitled “You can vote twice! The many political appeals of proxy votes in France”, coauthored with Joël Gombin (a.k.a. @joelgombin), and myself. The study was initially written in French as mentioned in a previous post. Baptiste posted additional information on his blog (http://coulmont.com/blog/…) and I also wanted to post some lines of code,...

## Global energy forecasting competitions

February 19, 2014
The 2012 GEFCom competition was a great success, with several innovative forecasting methods introduced. These have been published in the IJF as follows: Hong, Pinson and Fan: Global Energy Forecasting Competition 2012; Charlton and Singleton: A refined parametric model for short term load forecasting; Lloyd: GEFCom2012 hierarchical load forecasting: Gradient boosting machines and Gaussian processes; Nedelec, Cugliari and Goude: GEFCom2012: Electric...

