Monthly Archives: January 2014

R for spatial analysis tutorial + video

January 30, 2014

On 24th January 2014 I ran a one-day practical course, "Introduction to Spatial Data Visualisation in R", at the University of Leeds, with the help of demonstrators Rachel Oldroyd and Alistair Leak, who came up from London for the event. The course is designed for people completely new to R, who are especially interested in its spatial...

Read more »

Introducing the ecoengine package

January 30, 2014

Natural history museums have long been valuable repositories of data on species diversity. These data have been critical for fostering and shaping the development of fields such as biogeography and systematics. These data repositories are becoming increasingly important, especially in the context of climate change, where a strong understanding of how species responded to past...

Read more »

Free books on statistical learning

January 29, 2014

Hastie, Tibshirani and Friedman’s Elements of Statistical Learning first appeared in 2001 and is already a classic. It is my go-to book when I need a quick refresher on a machine learning algorithm. I like it because it is written using the language and perspective of statistics, and provides a very useful entry point into the literature of machine...

Read more »

NYT’s 4th Down Bot gives the Super Bowl edge to the Broncos

January 29, 2014

Who will win the Super Bowl this Sunday: Seattle or Denver? As pundits around the country weigh in with their predictions, you might want to check out the analysis from the New York Times' 4th Down Bot, which compares the coaches' calls on fourth down plays with what historical statistics and a point-forecasting model indicate would have been the ideal...

Read more »

Inference for MA(q) Time Series

January 29, 2014

Yesterday, we saw how inference for AR(p) time series was possible. I started with that one because it is actually the simple case. For instance, we can use ordinary least squares. There might be some bias (see e.g. White (1961)), but asymptotically the estimators are fine (consistent, with asymptotic normality). But when the noise is (auto)correlated, then it is more...
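The excerpt is cut off before it shows how the MA(q) case is handled, so the following is only a minimal sketch of one standard route, not necessarily the one the post takes: simulate an MA(1) series with `arima.sim()` and fit it by maximum likelihood with `arima()` from base R's stats package. The coefficient `theta`, the seed and the sample size are illustrative assumptions.

```r
# Sketch only: theta, the seed and n are assumed values for illustration.
set.seed(1)
theta <- 0.6

# Simulate an MA(1) process: X_t = e_t + theta * e_{t-1}
x <- arima.sim(model = list(ma = theta), n = 500)

# Fit an MA(1) model (no AR part, no differencing, zero mean)
# by maximum likelihood
ma_fit <- arima(x, order = c(0, 0, 1), include.mean = FALSE)

theta_hat <- coef(ma_fit)[["ma1"]]   # point estimate of theta
theta_se  <- sqrt(diag(ma_fit$var.coef))[["ma1"]]  # asymptotic standard error
print(c(estimate = theta_hat, std_error = theta_se))
```

Because the innovations of an MA process are not observed, a regression on lagged values of the series is not directly available here, which is why the sketch relies on a likelihood-based fit.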

Read more »

Data corruption in R 3.0.2 when using read.csv

January 29, 2014

Introduction: It may be old news to some, but I just recently discovered that the automatic type inference system that R uses when parsing CSV files assumes that data sets will never contain 64-bit integer values. Specifically, if an integer value read from a CSV file is too large to fit in a 32-bit integer
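The excerpt stops before saying what happens to such values, so this is a hedged sketch of the behaviour as I understand it in current R versions rather than a reproduction of the post: a column of integers too large for R's 32-bit integer type is read as double, which cannot represent every 64-bit integer exactly. The CSV content and the column names `id` and `value` are made up for illustration.

```r
# Hypothetical CSV content; 9007199254740993 is 2^53 + 1, which a double
# cannot represent exactly.
csv_text <- "id,value\n9007199254740993,1\n9007199254740992,2\n"

# Default type inference: the id column does not fit in a 32-bit integer,
# so it is read as double.
default_read <- read.csv(text = csv_text)
print(class(default_read$id))

# The two distinct ids may collapse to the same double value.
print(default_read$id[1] == default_read$id[2])

# Workaround sketch: read the column as character so no digits are lost.
safe_read <- read.csv(text = csv_text,
                      colClasses = c("character", "integer"))
print(safe_read$id)
```

Reading identifiers as character is a conservative choice when they are only used as keys; if arithmetic on 64-bit integers is needed, a dedicated package would be required, which this sketch does not cover.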

Read more »

Stupid R Tricks: Random Scope

January 29, 2014

Andrew and I have been discussing how we’re going to define functions in Stan for defining systems of differential equations; see our evolving ODE design doc; comments welcome, of course. About Scope: I mentioned to Andrew I would prefer pure lexical, static scoping, as found in languages like C++ and Java. If you’re not familiar...
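As a small illustration of why scoping in R is not fully static, here is a sketch (not taken from the post, whose body is not shown in this excerpt) in which the same name resolves to a local or an enclosing variable depending on what happens at run time. The function `f` and its argument `assign_local` are hypothetical names.

```r
x <- "global"

# Whether `x` in the last line refers to a local or to the enclosing
# variable depends on whether the assignment ran.
f <- function(assign_local) {
  if (assign_local) x <- "local"   # creates a local x only on this branch
  x                                # looked up locally first, then in the enclosing environment
}

print(f(TRUE))   # the local binding exists, so it is used
print(f(FALSE))  # no local binding, so lookup falls back to the enclosing x
```

In a language with static scoping such as C++ or Java, which variable a name refers to is fixed when the code is compiled, so this kind of run-time dependence cannot occur.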

Read more »

Comparing multiple (g)lm in one graph #rstats

January 29, 2014

It’s been a while since a user of my plotting functions asked whether it would be possible to compare multiple (generalized) linear models in one graph (see comment). While it is already possible to compare multiple models as table output, I have now managed to build a function that plots several (g)lm objects in a single ggplot graph. The...
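The author's function is not shown in this excerpt. As a rough sketch of the data preparation that such a comparison needs, the following collects estimates and confidence intervals from several `lm` fits into one long data frame using base R only. The models, the built-in `mtcars` data and the name `coef_table` are illustrative assumptions.

```r
# Two example models on the built-in mtcars data (assumed for illustration)
models <- list(
  m1 = lm(mpg ~ wt, data = mtcars),
  m2 = lm(mpg ~ wt + hp, data = mtcars)
)

# One row per model and term: estimate plus 95% confidence bounds
coef_table <- do.call(rbind, lapply(names(models), function(model_name) {
  fit <- models[[model_name]]
  ci <- confint(fit)
  data.frame(
    model    = model_name,
    term     = names(coef(fit)),
    estimate = unname(coef(fit)),
    lower    = unname(ci[, 1]),
    upper    = unname(ci[, 2])
  )
}))

print(coef_table)
```

A table in this shape could then be passed to a plotting layer such as ggplot2's `geom_pointrange()`, with `term` on one axis and `model` mapped to colour, so that all models share a single graph.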

Read more »

Data mining with R course in the Netherlands taught by Luis Torgo

January 29, 2014

In the course of this year, Dr. Luis Torgo will teach a Data Mining with R course together with the DIKW Academy in Nieuwegein, The Netherlands. Dr. Torgo is an Associate Professor at the department of Computer Science at the…

Read more »

Inference for AR(p) Time Series

January 28, 2014

Consider a (stationary) autoregressive process, say of order 2, $Z_t=\phi_1 Z_{t-1}+\phi_2 Z_{t-2}+\varepsilon_t$, for some white noise $(\varepsilon_t)$ with variance $\sigma^2$. Here is some code to generate such a process:

> phi1=.25
> phi2=.7
> n=1000
> set.seed(1)
> e=rnorm(n)
> Z=rep(0,n)
> for(t in 3:n) Z[t]=phi1*Z[t-1]+phi2*Z[t-2]+e[t]
> Z=Z
> n=length(Z)
> plot(Z,type="l")

Here, we have to estimate two sets of parameters: the autoregressive...
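The AR(2) simulation described above can be combined with an ordinary least squares fit in a short, self-contained sketch. This is not the post's own estimation code, which the excerpt does not show. The burn-in length `burn_in` is an assumption, as is the choice to regress on the two lags without an intercept; the coefficient values and the seed follow the excerpt.

```r
phi1 <- 0.25
phi2 <- 0.7
n <- 1000
set.seed(1)
e <- rnorm(n)

# Generate the AR(2) recursion Z[t] = phi1 * Z[t-1] + phi2 * Z[t-2] + e[t]
Z <- rep(0, n)
for (t in 3:n) Z[t] <- phi1 * Z[t - 1] + phi2 * Z[t - 2] + e[t]

# Assumption: drop the first observations so the zero starting values
# have little influence on the retained series
burn_in <- 200
Z <- Z[(burn_in + 1):n]
n <- length(Z)

# OLS: regress Z[t] on its first two lags, without intercept
lagged <- data.frame(z = Z[3:n], z1 = Z[2:(n - 1)], z2 = Z[1:(n - 2)])
ols_fit <- lm(z ~ 0 + z1 + z2, data = lagged)

phi_hat <- unname(coef(ols_fit))   # estimates of phi1 and phi2
sigma2_hat <- sum(residuals(ols_fit)^2) / df.residual(ols_fit)  # noise variance

print(phi_hat)
print(sigma2_hat)
```

The regression returns the autoregressive coefficients, and the residual variance gives an estimate of the white noise variance.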

Read more »