Now that I’m on my winter break, I’ve been taking a little bit of time to read up on some modelling techniques that I’ve never used before. Two such techniques are Random Forests and Conditional Trees. Since both can be used …

Well, to be specific, I mean measuring district compactness (a very interesting subject, see these three articles for starters). There are myriad ways of measuring the “oddness” of a shape, including a comparison of the area of the district to its circumcircle, the moment of inertia of the shape, the probability that a path connecting...

The For Dummies series has been around since 1991. (A bit of trivia: DOS for Dummies was the first title.) I’ve owned a few books in the series and have been adequately impressed with most of them, but when I learned there was an R for Dummies I was immediately skeptical. Possibly I was skeptical...

Joining or merging two data sets is one of the most common tasks in preparing and analysing data. In fact, a Google search returns 253 million results. However, most examples assume that the columns you want to merge by have the same names in both data sets, which is often not the case. For example:
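The original post's own example is cut off in this excerpt, so here is a minimal sketch of the situation it describes, using base R's merge() with its by.x and by.y arguments. The data frames and column names (customers, orders, id, customer_id) are made up for illustration and are not from the post.

```r
# Hypothetical data: the key is called "id" in one table
# and "customer_id" in the other.
customers <- data.frame(id = c(1, 2, 3),
                        name = c("Ann", "Bob", "Cid"))
orders <- data.frame(customer_id = c(1, 1, 3),
                     amount = c(10, 25, 7))

# by.x names the key column in the first data frame,
# by.y names the key column in the second.
merged <- merge(customers, orders, by.x = "id", by.y = "customer_id")
print(merged)
```

By default merge() keeps only rows with a match in both data frames, and the key column in the result takes its name from by.x. Adding all.x = TRUE would also keep customers without any orders.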

A quick geo-tip: With the osmar and maptools packages you can easily pull an OpenStreetMap object and convert it to KML, like below (thanks to adibender for helping out on SO). I found the relation ID by googling for it (www.google.at/search?q=openstreetmap+relation+innsbruck).
    # get OSM data
    library(osmar)
    library(maptools)
    innsbruck <- ...
    sp_innsbruck <- ...
    # convert to KML
    for( i in seq_along(sp_innsbruck) ) { ...

Yesterday's release of Rcpp 0.10.2 required a small change to RcppClassic, the package supporting the deprecated older classic Rcpp API defined in the earlier 2005 to 2006 releases. So version 0.9.3 of RcppClassic is now on CRAN. There is no new user...

Principal Component Analysis (PCA) is a procedure that converts observations of possibly correlated variables into linearly uncorrelated variables called principal components (Wikipedia). PCA is a useful descriptive tool for examining your data. Today I will show how to find and visualize principal components. Let’s look at the components of the Dow Jones Industrial Average index over 2012. First,
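The post's own walkthrough is truncated here. As a rough sketch of what finding and visualizing principal components can look like in base R, the snippet below runs prcomp() on simulated returns. The four tickers (A to D), their loadings, and the data are all assumptions for illustration; this is not real Dow Jones data and not necessarily the author's method.

```r
set.seed(1)

# Simulated daily returns for four hypothetical tickers that share
# one common "market" factor (not real market data).
market <- rnorm(250, sd = 0.01)
returns <- sapply(c(A = 1.0, B = 0.8, C = 1.2, D = 0.5),
                  function(beta) beta * market + rnorm(250, sd = 0.005))

# Center and scale each column, then compute the principal components.
pca <- prcomp(returns, center = TRUE, scale. = TRUE)

summary(pca)     # standard deviation and share of variance per component
scores <- pca$x  # the observations in principal-component coordinates
biplot(pca)      # observations and variable loadings on the first two components
```

Because the simulated series are driven by one shared factor, the first component should capture most of the variance. The columns of scores are mutually uncorrelated, which is the property the definition above refers to.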

Have you already used trees or random forests to model the relationship between a response and some covariates? Then you might like conditional trees, which are implemented in the party package. In contrast to the CART (Classification and Regression ...

Here at is.R(), we have produced countless posts that feature plots with confidence intervals, but apparently none of those are easy to find with Google. So, today, for the purposes of SEO, we’ve put “plotting confidence intervals” in the title of our post. We also cannot resist an earnest plea from our...