## Munkres’ Assignment Algorithm with RcppArmadillo

September 24, 2013
Munkres’ Assignment Algorithm (Munkres (1957), also known as the Hungarian algorithm) is a well-known algorithm in Operations Research that solves the problem of optimally assigning N jobs to N workers. I needed to solve the Minimal Assignment Problem for a relabeling algorithm in MCMC sampling for finite mixture distributions, where I use a random permutation Gibbs sampler. For each sample...

## analyze the home mortgage disclosure act (hmda) microdata with r and monetdb

September 23, 2013
back in 1975, congress had it up to here with discriminatory lending practices and decided to require financial organizations originating home mortgages to report some basic operational statistics publicly.  the home mortgage disclosure act mandat...

## R GIS: Polygon Intersection with gIntersection{rgeos}

September 16, 2013
A short tutorial on doing intersections in R GIS. gIntersection{rgeos} will pick the polygons of the first submitted polygon contained within the second polygon - this is done without cutting the polygon's edges which cross the clip source polygon. For the function that I use to download the example data, url_shp_to_spdf(), please see HERE. library(rgeos) library(dismo) URLs...

## paste, paste0, and sprintf

September 14, 2013
I find myself pasting URLs and lots of little pieces together lately. Now paste is a standard go-to guy when you wanna glue some stuff together. But often I find myself pasting and getting stuff like this: Rather than …

## In case you missed it: August 2013 Roundup

September 11, 2013
In case you missed them, here are some articles from August of particular interest to R users: A tutorial on parallel programming with the foreach, doMC and doSNOW packages. Joe Rickert reviews R's capabilities for linear algebra, sparse matrices and big matrices. How R is disrupting the insurance industry with big data. Revolution Analytics has teamed with Cloudera to...

## Online course on forecasting using R

September 10, 2013
I am teaming up with Revolution Analytics to teach an online course on forecasting with R. Topics to be covered include seasonality and trends, exponential smoothing, ARIMA modelling, dynamic regression and state space models, as well as forecast accuracy methods and forecast evaluation techniques such as cross-validation. I will talk about some of my consulting experiences, and explain the...

## Sentiment Analysis on Twitter with Datumbox API

September 9, 2013
Hey there! After my post about sentiment analysis using the Viralheat API I found another service. Datumbox is offering special sentiment analysis for Twitter. But this API doesn't just offer sentiment analysis, it offers a much more detailed analysis. “The currently supported API functions are: Sentiment Analysis, Twitter Sentiment Analysis, Subjectivity Analysis, Topic Classification, Spam …

## September Talks

September 5, 2013
To celebrate my last full month on the East Coast, I’m doing a bunch of talks. If you’re interested in hearing more about Julia or statistics in general, you might want to come out to one of the events I’ll be at: Julia Tutorial at DataGotham: On 9/12, Stefan and I will be giving a

## How to build a single-node Hadoop/R system

September 3, 2013
The best way to learn any software is to use it, and if you're new to Hadoop and want to try using Hadoop with R, the process of setting up your own Hadoop cluster can be daunting (to say the least). But if learning is the goal, the key is that you don't need to install a full cluster....

## Scheduling R Tasks with Crontabs to Conserve Memory

September 3, 2013
One of R’s biggest pitfalls is that it eats up memory without letting it go. This can be a huge problem if you are running really big jobs, have a lot of tasks to run, or there are multiple users on your local computer or R server. When I run huge jobs on my Mac, I