Blog Archives

As a Data Scientist it is my Obligation to support #nobagida, #nopegida and any other #no[a-z]{2}gida today :)

January 19, 2015

Germans used to have more Sex in Summer!

January 1, 2015

Wow – what a headline … okay, I admit it’s phrased quite sensationally, given that it anticipates just one possible interpretation of increasingly more births around summer / autumn compared to spring … but I guess I just get …

Hierarchical Clustering with R (feat. D3.js and Shiny)

December 14, 2014

Agglomerative hierarchical clustering is a simple, intuitive and well-understood method for clustering data points. I used it with good results in a project to estimate the true geographical position of objects based on measured estimates. With this tutorial I would …

Twitter’s REST API v1.1 with R (for Linux and Windows)

September 22, 2014

In this tutorial I describe a straightforward way to use Twitter’s REST API v1.1. For that purpose I composed a small package (RTwitterAPI), so that requesting data just needs the API URL, the API parameters …

MongoDB – State of the R

August 31, 2014

Naturally there are two reasons why you might need to access MongoDB from R: (1) MongoDB is already in use for whatever reason and you want to analyze the data stored therein; (2) you decide you want to store your data in MongoDB instead of …

Reasonable Inheritance of Cluster Identities in Repetitive Clustering

August 15, 2014

… or Inferring Identity from Observations. Let’s assume the following application: a conservation organisation starts a project to geographically catalogue the remaining representatives of an endangered plant species. For that purpose hikers are encouraged to communicate the location of the plant …

Talking to Twitter’s REST API v1.1 with R

June 10, 2014

In this text I describe a very straightforward way to use Twitter’s REST API v1.1. I put some code together for that purpose, so that requesting data just needs the API URL, the API …

FIR Filter Design and Digital Signal Processing in R

May 15, 2014

This article illustrates that signal processing with R is possible – thanks to the signal package – and keeps a reference of some of the things I learned in my last edX course. Anyway, I …

Relation of Word Order and Compression Ratio and Degree of Structure

May 7, 2014

I have a habit of compulsively wondering, approximately every 34.765th day, how zip compression (bzip2 in this case) might be used to measure the information contained in data. This time the question popped into my head of whether or …

MapReduce with R on Hadoop and Amazon EMR

April 25, 2014

You all know why MapReduce is fancy – so let’s just jump right in. I like researching data and I like to see results fast – does that mean I enjoy the process of setting up a Hadoop cluster? No, …
