warning: the instructions below are obsolete. please check this page for the latest version. <3 anthony why a speed test of three sql queries on sixty-seven million records using my personal computer --# calculate the sum, mean, median, a...

At a meeting last night with some collaborators at the Vélobstacles project, I was excitedly told about the magic of IPython and its notebook functionality for reproducible research. This sounds familiar, I thought to myself. Using a literate programming approach to integrate computation with the communication of methodology and results has been at the core

Some time ago I stumbled upon a problem connected with the labels of a clustering. The partition an instance belongs to is usually labeled with an integer ranging from 1 to K, where K is the number of clusters. The task at that time was to plot a map of the results from the clustering of spatial polygons...

Introduction. So a little while ago I quit my job. Well, actually, that sounds really negative. I'm told that when you are discussing large changes in your life, like finding a new career, relationship, or brand of diet soda, it's important to frame things positively. So let me rephrase that - I've left the job I previously held to pursue other directions. Why?...

It's very convenient to manage data with R: you can import your dataset, find many packages that meet your needs, and then plot your results. However, it can be very bothersome to retrieve data from online databases. …

In part inspired by the chart described in The electoral map sans the map, I thought I’d start mulling over a quick sketch showing the race to the 2012 Formula One Drivers’ Championship. The chart needs to show tension somehow, so in this first really quick and simple rough sketch, you really do have to