Blog Archives

Sketching Scatterplots to Demonstrate Different Correlations

December 17, 2014

Looking just now for an openly licensed graphic showing a set of scatterplots that demonstrate different correlations between X and Y values, I couldn’t find one. So here’s a quick R script for constructing one, based on a Cross Validated question/answer, “Generate two variables with precise pre-specified correlation”.
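The R script itself is not reproduced in this excerpt. As a rough illustration of the underlying idea (build a second variable from the first plus an uncorrelated residual, so that the sample correlation comes out at exactly the chosen value), here is a minimal sketch in Python using only the standard library. The function names `correlated_pair` and `pearson` are made up for this example and are not taken from the original post or the Cross Validated answer.

```python
import math
import random


def _center(values):
    """Subtract the mean so the values sum to zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _unit(values):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    cx, cy = _center(x), _center(y)
    num = sum(a * b for a, b in zip(cx, cy))
    den = math.sqrt(sum(a * a for a in cx) * sum(b * b for b in cy))
    return num / den


def correlated_pair(n, rho, seed=None):
    """Return lists x, y of length n (n >= 3) with sample correlation rho.

    Hypothetical helper: x is random noise, y mixes x with a residual
    that is exactly orthogonal to x, weighted to hit rho.
    """
    rng = random.Random(seed)
    x = _center([rng.gauss(0, 1) for _ in range(n)])
    z = _center([rng.gauss(0, 1) for _ in range(n)])
    # Regress z on x and keep the residual, which is orthogonal to x.
    beta = sum(a * b for a, b in zip(x, z)) / sum(a * a for a in x)
    resid = [b - beta * a for a, b in zip(x, z)]
    x_unit, r_unit = _unit(x), _unit(resid)
    # Mix the two orthogonal unit vectors; the weights fix the correlation.
    weight = math.sqrt(1 - rho * rho)
    y = [rho * a + weight * b for a, b in zip(x_unit, r_unit)]
    return x, y


if __name__ == "__main__":
    for rho in (-0.9, -0.5, 0.0, 0.5, 0.9):
        x, y = correlated_pair(100, rho, seed=42)
        print(rho, round(pearson(x, y), 6))
```

Plotting each `x`, `y` pair as its own scatterplot panel would give a grid of the kind described above; the plotting step is left out here to keep the sketch dependency-free.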

Read more »

Identifying Position Change Groupings in Rank Ordered Lists

December 9, 2014

The title says it all, doesn’t it?! Take the following example – it happens to show race positions by driver for each lap of a particular F1 grand prix, but it could be the evolution over time of any rank-based population. The question I had in mind was – how can I identify positions that

Read more »

Information Density and Custom Chart Designs

November 21, 2014

I’ve been doodling today with some charts for the Wrangling F1 Data With R living book, trying to see how much information I can pack into a single chart. The initial impetus came simply from thinking about a count of laps led in a particular race by each driver; this morphed

Read more »

F1 Championship Race, 2014 – Winning Combinations…

November 8, 2014

As we come up to the final two races of the 2014 Formula One season, the double points mechanism for the final race means that two drivers are still in with a shot at the Drivers’ Championship: Lewis Hamilton and Nico Rosberg. As James Allen describes in Hamilton closes in on world title: maths favour

Read more »

Wrangling F1 Data With R – F1DataJunkie Book

October 30, 2014

Earlier this year I started trying to pull together some of my #f1datajunkie R-related ramblings in book form. The project stalled, but to try to reboot it I’ve started publishing it as a living book over on Leanpub. Several of the chapters are incomplete, with TO DO items sketched in; others are

Read more »

Running “Native” Data Wrangling Applications in the Browser – IPython Notebooks (and R?) in Chrome

August 22, 2014

Using browser-based data analysis toolkits such as pandas in IPython notebooks, or R in RStudio, means you need to have access to Python or R and the corresponding application server, either on your own computer or running on a remote server that you have access to. When running occasional training sessions or workshops, this

Read more »

Opening Up Access to Data: Why APIs May Not Be Enough…

August 11, 2014

Last week, a post on the ONS (Office for National Statistics) Digital Publishing blog caught my eye: Introducing the New Improved ONS API which apparently “makes things much easier to work with”. Ooh… exciting… maybe I can use this to start hacking together some notebooks?:-) It was followed a few days later by this one

Read more »

F1 Doing the Data Visualisation Competition Thing With Tata?

July 2, 2014

Sort of via @jottevanger, it seems that Tata Communications announces the first challenge in the F1® Connectivity Innovation Prize to extract and present new information from Formula One Management’s live data feeds. (The F1 site has a post Tata launches F1® Connectivity Innovation Prize dated “10 Jun 2014”? What’s that about then?) Tata Communications are

Read more »

Recreational Data: Data Golf

May 23, 2014

I’m still hopeful of working up the idea of recreational data as a popular pastime activity with a regular column somewhere and a stocking filler book each Christmas (?!;-), but haven’t had much time to commit to working up some great examples lately:-( However, here’s a neat idea – data golf – as described in

Read more »

Visualising Pandas DataFrames With IPythonBlocks – Proof of Concept

March 26, 2014

A few weeks ago I came across IPythonBlocks, a Python library developed to support the teaching of Python programming. The library provides an HTML grid that can be manipulated using simple programming constructs, presenting the outcome of the operations in a visually meaningful way. As part of a new third-level OU course we’re putting

Read more »