Blog Archives

Code as Magic, and the Vernacular of Data Wrangling Verbs

February 11, 2015
By
Code as Magic, and the Vernacular of Data Wrangling Verbs

It’s been some time now since I drafted most of my early unit contributions to the TM351 Data management and analysis course. Part of the point (for me) in drafting that material was to find out what sorts of thing we actually wanted to say and help identify the sorts of abstractions we wanted to

Read more »

Rediscovering Formula One Race Battlemaps

January 31, 2015
By
Rediscovering Formula One Race Battlemaps

A couple of days ago, I posted a recipe on the F1DataJunkie blog that described how to calculate track position from laptime data. Using that information, as well as additional derived columns such as the identity of, and time to, the cars immediately ahead of and behind a particular selected driver, both in terms of

Read more »

Connecting RStudio and MySQL Docker Containers – an example using the ergast db

January 17, 2015
By
Connecting RStudio and MySQL Docker Containers – an example using the ergast db

building on Dockerising Open Data Databases – First Fumblings and my Book Extras – Data Files, Code Files and a Dockerised Application, I just figured out how to get the ergast db into a MySQL docker container and then query it from RStudio: Download and unzip the f1db.sql.gz file to f1db.sql install these docker-mysql-scripts run

Read more »

Calculating Churn in Seasonal Leagues

January 9, 2015
By
Calculating Churn in Seasonal Leagues

One of the things I wanted to explore in the production of the Wrangling F1 Data With R book was the extent to which I could draw on published academic papers for inspiration in exploring the the various results and timing datasets. In a chapter published earlier this week, I explored the notion of churn,

Read more »

Book Extras – Data Files, Code Files and a Dockerised Application

January 5, 2015
By
Book Extras – Data Files, Code Files and a Dockerised Application

Idling through the LeanPub documentation last night, I noticed that they support the ability to sell digital extras, such as bundled code files or datafiles. Along with the base book sold at one price, additional extras can be bundled into packages alongside the original book and sold at another (higher) price. As with the book

Read more »

Custom Gridlines and Line Guides in R/ggplot Charts

January 2, 2015
By
Custom Gridlines and Line Guides in R/ggplot Charts

In the last quarter of last year, I started paying more attention to the use of custom grid lines and line guides in charts I’ve been developing for the Wrangling F1 Data With R book. The use of line guides was in part inspired by canopy views from within the cockpit of one of the

Read more »

Sketching Scatterplots to Demonstrate Different Correlations

December 17, 2014
By
Sketching Scatterplots to Demonstrate Different Correlations

Looking just now for an openly licensed graphic showing a set of scatterplots that demonstrate different correlations between X and Y values, I couldn’t find one. So here’s a quick R script for constructing one, based on a Cross Validated question/answer (Generate two variables with precise pre-specified correlation): And here’s an example of the result:

Read more »

Identifying Position Change Groupings in Rank Ordered Lists

December 9, 2014
By
Identifying Position Change Groupings in Rank Ordered Lists

The title says it all, doesn’t it?! Take the following example – it happens to show race positions by driver for each lap of a particular F1 grand prix, but it could be the evolution over time of any rank-based population. The question I had in mind was – how can I identify positions that

Read more »

Information Density and Custom Chart Designs

November 21, 2014
By
Information Density and Custom Chart Designs

I’ve been doodling today with a some charts for the Wrangling F1 Data With R living book, trying to see how much information I can start trying to pack into a single chart. The initial impetus came simply from thinking about a count of laps led in a particular race by each drive; this morphed

Read more »

F1 Championship Race, 2014 – Winning Combinations…

November 8, 2014
By
F1 Championship Race, 2014 – Winning Combinations…

As we come up to the final two races of the 2014 Formula One season, the double points mechanism for the final race means that two drivers are still in with a shot at the Drivers’ Championship: Lewis Hamilton and Nico Rosberg. As James Allen describes in Hamilton closes in on world title: maths favour

Read more »