Blog Archives

Segmenting F1 Qualifying Session Laptimes

March 29, 2015

I’ve started scraping some FIA timing sheets again, including practice and qualifying session laptimes. One of the things I’d like to do is explore various ways of looking at the qualifying session laptimes, which means identifying which qualifying session each laptime falls into. For looking at session utilisation charts I’ve been making use of accumulated

What’s the Point of an API?

March 9, 2015

Trying to clear my head of code on a dog walk after a couple of days tinkering with the nomis API, I started to ponder what an API is good for. Chris Gutteridge and Alex Dutton’s open data excuses bingo card and Owen Boswarva’s Open Data Publishing Decision Tree both suggest that not having

So What Can Text Analysis Do for You?

March 2, 2015

Despite believing we can treat anything we can represent in digital form as “data”, I’m still pretty flakey on understanding what sorts of analysis we can easily do with different sorts of data. Time series analysis is one area – the pandas Python library has all manner of handy tools for working with that sort

Tools in Tandem – SQL and ggplot. But is it Really R?

February 28, 2015

Increasingly I find that I have fallen into using not-really-R whilst playing around with Formula One stats data. Instead, I seem to be using a hybrid of SQL to get data out of a small SQLite3 database and into an R dataframe, and then ggplot2 to visualise it. So for example, I’ve recently been

Code as Magic, and the Vernacular of Data Wrangling Verbs

February 11, 2015

It’s been some time now since I drafted most of my early unit contributions to the TM351 Data management and analysis course. Part of the point (for me) in drafting that material was to find out what sorts of thing we actually wanted to say and help identify the sorts of abstractions we wanted to

Rediscovering Formula One Race Battlemaps

January 31, 2015

A couple of days ago, I posted a recipe on the F1DataJunkie blog that described how to calculate track position from laptime data. Using that information, as well as additional derived columns such as the identity of, and time to, the cars immediately ahead of and behind a particular selected driver, both in terms of

Connecting RStudio and MySQL Docker Containers – an example using the ergast db

January 17, 2015

Building on “Dockerising Open Data Databases – First Fumblings” and my “Book Extras – Data Files, Code Files and a Dockerised Application”, I just figured out how to get the ergast db into a MySQL docker container and then query it from RStudio: download and unzip the f1db.sql.gz file to f1db.sql; install these docker-mysql-scripts; run

Calculating Churn in Seasonal Leagues

January 9, 2015

One of the things I wanted to explore in the production of the Wrangling F1 Data With R book was the extent to which I could draw on published academic papers for inspiration in exploring the various results and timing datasets. In a chapter published earlier this week, I explored the notion of churn,

Book Extras – Data Files, Code Files and a Dockerised Application

January 5, 2015

Idling through the LeanPub documentation last night, I noticed that they support the ability to sell digital extras, such as bundled code files or datafiles. Along with the base book sold at one price, additional extras can be bundled into packages alongside the original book and sold at another (higher) price. As with the book

Custom Gridlines and Line Guides in R/ggplot Charts

January 2, 2015

In the last quarter of last year, I started paying more attention to the use of custom grid lines and line guides in charts I’ve been developing for the Wrangling F1 Data With R book. The use of line guides was in part inspired by canopy views from within the cockpit of one of the
