Articles by Tony Hirst

Scraping Web Pages With R

April 15, 2015 | Tony Hirst

One of the things I tend to avoid doing in R, partly because there are better tools elsewhere, is screenscraping. With the release of the new rvest package, I thought I’d have a go at what amounts to one of the simplest webscraping activities – grabbing HTML tables out of ...
[Read more...]

Wrangling Complex Spreadsheet Column Headers

April 14, 2015 | Tony Hirst

[This isn’t an R post, per se, but I’m syndicating it via RBloggers because I’m interested – how do you work with hierarchical column indices in R? Do you try to reshape the data to something tidier on the way in? Can you autodetect elements to help with ...
[Read more...]

Mixing Numbers and Symbols in Time Series Charts

April 8, 2015 | Tony Hirst

One of the things I’ve been trying to explore with my #f1datajunkie projects are ways of representing information that work both in a glanceable way as well as repaying deeper reading. I’ve also been looking at various ways of using text labels rather than markers to provide ...
[Read more...]

Segmenting F1 Qualifying Session Laptimes

March 29, 2015 | Tony Hirst

I’ve started scraping some FIA timing sheets again, including practice and qualifying session laptimes. One of the things I’d like to do is explore various ways of looking at the qualifying session laptimes, which means identifying which qualifying session each laptime falls into: For looking at session utilisation ...
[Read more...]

What’s the Point of an API?

March 9, 2015 | Tony Hirst

Trying to clear my head of code on a dog walk after a couple of days tinkering with the nomis API and I started to ponder what an API is good for. Chris Gutteridge and Alex Dutton’s open data excuses bingo card and Owen Boswarva’s Open Data Publishing ...
[Read more...]

So What Can Text Analysis Do for You?

March 2, 2015 | Tony Hirst

Despite believing we can treat anything we can represent in digital form as “data”, I’m still pretty flakey on understanding what sorts of analysis we can easily do with different sorts of data. Time series analysis is one area – the pandas Python library has all manner of handy tools ...
[Read more...]

Rediscovering Formula One Race Battlemaps

January 31, 2015 | Tony Hirst

A couple of days ago, I posted a recipe on the F1DataJunkie blog that described how to calculate track position from laptime data. Using that information, as well as additional derived columns such as the identity of, and time to, the cars immediately ahead of and behind a particular ...
[Read more...]

Calculating Churn in Seasonal Leagues

January 9, 2015 | Tony Hirst

One of the things I wanted to explore in the production of the Wrangling F1 Data With R book was the extent to which I could draw on published academic papers for inspiration in exploring the various results and timing datasets. In a chapter published earlier this week, I ...
[Read more...]

Sketching Scatterplots to Demonstrate Different Correlations

December 17, 2014 | Tony Hirst

Looking just now for an openly licensed graphic showing a set of scatterplots that demonstrate different correlations between X and Y values, I couldn’t find one. So here’s a quick R script for constructing one, based on a Cross Validated question/answer (Generate two variables with precise pre-specified ...
[Read more...]

Information Density and Custom Chart Designs

November 21, 2014 | Tony Hirst

I’ve been doodling today with some charts for the Wrangling F1 Data With R living book, trying to see how much information I can start trying to pack into a single chart. The initial impetus came simply from thinking about a count of laps led in a particular ...
[Read more...]

Wrangling F1 Data With R – F1DataJunkie Book

October 30, 2014 | Tony Hirst

Earlier this year I started trying to pull some of my #f1datajunkie R-related ramblings together in book form. The project stalled, but to try to reboot it I’ve started publishing it as a living book over on Leanpub. Several of the chapters are incomplete – with TO ...
[Read more...]