Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In the last quarter of last year, I started paying more attention to the use of custom grid lines and line guides in charts I’ve been developing for the Wrangling F1 Data With R book.

The use of line guides was in part inspired by canopy views from within the cockpit of one of the planes that makes up the Red Arrows aerobatic display team.

A little bit of digging suggested that the lines on the cockpit are actually an explosive cord used to shatter the cockpit if a pilot needs to eject from the aircraft – but I can’t believe that the pilots don’t also use the lines as a crib for helping position themselves with respect to the other aircraft in the team? (I emailed the Red Arrows press office to ask about the extent to which the cockpit lines are used in this way but got what amounted to a NULL response.)

Whatever the case, it seems eminently sensible to me that we make use of line guides to help us read charts more effectively, where it makes sense to, or to use guides as stepping stones to more refined data views.

The following example shows how we can generate a 2 dimensional grid based on the discrete points allocations possible for being placed in the top 10 positions of a Formula One race.

The grid lines show allowable points values, and are constructed as follows:

NEWPOINTS =c(25,18,15,12,10,8,6,4,2,1)

#The newpoints dataframe has two columns
#The first column indicates points available, in order
#The second column is the maximum number of points the lower placed driver could score
newpoints=data.frame(x=c(NEWPOINTS,0),y=c(NEWPOINTS[-1],0,0))

baseplot_singleWay=function(g){
g=g+xlab('Highest placed driver points')+ylab('Lower placed driver points')
g=g+scale_y_continuous(breaks = newpoints$x,minor_breaks=NULL) g=g+scale_x_continuous(breaks = newpoints$x,minor_breaks=NULL)
g=g+coord_fixed()
g=g+guides(size=FALSE)+theme_bw()
g
}

g=baseplot_singleWay(ggplot())
g


The final chart (of which this is a “one-sided” example) is used to display a count of the number of races in which at least one of the two drivers in a particular team scores points in a race. The horizontal x-axis represents the number of points scored by the highest placed driver in the race, and the y-axis the number of points scored by their lower placed team mate.

A fixed co-ordinate scheme is adopted to ensure that points are separated consistently on the x and y axes. A dotted red line shows the maximum number of points the lower placed driver in a team could scored depending on the number of points scored by their higher placed team mate.

#Add in the maximum points line guide

g+geom_line(data=newpoints,aes(x=x,y=y),col='red',linetype='dotted')


A two-sided version of the chart is also possible in which the x-axis represents the points scored in a particular race by one name driver and the y-axis represents the points scored by another named driver in the same race. The two-sided chart has two guidelines, representing the maximum points that can be scored by the other driver in the event of one particular driver being higher placed in a race.

A full description can be found in the Points Performance Charts chapter of the Wrangling F1 Data With R book.