Multiple plots in R: lesson zero

Posted on June 24, 2011 by axiomOfChoice in R bloggers | 0 Comments

[This article was first published on Musings of a forgetful functor, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Today, in one of my more productive days, I managed to create a sleek R script that plotted several histograms in a lattice, allowing for easy identification of the underlying trend. Although the majority of the time taken consisted of collecting the data and making various adjustments, it took a not inconsiderable amount of work to write the code.

As I was cursing the apply function – not for the last time I am sure – I suddenly realised the insane level of productivity that I have come to see as “par for the course”. The level of computational analysis that can be conducted in a few hours with nothing more than a desktop PC, a broadband connection, and copious amounts of caffeine is phenomenal. No longer can a lack of computing power or software be blamed for a lack of productivity growth: rate-limiting factors are now exclusively human.

Here at least is my contribution to the collective intelligence of biologicals. I have noticed that the most common reason that people avoid R is that they cannot rapidly make graphs that meet the high standards of their clients. In the next few editions of this blog we will build up a basic repertoire of plotting techniques, focusing on graphics that are sickeningly impressive.

Of course, Rome was not built in a day, and a thorough knowledge of R plotting cannot be built in one. Instead we will progress one layer at a time, adding additional levels of complexity and functionality. We start with a simple script that allows us to plot several graphs at the same time, each with a different value of a key variable. Here is the output:

Figure 1: Multiple plots using par

The code

# Clear all objects
rm(list=ls())

# Create a data set using random variables
df<-data.frame(x=rnorm(160),y=runif(160),a=sample(c(1,0,-1,10),
replace=TRUE,160))

df$z<-with(df,{3*a*y+x})

# Create a function that plots the value of "z" against the "y" value
plotM<-function(l){
  
  df.temp<-df[df$a==l,]
  plot(df.temp$y,df.temp$z,xlab="Y Value",ylab="Z Value",
  main=paste("Value of key variable: ",toString(l)))
  abline(lm(df.temp$z~df.temp$y),col="red")
  
}

# Create a grid to plot the different values of "a"
par(mfrow=c(2,2))

# Loop through each value of "a" and call the plotM function
for (i in c(1,0,-1,10)){
    plotM(i)
}

Walk-through
The first few lines of code create a data frame with 4 variables: x,y,z,a. Three of these variables are randomly generated, with the z variable dependent upon the other 3. Suppose that we are analysing this data set, unaware of the relationship between the x,y and z variables. A preliminary inspection shows that there are only 4 observed values of a. It seems sensible to plot z against y for each of these different a values.

The relevant code to create this plot starts with the function "plotM", standing for plot Multiple. This function takes an argument "l" that determines which value of the variable we are plotting. The line

df.temp<-df[df$a==l,]

filters the data frame to include only those rows where the variable a=l. The next line simply creates a standard R plot of y versus z. Finally, we use the abline function to plot a linear fit to highlight the trend. Too easy.

Now we come to the fun part. Using the par(mfrow=c(2,2)) we create a 2x2 grid in which to place the next four plots. Whatever plots we now create will be placed sequentially into this grid. Hence we can iterate over a vector containing the values of a, calling plotM each time. This gives us our grid.

Comments
Okay, so the graph does not look like it came from NASA – or to be honest with NASA's ailing reputation maybe it does. But note that once we have created the plotM function, we only have to write 3 lines of code to make 4 separate charts. Moreover, the code would not increase even if we were plotting 100 charts.

Of course, we have not yet even drawn upon any of R's custom plotting packages. In the next edition of this blog we will look at how to use the ggplot2 package to add colour and a wide range of other features to our graphs.

To leave a comment for the author, please follow the link and comment on their blog: Musings of a forgetful functor.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

Multiple plots in R: lesson zero

Related

Related

Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts. (You will not see this message again.)

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)