Matrix scatterplot of the Airquality data using lattice

July 20, 2010
By

(This article was first published on Gosset's student » R, and kindly contributed to R-bloggers)

In this post we will build on the last one, and create a matrix scatterplot. The package lattice allows for some really excellent graphics. In case you haven’t already seen it I recommend the R Graph Gallery for some examples of what it can do – browse the graphics by package used to create them. We’ll use the same dataset as last time, where we made a plot of the NO levels in the atmosphere vs ozone levels for Nottingham, UK.

First step is to load the lattice package.

require("lattice")

Download the dataset from here, and put the file in your working directory. Now we’ll put the dataset into the matrix data.

columns <- c("date", "time", "NO", "NO_status", "NO_unit",
      "NO2", "NO2_status", "NO2_unit", "ozone", "ozone_status",
      "ozone_unit", "SO2", "SO2_status", "SO2_unit")
data <- read.csv("27899712853.csv", header = FALSE,
      skip = 7, col.names = columns, stringsAsFactors = FALSE)
x <- data$NO
y <- data$ozone
z <- data$SO2

So that it’s easier to follow, I’ve extracted 3 vectors from the matrix: x, y, and z.   These are the columns of the data for NO, ozone and SO2.  Hopefully this will help you follow things.  When working with graphs, I usually do this (in the last post I extracted x and y).  If I make a nice graphic I can then “cut and paste” it into another program, and just change the data in xy and z and hey presto, the same graphic is instantly used with new data.

For a matrix scatterplot, we need to make a matrix of the variables to compare. We join the vectors into a matrix and then name the columns.

mat <- cbind(x,y)
mat <- cbind(mat,z)
colnames(mat) <- c("NO", "ozone", "SO2")

You can look at the first 10 lines of mat with

mat[1:10,]

Finally we create the matrix plot:

title <- "Matrix scatterplot of air polutants"
print(splom(mat, main = title))

The final result is here:

For those unfamiliar with scatterplots – this plot is essentially 3 scatterplots of x vs y, x vs z and y vs z.  The middle left plot is the scatterplot created in this previous post.  The package lattice can do lots more than this – get help on line for it with the command

?lattice

Tagged: Lattice, Matrix scatterplot, R, statistics

To leave a comment for the author, please follow the link and comment on his blog: Gosset's student » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.