Investigating the relationship between two variables using a scatter plot

October 13, 2009
(This article was first published on Software for Exploratory Data Analysis and Statistical Modelling, and kindly contributed to R-bloggers)

The relationship between two variables can be represented visually with a scatter plot, which gives some insight into the correlation between the variables and suggests possible models to describe the relationship. Scatter plots can be produced in R with the base graphics system, the lattice graphics library, ggplot2 or other packages.

R ships with various data sets for analysis, for example the Puromycin data, which describe an experiment to study the relationship between reaction velocity and substrate concentration in an enzymatic reaction involving untreated cells or cells that were treated with Puromycin.
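Before plotting, it can help to look at how the data frame is laid out. The following is a minimal sketch that uses only base R functions; Puromycin is part of the datasets package that ships with R, so nothing needs to be installed.

```r
# Load the built-in data set into the workspace (it is also available
# without this call, since the datasets package is attached by default).
data(Puromycin)

# Show the structure: two numeric columns (conc, rate) and a factor (state).
str(Puromycin)

# Count the observations in each treatment group.
table(Puromycin$state)
```

The factor state is the variable used later to separate treated from untreated cells.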

The variable rate can be plotted against the variable conc to investigate the relationship. With the lattice package, the xyplot function creates the graph using the following code:

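# xyplot is provided by the lattice package, which must be loaded first
library(lattice)

# Scatter plot of reaction velocity (rate) against substrate concentration (conc)
xyplot(rate ~ conc, data = Puromycin,
  xlab = "Substrate concentration (ppm)",
  ylab = "Reaction velocity (counts/min/min)",
  main = "Reaction velocity of an enzymatic reaction")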

This code produces the following graph, which does not distinguish between the untreated and treated cells:

Reaction Rate plotted versus Concentration for Puromycin data

We can distinguish between the treated and untreated cells through the groups argument, which draws each group with its own colour or plotting symbol (which of the two depends on the active lattice theme). We adjust the above code as follows:

xyplot(rate ~ conc, data = Puromycin,
  xlab = "Substrate concentration (ppm)",
  ylab = "Reaction velocity (counts/min/min)",
  main = "Reaction velocity of an enzymatic reaction",
  groups = state)

The graph is now:

Plot of the Reaction Rate against Concentration by Treatment
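A grouped plot is easier to read with a legend. As a sketch, the auto.key argument of xyplot can generate one from the levels of the grouping variable; the object name p_grouped and the two-column legend layout are illustrative choices, not part of the original code.

```r
library(lattice)

# Same grouped scatter plot as above, with a legend built from the
# levels of 'state'. p_grouped is a hypothetical name for the plot object.
p_grouped <- xyplot(rate ~ conc, data = Puromycin,
  groups = state,
  auto.key = list(columns = 2),  # legend entries side by side
  xlab = "Substrate concentration (ppm)",
  ylab = "Reaction velocity (counts/min/min)",
  main = "Reaction velocity of an enzymatic reaction")

# Lattice plots are objects; printing one draws it on the current device.
print(p_grouped)
```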

An alternative would make use of the panelling facilities in lattice graphics to plot the data for the treated and untreated cells separately.
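One way to sketch the panelled version is to add a conditioning variable to the formula with the | operator, so that each level of state gets its own panel. The name p_panels and the side-by-side layout are assumptions made for illustration.

```r
library(lattice)

# 'rate ~ conc | state' conditions on state: one panel per treatment level.
# p_panels is a hypothetical name for the plot object.
p_panels <- xyplot(rate ~ conc | state, data = Puromycin,
  layout = c(2, 1),  # two columns, one row: panels side by side
  xlab = "Substrate concentration (ppm)",
  ylab = "Reaction velocity (counts/min/min)",
  main = "Reaction velocity of an enzymatic reaction")

# Draw the plot on the current graphics device.
print(p_panels)
```

By default the panels share the same axis scales, which makes the two treatments directly comparable.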
