A while ago I co-wrote a chapter for an introductory psychology textbook, Essential Psychology: A Concise Introduction. The book was edited and written by members of the department where I work. My contribution was the chapter on human memory (cunningly titled Memory).
I produced several plots for the chapter (some of which got cut due to severe space restrictions). One that stayed in was a serial position curve. For this plot I used data from Postman and Phillips (1965).
I feel particularly proud of this plot because I was just beginning to use and learn R at the time (as opposed to dabbling) and because I had had a really hard time getting hold of the data. I first tried Google, but had no joy (for some reason I thought someone would have put the raw data online, as it is a classic study, though maybe I just missed it). Then I searched for alternative data sets (as around that period there were quite a few similar studies). I was probably being too picky, but whatever the reason I had no luck.
It would have been trivial to make up fake data, but that didn’t feel right. What I eventually did (and wish I’d done straight away) was print out the original figure and measure all the points by hand. I then entered these values into a spreadsheet and tweaked and remeasured until all the summary statistics matched those in the original paper to about one decimal place. This was a lot quicker than I had expected. I cheated slightly because I only needed data from the 20-word conditions (so I could leave out the 10- and 30-word conditions).
(I’m pretty sure I could have used computer software to capture the raw data from an image file, but I’d have had to find the software, learn how to use it and do all the checking anyway. For a single figure I’m reasonably sure measuring by hand would be faster.)
In re-plotting it I noticed a few things that I hadn’t paid much attention to before. The main one was that the authors report recall frequencies for 18 participants with 6 lists each. This means all scores are out of 108 (18 × 6), and I suspect lots of casual readers would (like me) assume they were percentages. For re-plotting I rescaled the data as percentages.
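The rescaling is simple arithmetic, and a minimal sketch of it in R might look like the following. The frequencies here are made-up placeholders chosen only to make the arithmetic easy to follow (they are not values from the paper), and the variable names are my own.

```r
# Each score is a recall count out of 18 participants x 6 lists
n_participants <- 18
n_lists <- 6
max_score <- n_participants * n_lists   # 108 possible recalls per position

# Placeholder frequencies for illustration only (NOT data from the paper)
recall_freq <- c(54, 27, 81)

# Convert counts out of 108 to percentages
recall_pct <- 100 * recall_freq / max_score
recall_pct
```

With these placeholder counts, a frequency of 54 corresponds to 50%, which shows how easily a raw count could be misread as a percentage.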
The plot itself just uses basic R functions. I’m writing about it for two reasons: i) I think it is a fairly clear illustration of how basic plot functions in R can produce what I think is a rather nice figure (the published version has been edited by the publisher, adding colour and making the style match figures in other chapters), and ii) people may find it useful for teaching purposes. So please feel free to use and adapt the R code for non-commercial (e.g., teaching) use.
First load the data from this .csv file (you will need to specify the path or change the working directory if the file is saved elsewhere).
pp65 <- read.csv("pp65.csv")
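Before plotting, it may be worth checking that the file loaded with the columns the plotting code refers to. The sketch below assumes the column names SP, C0, C15 and C30 (taken from the plotting code in this post); missing_cols is a hypothetical helper of my own, not a base R function.

```r
# Column names assumed from the plotting code in this post
expected_cols <- c("SP", "C0", "C15", "C30")

# Hypothetical helper: returns the expected column names absent from df
missing_cols <- function(df, cols = expected_cols) {
  setdiff(cols, names(df))
}

# Only attempt the check if the file is in the working directory
if (file.exists("pp65.csv")) {
  pp65 <- read.csv("pp65.csv")
  str(pp65)                  # one row per serial position is expected
  print(missing_cols(pp65))  # character(0) means nothing is missing
}
```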
Then paste the following:
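# Set up empty axes (pch=NA suppresses the symbols) with labels and titles
plot(pp65$SP, pch=NA, ylim=c(0,80), xlab="Serial position", ylab="Mean percentage recall", main="Postman & Phillips (1965)", sub="(20 word conditions only)")
# Add one set of points per delay condition, each with its own symbol
points(pp65$C0, pch=19, col="black", cex=.7)
points(pp65$C15, pch=24, col="black", cex=.7)
points(pp65$C30, pch=22, col="black", cex=.7)
# The legend refers to line types 3, 2 and 5, so connect the points with matching lines
lines(pp65$C0, lty=3)
lines(pp65$C15, lty=2)
lines(pp65$C30, lty=5)
# Show both the symbol and the line type for each condition
legend(3, 80, legend=c("No delay","15 second delay","30 second delay"), pch=c(19,24,22), lty=c(3,2,5), pt.cex=.7)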
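If you want the figure as a file (for slides or a handout, say), one option is to wrap the plotting calls in a graphics device. This is only a sketch: save_plot_pdf is a hypothetical helper of my own, the 7 x 5 inch size is an arbitrary assumption, and the usage example draws a throwaway plot rather than the serial position data.

```r
# Hypothetical helper: run some plotting code with a PDF device open
save_plot_pdf <- function(file, draw, width = 7, height = 5) {
  pdf(file, width = width, height = height)  # size in inches (assumed values)
  on.exit(dev.off())                         # close the device even on error
  draw()
  invisible(file)
}

# Usage with a throwaway plot; swap in the plotting calls above
out_file <- file.path(tempdir(), "example_plot.pdf")
save_plot_pdf(out_file, function() plot(1:20, type = "b"))
```

Passing the plotting code as a function keeps the device handling in one place, so the same calls can be run on screen or sent to a file.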
If you are new to R you can find out more about these plotting functions by using R help: ?par, ?plot, ?points and so on …
Baguley, T., & Edmonds, A. J. (2010). Memory. In P. Banyard, M. N. O. Davies, C. Norman, & B. Winder (Eds.) Essential Psychology: A Concise Introduction (pp. 65-82). London: Sage.
Postman, L., & Phillips, L. W. (1965). Short-term temporal changes in free recall. Quarterly Journal of Experimental Psychology, 17, 132-138.