Are they happy at school? PISA data and visualisation challange @ useR!2014

This post was kindly contributed by SmarterPoland » R - go there to comment and to read the full post.

I hope you have already heard about DataVis contest at useR!2014 (in case you did not, here is more information: http://www.oecd.org/pisa/pisaproducts/datavisualizationcontest.htm).
As a jury member I am not going to submit any work, but I am going to encourage you to play with the data, crunch a finding and submit interesting plot.

How easy is to download the data and create some graph?

Let’s take a look at students’ answers for questions: ‘Do you feel happy at school?’ and ‘How familiar are you with Declarative Fractions?’.

Are boys more familiar with ‘Declarative Fractions’ than girls?
(funny fact: ‘Declarative Fraction’ is a fake concept that is used in PISA to measure overconfidence)
(funny fact 2: plot familiarity with ‘Declarative Fractions’ across countries, results are very interesting)

library(ggplot2)
library(reshape2)
library(likert)
library(intsvy)
 
# read students data from PISA 2012
# directly from URL
con <- url("http://beta.icm.edu.pl/PISAcontest/data/student2012.rda")
load(con)
 
# variable ST62Q13 codes familiarity with 'Declarative Fraction'
tab <- pisa.table(variable="ST62Q13", by="ST04Q01", data=student2012)
ptab <- acast(tab, ST04Q01~ST62Q13, value.var="Percentage")
ddat <- data.frame(Item=rownames(ptab), ptab)
 
# plot it with the likert package
likert.bar.plot(likert(summary=ddat), center=2) + ggtitle("Declarative Fraction") + theme_bw()

In which countries the fraction of kids that declare that they are happy at school is the highest?

# variable ST87Q07 codes 'Sense of Belonging - Feel Happy at School'
# pisa.table calculates weighted fractions and it's standard errors
tab <- pisa.table(variable="ST87Q07", by="CNT", data=student2012)
ptab <- acast(tab, CNT~ST87Q07, value.var="Percentage")
 
# need to add fake column with 0, otherwise likert.bar.plot will fail
ddat <- data.frame(Item=rownames(ptab), cbind(ptab[,4:3], 0, ptab[,2:1]))
ddat <- ddat[-grep(rownames(ddat),pattern="(", fixed=TRUE),]
 
# plot it with the likert package
likert.bar.plot(likert(summary=ddat),  plot.percent.neutral=FALSE) + ggtitle("Sense of Belonging - Feel Happy at School")

There is a lot of ‘kind of Likert’ scale questions in the PISA study, for sure there is a lot of nice stories as well.