Fumblings with Ranked Likert Scale Data in R

July 9, 2012

(This article was first published on OUseful.Info, the blog... » Rstats, and kindly contributed to R-bloggers)

The code is horrible and the visualisations quite possibly misleading, but I’m dead tired and there are a couple of tricks in the following R code that I want to remember, so here’s a contrived bit of fumbling with some data of the form:

enjoyCompany tooMuchFamily
1 strongly agree strongly disagree
2 strongly agree strongly disagree
3 neither agree nor disagree strongly disagree

That is, N rows, no identifiers, two columns; each column relates to a questionnaire question with a scaled response enumerated as ‘strongly agree’,’agree ‘,’neither agree nor disagree’,’disagree’,’strongly disagree’.

THe first thing I tried to do was some “traditional” Likert scale style stacked bar charts using ggplot2 (surely there must be a Likert scale visualisation library around? If so, how would it work with data in the above (and below) forms? Answers via the comments please…)

#My sample data doesn't have row based identifiers, so here's a hacked incremental index based ID
#melt the data into a dataframe with 3 cols: the id col, /b/; a /variable/ column that contains the original column heading; and a /value/ column that contains the original cell value for the corresponding row and column.
#Get rid of blank values
#Get rid of unused levels
#Reorder the levels into a meaningful order
ff$value <- factor(ff$value, levels =rev(c('strongly agree','agree ','neither agree nor disagree','disagree','strongly disagree')))
ggplot(ff)+geom_bar(aes(variable,fill=value))+ coord_flip()

A couple of notable issues with the resulting diagram:

– the colours aren’t that pleasing to look at;
– we have lost all sense of correlation between values. We may like to think that the agree/strongly agree ratings from one question are corrleated with the disagree/strongly disagree responses from the other, but there is nothing in that chart that says this for sure…

However, a pairwise comparison may help…

#Let's count how many times the different scale values occur with each other, and then plot some sort of correlation plot.
fs=subset(fs,enjoyCompany!='' & tooMuchFamily!='')
fs$enjoyCompany <- factor(fs$enjoyCompany, levels =rev(c('strongly agree','agree ','neither agree nor disagree','disagree','strongly disagree')))
fs$tooMuchFamily <- factor(fs$tooMuchFamily, levels =rev(c('strongly agree','agree ','neither agree nor disagree','disagree','strongly disagree')))

If I had rather more than two question columns, how would I generate a lattice of pairwise correlation charts to get a visual overview of the how all the question answers interact at the pairwise level?

To leave a comment for the author, please follow the link and comment on their blog: OUseful.Info, the blog... » Rstats.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...


Comments are closed.

Search R-bloggers


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)