I have been reading more and more about how people can’t interpret charts, which honestly had never occurred to me. It made me want to test people informally and see for myself. So I’ve been doing just that: showing colleagues, friends, etc. a detail-heavy chart that we created interactively during the first Accra R-Users session, and asking them to analyze it at length. The results have been staggering! I’m still trying to generalize my conclusions, but thought it would be fun to open up this test to the community, so here goes! If you feel like sharing, post your observations in the comments section.
“The following chart shows the IMDb ratings for ~60k movies over the years. Movies are split by genre (a movie with multiple genres shows up in each of them), and budgets are shown in color. All points are mostly transparent, so darker patches mean more movies. Talk for 3 minutes about what this chart is showing, try to explain what you see, and think about what other analysis should follow.”
The R code to produce this chart follows, or you can find the entire exploratory exercise on the GitHub page.
library(ggplot2)
library(ggplot2movies)
library(tidyr)
library(dplyr)

## Drop the rating-distribution columns (r1..r10), gather the genre flags into
## one column, keep only the genres each movie belongs to, then plot by genre
movies %>%
  select(-(r1:r10)) %>%
  gather(key = genre, value = val, Action:Short) %>%
  filter(val == 1) %>%
  ggplot(aes(x = year, y = rating, color = budget, label = title)) +
  geom_point(alpha = 0.1) +
  facet_wrap(~genre) +
  scale_color_gradient(low = "red", high = "green") +
  ggtitle("Movie ratings by year")
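If it isn't obvious why a movie with several genres shows up in several panels, here is a minimal sketch of the gather-and-filter step on a tiny made-up data frame. The `toy` data and its titles are invented for illustration and are not taken from ggplot2movies; only the reshaping pattern mirrors the code above.

```r
library(tidyr)
library(dplyr)

# Toy data, made up for illustration: one row per movie, one 0/1 flag per genre
toy <- data.frame(
  title  = c("A", "B", "C"),
  Action = c(1, 0, 1),
  Comedy = c(1, 1, 0),
  stringsAsFactors = FALSE
)

# Same pattern as the chart code: stack the genre flags into (genre, val) pairs,
# then keep only the pairs where the movie actually belongs to the genre
long <- toy %>%
  gather(key = genre, value = val, Action:Comedy) %>%
  filter(val == 1)

# A movie flagged in two genres now contributes two rows, one per facet
rows_per_title <- count(long, title)
print(rows_per_title)
```

Movie "A" is flagged as both Action and Comedy, so it ends up with one row per genre and would be drawn once in each panel. That is worth keeping in mind when comparing how dense the panels look, since the same movie can be counted more than once across them.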