Further Adventures in Visualisation with ggplot2

April 25, 2011

(This article was first published on Psychwire » R, and kindly contributed to R-bloggers)

So I previously took a look at some data on player performance from a computer game. In this post, I’m going to do some further visualisations using ggplot2. The data consists of different types of player character, different roles for those characters, and their overall damage output (the unit here is damage per second, or DPS). To obtain the data, I took the top 40 scores from this website and pasted them into a spreadsheet (i.e., I didn’t try to kill their server by scraping the data; I copied it all by hand. How nice!).

So let’s begin. First, I want to take a look at some boxplots. But I don’t want them to be ordinary boxplots: I want them to be ordered by how well the players were able to score. So, I begin by sorting them by their median, and then plotting them.

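```r
library(ggplot2)

# Reorder the levels of spec by the median DPS within each spec
ordered_spec = with(full_list, reorder(spec, DPS, median))

# Boxplots of DPS per spec, filled by class, with rotated x-axis labels
ggplot(full_list, aes(ordered_spec, DPS, fill = class)) +
  geom_boxplot() +
  opts(axis.text.x = theme_text(angle = 90, hjust = 0, size = 7))
```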

The boxplot is produced by the simple geom_boxplot() command. To order the data, I used the reorder() command, which reorders the levels of the spec factor according to the median of DPS within each level. The reordered factor is then used in the aesthetic mappings of the ggplot() command, so the boxes come out in that order.
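To see what reorder() does on its own, here is a minimal sketch using base R only. The toy data frame and its values are made up for illustration and are not the data from the post; only the column names spec and DPS are borrowed from it.

```r
# Toy data standing in for full_list (values are invented for illustration)
toy <- data.frame(
  spec = factor(c("arcane", "arcane", "arcane",
                  "fury", "fury", "fury",
                  "subtlety", "subtlety", "subtlety")),
  DPS  = c(30, 31, 32, 25, 26, 60, 10, 11, 12)
)

# By default, factor levels are in alphabetical order
levels(toy$spec)
# "arcane" "fury" "subtlety"

# reorder() sorts the levels by the median DPS within each spec (ascending)
ordered_spec <- with(toy, reorder(spec, DPS, median))
levels(ordered_spec)
# "subtlety" "fury" "arcane"
```

Note that fury has one outlying score of 60, which would put it on top if the levels were sorted by mean; sorting by median keeps it in the middle.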

A quick note: when I first tried to reorder factors for plots, I tried to do it within ggplot itself. This was a mistake, as it’s not easy to do so. After much hunting around, I saw that it’s better to reorder your factors before you put them into ggplot; the output will then come out in the right order.
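For anyone following along on a more recent ggplot2 release, opts() and theme_text() have since been replaced by theme() and element_text(). The sketch below shows the same reorder-first approach with the newer calls. The toy data frame is invented, and its spec and class values are placeholders rather than the real data.

```r
library(ggplot2)

# Made-up stand-in for full_list; the real data is not reproduced here
set.seed(1)
toy <- data.frame(
  spec  = factor(rep(c("arcane", "fury", "subtlety", "holy"), each = 10)),
  class = rep(c("mage", "warrior", "rogue", "priest"), each = 10),
  DPS   = c(rnorm(10, 30), rnorm(10, 26), rnorm(10, 11), rnorm(10, 8))
)

# Reorder the factor first, then hand it to ggplot
toy$ordered_spec <- with(toy, reorder(spec, DPS, median))

p <- ggplot(toy, aes(ordered_spec, DPS, fill = class)) +
  geom_boxplot() +
  # theme() and element_text() take the place of opts() and theme_text()
  theme(axis.text.x = element_text(angle = 90, hjust = 0, size = 7))
print(p)
```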

Anyway, here’s the graph:

You can see that there’s quite a range of performance. The poorer-performing groups are, for the most part, those that have other roles and so shouldn’t be high on DPS. That is, all apart from subtlety, which is not doing so well. That, too, really has another role, but it’s surprising to see it so low (I remember it being quite good for DPS, though that was about five years ago now).

Next, let’s take a look at something slightly different. In the data, we also have the seconds column, which lets us know how many seconds a player was active for. Perhaps it’s the case that players get tired, so a plot of their performance by how long they were active for might be revealing. It may alternatively be the case that a shorter period of time will benefit players because they can use special abilities which increase their damage output – though these abilities can only be used every few minutes. This could mean that a player who uses all of their special abilities and then dies (so their time stops) may have a high DPS output.

```r
ggplot(full_list_dps) +
  aes(x = seconds, y = DPS, colour = class) +
  geom_point() +
  scale_colour_hue()
```

Here, we just need to specify the x and y axis values. The points are plotted using the geom_point() command. Colours are added using scale_colour_hue(). There is a wide variety of colour options that can be used. Here’s the graph:
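As a rough illustration of those colour options, the sketch below swaps scale_colour_hue() for two other scales that ggplot2 provides. The data, class names and colour choices here are all made up; only the column names match the post.

```r
library(ggplot2)

# Invented stand-in for full_list_dps
set.seed(2)
toy <- data.frame(
  seconds = runif(60, 100, 600),
  DPS     = rnorm(60, 25, 5),
  class   = rep(c("mage", "rogue", "warrior"), times = 20)
)

base_plot <- ggplot(toy) +
  aes(x = seconds, y = DPS, colour = class) +
  geom_point()

# Evenly spaced hues, as used in the post
p_hue <- base_plot + scale_colour_hue()

# A ColorBrewer qualitative palette
p_brewer <- base_plot + scale_colour_brewer(palette = "Set1")

# Hand-picked colours, one per class
p_manual <- base_plot + scale_colour_manual(
  values = c(mage = "steelblue", rogue = "gold3", warrior = "firebrick")
)

print(p_brewer)
```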

Most of the points cluster together, though there does seem to be a weak downwards trend. Let’s just run a correlation for the sake of it, shall we?

```r
cor.test(full_list_dps$seconds, full_list_dps$DPS)
```

The output says there is a significant (p<.0001) negative correlation of -0.39.
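If you want the numbers themselves rather than the printed summary, the object returned by cor.test() can be picked apart. This is a sketch on simulated data with a built-in negative relationship, so the values will not match the ones reported above.

```r
# Simulated stand-in for full_list_dps: DPS falls slightly as seconds rise
set.seed(3)
toy <- data.frame(seconds = runif(100, 100, 600))
toy$DPS <- 40 - 0.02 * toy$seconds + rnorm(100, sd = 5)

result <- cor.test(toy$seconds, toy$DPS)

result$estimate   # Pearson's r
result$p.value    # p-value for the null hypothesis that r = 0
result$conf.int   # 95% confidence interval for r
```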

Finally, let’s break it down and facet the output, so we can look at each class individually.

```r
ggplot(full_list_dps) +
  aes(x = seconds, y = DPS, colour = spec) +
  facet_wrap(~ class) +
  geom_point() +
  scale_colour_hue()
```

That gives us this:

That’s all for now. Up next will be some methods for summarising the data, followed by statistical tests (starting with ANOVAs, then moving on to LMEs). Again, just note that this is for fun, and not intended to be an accurate account of player performance by any remote stretch of the imagination.

