(This article was first published on **"R-bloggers" via Tal Galili in Google Reader**, and kindly contributed to R-bloggers)

I guess this is not the number one post I would like to start with on this blog, but I feel the time is right for it (community-wise).

I’ll move on to the subject matter in a moment, but first a short intro: This blog is written by Tal Galili. I am an aspiring statistician who also loves to use R for my work. I am also a WordPress blogger, writing mainly at www.TalGalili.com, where I can use my native language (Hebrew) for self-expression.

This combination of **statistics** and **blogging** will sometimes lead me to write posts that are less statistical and more Web/Open-Source oriented, like this one. So to the statisticians in the audience I extend my apologies, and invite you to wait for future posts that will focus more fully on **Statistics and R**.

And now for the topic at hand...

* * * * *

I have just noticed the nice article published on the WordPress development blog, titled “2.9 Features Vote Results”. The post exemplifies a wonderful trend in the WordPress community (led by Jane Wells): connecting the core team with the WordPress user community. Jane does this by giving surveys to WordPress users, which in turn gives the WordPress core team an opportunity to understand the community's needs.

In the post “2.9 Features Vote Results”, Jane presented the results of such a survey. The post had tables and barplots, but the barplots were only present for the one-dimensional variables. In contrast, more elaborate data, such as that of question 2 (asking to rate each of 11 potential features on a scale of 1 to 5), was shown only as a table.

The table gives the full information (although I would love it if it were easily downloadable, instead of having to type in the numbers), but its main limitation is that one cannot easily take in (let alone understand) anything from a quick look.

For the goal of understanding more of the results with a quick glance, I offer two simple and well-known visualizations for the results.

1) Parallel barplots

This plot can be easily implemented in Excel (although I did it in R), and it allows us to compare the different ratings each potential feature received.

For example, it shows us that for most features the most common answer was rank 4 (“would be nice”).
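A minimal sketch of such a plot in R, under stated assumptions: the counts below are made up for illustration (they are not the survey's numbers), only three of the 11 features are included (their names are taken from this post), and `ratings` is a hypothetical variable name.

```r
# Hypothetical counts, NOT the real survey results.
# Rows are features, columns are the ratings 1 to 5.
ratings <- matrix(
  c(50, 120, 300,  900, 700,
    40, 100, 350, 1050, 650,
    80, 150, 400,  950, 640),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    feature = c("easier embeds", "custom image sizes", "media album"),
    rating  = as.character(1:5)
  )
)

# barplot() draws one group of bars per column of its input, so the
# matrix is transposed to get one group per feature, with five bars
# (one per rating) inside each group.
mids <- barplot(
  t(ratings),
  beside = TRUE,                        # bars side by side, not stacked
  las = 1,                              # horizontal axis labels
  legend.text = TRUE,                   # legend built from the rating labels
  args.legend = list(title = "rating"),
  ylab = "number of answers"
)
```

With these invented counts, the tallest bar in every group is the one for rating 4, which is the kind of pattern the real plot made visible at a glance.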

2) Mosaic plot

I don’t know if this can be done in Excel, but with R it is just a simple line of code.

(`mosaicplot(DataSet.table, las = 1, col = c("gray", "gray", "blue", 3, "darkgreen"), main = "")`)

The advantage of this plot is that it allows us to compare the different features easily, while not only comparing the top rank, but also combining different rankings for easy comparison (for example, comparing how many rank 4 or 5 each feature received).
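To make the call above reproducible, here is a hedged sketch of how a table like `DataSet.table` might be built by hand. The counts are invented for illustration and cover only three of the 11 features. The name `DataSet.table` comes from the one-liner in this post, but its real contents are not shown there, so the structure used here (features in rows, ratings in columns) is an assumption.

```r
# Hypothetical counts, NOT the real survey results.
# Rows are features, columns are the ratings 1 to 5.
ratings <- matrix(
  c(50, 120, 300,  900, 700,
    40, 100, 350, 1050, 650,
    80, 150, 400,  950, 640),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    feature = c("easier embeds", "custom image sizes", "media album"),
    rating  = as.character(1:5)
  )
)

# mosaicplot() accepts a contingency table, so convert the matrix.
DataSet.table <- as.table(ratings)

# One colour per rating level. `color` is the documented argument name;
# the bare 3 is coerced to "3" and read as the third palette colour.
mosaicplot(
  DataSet.table,
  las = 1,
  color = c("gray", "gray", "blue", 3, "darkgreen"),
  main = ""
)
```

Each feature gets a column of the mosaic, and the height of each coloured tile is the share of that rating within the feature, so the combined height of the top two tiles shows the share of "4 or 5" answers directly.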

So for example, the plot shows me that the feature rated 5 most often was “easier embeds”, but the feature rated 4 or 5 most often was “custom image sizes”. The feature “media album” came close to these two, but didn’t top either.
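The same two comparisons can also be read off the numbers directly. The sketch below uses made-up counts (not the survey's), chosen only so that they reproduce the pattern just described. `ratings`, `top_by_five` and `top_by_four_or_five` are hypothetical names.

```r
# Hypothetical counts, NOT the real survey results.
# Rows are features, columns are the ratings 1 to 5.
ratings <- matrix(
  c(50, 120, 300,  900, 700,
    40, 100, 350, 1050, 650,
    80, 150, 400,  950, 640),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    feature = c("easier embeds", "custom image sizes", "media album"),
    rating  = as.character(1:5)
  )
)

# Feature that received rating 5 most often.
top_by_five <- names(which.max(ratings[, "5"]))

# Feature that received rating 4 or 5 most often.
four_or_five <- rowSums(ratings[, c("4", "5")])
top_by_four_or_five <- names(which.max(four_or_five))

print(top_by_five)           # "easier embeds"
print(top_by_four_or_five)   # "custom image sizes"
```

If the features had different numbers of answers, comparing shares (for example with `prop.table(ratings, margin = 1)`) rather than raw counts would be the fairer comparison.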

**Conclusions from this post**:

- It would be nice (if possible) to publish the full data of the surveys, not just the results.
- The second question of the survey gives a different answer than the first question. But since the difference in percentage seems to be so small compared to the other options, I would guess that all of the top 4 features are at more or less the same level of interest to the community.

p.s. to Jane – why do none of the numbers in this table add up to 3406 (the number of respondents)?

p.p.s. to Jane and the Dev team – great work people!
