Word Clouds in R
[This article was first published on Mollie's Research Blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Thanks to the wordcloud package, it’s super easy to make a word cloud or tag cloud in R.
In this case, the words have been counted already. If you are starting with plain text, you can use the text mining package tm to obtain the counts. Other bloggers have provided good examples of this. I’ll just be covering the simple case where we already have the frequencies.
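If you are starting from raw text and do not want to set up tm, the counting step can be sketched in base R. This is only a rough illustration, not the tm workflow mentioned above: the sample sentence is made up, the names text, tokens, counts, and freqs are placeholders, and there is no stop-word removal or stemming.

```r
# Hypothetical sample text, made up for illustration only.
text <- "Jobs and families matter. Jobs, jobs, and more jobs for families."

# Lower-case the text and split on anything that is not a letter or apostrophe.
tokens <- unlist(strsplit(tolower(text), "[^a-z']+"))
tokens <- tokens[tokens != ""]   # drop any empty strings left by the split

# Count occurrences and sort from most to least frequent.
counts <- sort(table(tokens), decreasing = TRUE)

# Store the result as a words/frequencies data frame.
freqs <- data.frame(word = names(counts),
                    freq = as.vector(counts),
                    stringsAsFactors = FALSE)
print(freqs)
```

The resulting freqs$word and freqs$freq could then be passed to wordcloud() as the words and frequencies. Note that wordcloud() has a min.freq argument that defaults to 3, so with counts this small you would likely need min.freq = 1.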
Let’s look at some commonly used words during the National Conventions this year. The New York Times produced a cool infographic that we’ll use as our data source. The data in csv format (and the R code too) are available in a gist.
First we need to load up the packages and our data:
    library(wordcloud)
    library(RColorBrewer)
    conventions <- read.table("conventions.csv", header = TRUE, sep = ",")
Then we can use the wordcloud() function to produce our clouds:
    png("dnc.png")
    wordcloud(conventions$wordper25k,            # words
              conventions$democrats,             # frequencies
              scale = c(4, 1),                   # size of largest and smallest words
              colors = brewer.pal(9, "Blues"),   # number of colors, palette
              rot.per = 0)                       # proportion of words to rotate 90 degrees
    dev.off()

    png("rnc.png")
    wordcloud(conventions$wordper25k,
              conventions$republicans,
              scale = c(4, 1),
              colors = brewer.pal(9, "Reds"),
              rot.per = 0)
    dev.off()
DNC word cloud
RNC word cloud
The default word cloud has some words rotated 90 degrees, but I prefer to use rot.per = 0 to make them all horizontal for readability.
Since the size already denotes the frequency of the word, you can easily switch to a single color if you prefer, by setting colors to "red3", for example:
    png("rncalt.png")
    wordcloud(conventions$wordper25k,
              conventions$republicans,
              scale = c(4, 1),
              colors = "red3",
              rot.per = 0)
    dev.off()

RNC single color
DNC single color
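A middle ground between the full palette and a single color is to keep the palette but drop its palest shades, which can be hard to read on a white background. The sketch below assumes that wordcloud() assigns colors from least to most frequent word, as its documentation describes; the names full_palette and dark_palette are just for illustration.

```r
library(RColorBrewer)

# The nine "Blues" shades, ordered from lightest to darkest.
full_palette <- brewer.pal(9, "Blues")

# Keep only shades 4 through 9, dropping the three palest ones.
dark_palette <- full_palette[4:9]
print(dark_palette)
```

Passing colors = dark_palette in place of brewer.pal(9, "Blues") in the calls above should keep the shading while making the less frequent words easier to see.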