Twitter Analysis of the US Presidential Debate

stathack

9 years ago

[This article was first published on NERD PROJECT » R project posts, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The following are word clouds of tweets for each candidate from the October 16, 2012 debate with the bigger words the more often they were used in tweets (click on each word cloud to enlarge):

And the net-negative posts for each candidate:

Please note that the bigger the word is in the word cloud the more often it was used.

The R code for creating the word clouds

The following code was adapted from here and is an extension of previous work:

ap.corpus <- Corpus(DataframeSource(data.frame(as.character(romneypositive[,3])))) ap.corpus <- tm_map(ap.corpus, removePunctuation) ap.corpus <- tm_map(ap.corpus, tolower) ap.corpus <- tm_map(ap.corpus, function(x) removeWords(x, c(r,stopwords("english")))) ap.tdm <- TermDocumentMatrix(ap.corpus) ap.m <- as.matrix(ap.tdm) ap.v <- sort(rowSums(ap.m),decreasing=TRUE) ap.d <- data.frame(word = names(ap.v),freq=ap.v) table(ap.d$freq) pal2 <- brewer.pal(8,"Dark2") png("romneypositive.png", width=1280,height=800) wordcloud(ap.d$word,ap.d$freq, scale=c(8,.2),min.freq=3, max.words=Inf, random.order=FALSE, rot.per=.15, colors=pal2) dev.off()

Disclaimer: Any errors can be attributed to the fact that I was drinking heavily, that I was in Dallas, and that it was half four in the morning when I finished writing this.

To leave a comment for the author, please follow the link and comment on their blog: NERD PROJECT » R project posts.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.