Analysing comments to “Star Wars: The Last Jedi” – part 1

December 19, 2017

(This article was first published on Johannes Friedrich's R Blog, and kindly contributed to R-bloggers)

Fulfill your destiny –

“Star Wars: The Last Jedi” has been in cinemas since December 14th (in Germany). I went to the midnight premiere, a double feature of Episodes 7 and 8.
Today I want to present a combination of some things I love: Star Wars and R.

“Star Wars: The Last Jedi” was discussed controversially in the community, and I followed the discussion on the German Star Wars Union homepage.

I want to show briefly how the users' comments developed during the last week. For the analysis I used the wonderful R package rvest and the packages included in the tidyverse, especially dplyr and ggplot2.

rvest is used for web scraping, dplyr for data manipulation and ggplot2 for visualisation.
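As a rough illustration of the scraping step, here is a minimal sketch with rvest. The HTML snippet, the CSS classes (`.comment`, `.author`, `.date`) and the column names are invented for this example; the real Star Wars Union pages will look different, so the selectors would have to be adapted.

```r
library(rvest)

# Hypothetical markup standing in for a comment page (an assumption,
# not the structure of the real site).
page <- minimal_html('
  <div class="comment">
    <span class="author">anakin</span>
    <span class="date">2017-12-14 19:05</span>
  </div>
  <div class="comment">
    <span class="author">leia</span>
    <span class="date">2017-12-14 19:40</span>
  </div>
')

# Select every comment node, then pull author and timestamp out of each.
nodes <- html_elements(page, ".comment")

comments <- data.frame(
  user = html_text2(html_elements(nodes, ".author")),
  time = as.POSIXct(
    html_text2(html_elements(nodes, ".date")),
    format = "%Y-%m-%d %H:%M", tz = "UTC"
  )
)

print(comments)
```

For a live page, `minimal_html()` would be replaced by `read_html()` with the page URL.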

See here for the source code of the analysis.
The colour used for the first two plots is modelled on the Star Wars Episode 8 colour; in R (and therefore in ggplot2) it is called red3.
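To give an idea of how such a plot comes together, here is a small sketch that counts comments per hour and draws them in red3. The toy data frame `comments` and its columns `user` and `time` are assumptions for this example and not the scraped data; the linked source code remains the reference for the real analysis.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data; the real analysis works on the scraped comments.
comments <- tibble(
  user = c("anakin", "leia", "anakin", "rey", "leia", "anakin"),
  time = as.POSIXct(c(
    "2017-12-14 19:05:00", "2017-12-14 19:40:00", "2017-12-14 20:10:00",
    "2017-12-15 19:15:00", "2017-12-15 19:55:00", "2017-12-15 21:30:00"
  ), tz = "UTC")
)

# Round every timestamp down to the full hour and count comments per hour.
per_hour <- comments %>%
  mutate(hour = as.POSIXct(trunc(time, units = "hours"))) %>%
  count(hour, name = "n_comments")

# Bar chart in the Episode 8 style colour.
p <- ggplot(per_hour, aes(x = hour, y = n_comments)) +
  geom_col(fill = "red3") +
  labs(x = "Time", y = "Number of comments", title = "Comments per hour")

print(p)
```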

Some results:

Comments per hour

Users with the most comments

I decided to list all users with more than 20 comments in total. My original plan was to plot the total number of comments per user across all days and show how the top users changed, but I settled on this simple plot instead. To show the development of a variable dynamically, I recommend the great R package gganimate. See the project page for further examples.
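The filtering step can be sketched with dplyr roughly like this. The user names and comment counts are made up, and the column name `n_comments` is my own choice for the example.

```r
library(dplyr)

# Hypothetical toy data: one row per comment, only the user name is needed.
comments <- tibble(
  user = rep(c("anakin", "leia", "rey"), times = c(25, 21, 5))
)

# Count comments per user, sort descending and keep users with more than 20.
top_users <- comments %>%
  count(user, sort = TRUE, name = "n_comments") %>%
  filter(n_comments > 20)

print(top_users)
```

With the toy data above, only `anakin` and `leia` remain, because `rey` has just 5 comments.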

Comments per day and hour

Most comments were written between 7 pm and 8 pm or, more generally, in the evening. It is also obvious that the most comments in total were written on December 14th and 15th.
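A count per day and hour of the day could be computed along these lines. Again, the timestamps are invented, and drawing the result as a heatmap with `geom_tile()` is only one possible choice and not necessarily the plot type used above.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy timestamps; the real analysis uses the scraped comments.
comments <- tibble(
  time = as.POSIXct(c(
    "2017-12-14 19:05:00", "2017-12-14 19:40:00", "2017-12-14 20:10:00",
    "2017-12-15 19:15:00", "2017-12-15 19:55:00", "2017-12-15 19:58:00",
    "2017-12-15 21:30:00", "2017-12-16 19:20:00"
  ), tz = "UTC")
)

# Split each timestamp into calendar day and hour of day, then count.
per_day_hour <- comments %>%
  mutate(
    day  = as.Date(time),
    hour = as.integer(format(time, "%H"))
  ) %>%
  count(day, hour, name = "n_comments")

# The busiest day/hour combination.
busiest <- per_day_hour %>%
  slice_max(n_comments, n = 1)

# One tile per day and hour, darker red means more comments.
p <- ggplot(per_day_hour, aes(x = hour, y = factor(day), fill = n_comments)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "red3") +
  labs(x = "Hour of day", y = "Day", fill = "Comments")

print(p)
```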

I hope you had fun looking at these statistics and seeing how easy it is to create such plots with ggplot2. Next time I want to show a word cloud of the most common words used in the comments and some further fun things …

May the force be with you!
