Kung Fu R

January 26, 2017

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

A great way to hone your skills as a data scientist is to pick a topic you're passionate about, find some data related to it, and analyze the heck out of it. Jim Vallandingham is clearly passionate about old Kung Fu movies, particularly those from the Shaw Brothers Studio, and has used R to analyze data on the studio's oeuvre: 260 films over 22 years.

The complete R code behind the analysis is included in the post (and you can also find it as an R Markdown document here). Some interesting notes include:

  • Use of the tidyjson package to parse the table scraped from the list of Shaw Brothers Martial Arts Films on Letterboxd.
  • Many applications of the ggplot2 package to visualize directors, actors, film length, Letterboxd watches and likes, and other data about the films.
  • Using the tidytext package to find the common words used in film titles.
  • Using the igraph package to create a network diagram showing actors who appear in films together, shown below.
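
To give a feel for the last item, here is a minimal sketch of how an actor co-appearance network might be built with igraph. This is not Jim's code or his figure: the film and actor names are made-up placeholders, and weighting each edge by the number of shared films is an assumption made for illustration.

```r
library(igraph)

# Toy cast lists (hypothetical films and actors, not the real Shaw Brothers data).
casts <- list(
  film_a = c("Actor 1", "Actor 2", "Actor 3"),
  film_b = c("Actor 1", "Actor 2"),
  film_c = c("Actor 2", "Actor 4")
)

# For each film, list every unordered pair of actors who share the cast.
# Sorting first keeps each pair in a consistent (from, to) order.
pairs <- do.call(rbind, unname(lapply(casts, function(cast) {
  t(combn(sort(cast), 2))
})))

# Count how many films each pair shares; that count becomes the edge weight.
edges <- aggregate(
  weight ~ from + to,
  data = data.frame(from = pairs[, 1], to = pairs[, 2], weight = 1),
  FUN = sum
)

# Build an undirected graph; extra columns in `edges` become edge attributes.
g <- graph_from_data_frame(edges, directed = FALSE)

# Number of distinct co-stars per actor, largest first.
sort(degree(g), decreasing = TRUE)

# Draw the network, with thicker lines for pairs who share more films.
plot(g, edge.width = E(g)$weight)
```

With real data, the `casts` list would be replaced by the cast of each film from the scraped table, and the rest of the steps would stay much the same.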


You can even interact with this network, using Jim's Shaw Brothers Actors Network Visualization. The interactive version is based on JavaScript and D3, and this tutorial explains how it was made. But for the complete analysis of the Kung Fu movies, follow the link below.

Jim Vallandingham: A Data Driven Exploration of Kung Fu Films

