Analyzing movie connections with R

January 4, 2016

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

One of the themes of the Christmas movie classic Love Actually is the interconnections between people of different communities and cultures, from the Prime Minister of the UK to a young student in London. Stack Overflow's David Robinson brings these connections to life by visualizing the network diagram of 20 characters in the movie, based on scenes in which they appear together:

[Figure: Love Actually character network, scene 55]

That graph is based on all but the last scene in the movie (where most of the characters come together in the airport, which makes for a less interesting cluster diagram). Until that point, Billy and Joe's story takes place independently of all the other characters, while five other characters are connected to the rest by just one scene (Mark and Sarah's conversation at a wedding). David even created an interactive Shiny app (from which I grabbed the chart above) that lets you step through the movie scene by scene and watch the connections develop as the movie unfolds.

The network analysis behind the chart and the app was done entirely in the R language. David began by parsing the text of the movie script, which yields a data file of each character's lines labelled by scene number. From there, he created a co-occurrence matrix counting the number of times each pair of characters shared a scene, from which it was a simple process to generate the network diagram using the igraph package. David helpfully provided the R code, so if you have another movie script at hand, it should be easy to adapt. You can learn more about the details of the analysis in David's blog post, linked below.
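
To give a rough idea of the co-occurrence step, here is a minimal sketch in R. It is not David's code: the toy data frame, its column names (scene, speaker) and the character labels A to D are made up for illustration, and the only assumption carried over from the post is that each line of the script is labelled with a scene number. It counts shared scenes for each pair of characters and hands the resulting matrix to igraph.

```r
library(igraph)

# Hypothetical toy data: one row per spoken line, labelled by scene number.
# The column names and characters are invented for this sketch.
lines <- data.frame(
  scene   = c(1, 1, 1, 2, 2, 3, 3, 3),
  speaker = c("A", "B", "A", "B", "C", "A", "B", "D"),
  stringsAsFactors = FALSE
)

# Character-by-scene incidence matrix: 1 if the character speaks in the scene,
# 0 otherwise (repeated lines within a scene count only once).
incidence <- (unclass(table(lines$speaker, lines$scene)) > 0) * 1

# Co-occurrence matrix: entry [i, j] is the number of scenes shared by i and j.
cooc <- incidence %*% t(incidence)
diag(cooc) <- 0  # drop each character's pairing with themselves

# Undirected weighted graph: edge weight = number of shared scenes.
g <- graph_from_adjacency_matrix(cooc, mode = "undirected", weighted = TRUE)
plot(g, edge.width = E(g)$weight)
```

Multiplying the incidence matrix by its transpose is one common way to get pairwise counts. With a real script, the data frame would come from the parsing step instead of being typed in by hand.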

Variance Explained: Analyzing networks of characters in 'Love Actually'
