Creating an animation using R

[This article was first published on DataScience+, and kindly contributed to R-bloggers].

In this post, I will show you how to create an animation using R and ffmpeg. The idea is simple:

  • Generate a number of snapshots
  • Combine them in a video file using ffmpeg

The best way to learn about the art of animation is by doing it ourselves, so our worked example is the well-known Broken Stick Problem.

Here is the scenario:
Let there be a stick of length 1. Pick two points uniformly at random along the stick and break the stick at those points. What is the probability that the three resulting pieces can form a triangle? In the first part of this post, I derive the probability analytically. In the second part, I use Monte Carlo simulation to arrive at the same answer.

Analytical Solution

The triangle inequality states that for any triangle, the sum of the lengths of any two sides must be greater than the length of the remaining side. Let \(X_1\) and \(X_2\) be the two break points and suppose that \(X_2 > X_1\). The sides of our triangle would then have the following lengths: \(X_1\text{, } X_2 - X_1 \text{ and } 1 - X_2\). Because the three lengths sum to 1, the triangle inequality holds exactly when every piece is shorter than one half: a piece of length one half or more would be at least as long as the other two combined. In other words, the following conditions must be satisfied: (a) \(X_1 < \dfrac{1}{2}\) (b) \(X_2 - X_1 < \dfrac{1}{2}\) (c) \(1 - X_2 < \dfrac{1}{2}\) (d) the lengths of the 3 sides sum to 1.

The above conditions could also be established for the case where \(X_1 > X_2\). The diagram below displays the above conditions. Since \((X_1, X_2)\) is uniformly distributed over the unit square, the probability of obtaining a triangle equals the area of the region where conditions (a)-(d) all hold.

[Figure: Solution Triangle]

Looking at the two areas (red and blue) above, the probability of obtaining a triangle is:

\(P(triangle) = Area(X_1 > X_2) + Area(X_2 > X_1) = 1/8 + 1/8 = 1/4\)
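As a quick sanity check on the area argument, here is a minimal sketch that evaluates the conditions on a regular grid over the unit square and measures the share of grid points that qualify. The helper name can_form_triangle and the grid size of 400 are my own choices for illustration, not part of the derivation above.

```r
# Hypothetical helper (illustration only): TRUE when breaking the unit stick
# at x1 and x2 yields three pieces that can form a triangle, i.e. when the
# longest piece is shorter than one half.
can_form_triangle <- function(x1, x2) {
  a <- pmin(x1, x2)              # left break point
  b <- pmax(x1, x2)              # right break point
  pmax(a, b - a, 1 - b) < 0.5    # longest of the three pieces
}

can_form_triangle(0.3, 0.6)  # pieces 0.3, 0.3, 0.4 -> TRUE
can_form_triangle(0.1, 0.3)  # pieces 0.1, 0.2, 0.7 -> FALSE

# Approximate the area of the valid region with cell midpoints on an n x n grid
n <- 400
mid <- (seq_len(n) - 0.5) / n
grid <- expand.grid(x1 = mid, x2 = mid)
area_estimate <- mean(can_form_triangle(grid$x1, grid$x2))
area_estimate  # should be close to 1/4
```

The grid estimate is only an approximation, since cells that straddle the boundary of the region are counted as fully in or fully out, but it should approach 1/4 as n grows.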

Simulated Solution

The estimation of the probability is done using a Monte Carlo algorithm. Here is my proposed approach to solve this problem:

  • Generate 2 random points from a uniform distribution between 0 and 1
  • Check whether the two values obtained satisfy conditions (a)-(d); if yes, give 1 to that pair, otherwise give 0
  • Repeat the above steps multiple times (in this case 1000 times)
  • Compute the cumulative empirical probability after each iteration

The R-code is as follows:

suppressPackageStartupMessages(require(ggplot2))
set.seed(2016)
index_Score <- function(){
  # Picking 2 points randomly on the stick at the same time
  x <- runif(n = 2, min = 0, max = 1) 
  a <- min(x) # first point
  b <- max(x) # second point
  # pieces of the stick with their respective length
  pieces <- c(a, b-a, 1-b)
  cond1 <- sum(pieces[c(1,2)]) > pieces[3] # condition # 1
  cond2 <- sum(pieces[c(1,3)]) > pieces[2] # condition # 2
  cond3 <- sum(pieces[c(3,2)]) > pieces[1] # condition # 3
  combine_conds <- ifelse(cond1 & cond2 & cond3, 1, 0) # if all 3 conditions are satisfied
  return(combine_conds)
}

total <- 1000
# Run the experiment `total` times; each call returns 1 (triangle) or 0 (no triangle)
cnt <- replicate(total, index_Score())
# Running proportion of successes after each iteration
df <- data.frame(Incrmt = seq_len(total), Probs = cumsum(cnt) / seq_len(total))
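For larger simulations, the same experiment can be run without calling a function once per iteration. The following is a minimal vectorized sketch, assuming the same seed and sample size as above; the variable names (n_sim, pts, hits, running_prob) are mine. It draws the random numbers in a different order than the code above, so it will not reproduce exactly the same path, only the same limiting behaviour.

```r
set.seed(2016)
n_sim <- 1000

# One row per experiment, one column per break point
pts <- matrix(runif(2 * n_sim), ncol = 2)
a <- pmin(pts[, 1], pts[, 2])   # left break point
b <- pmax(pts[, 1], pts[, 2])   # right break point

# A triangle is possible exactly when the longest piece is shorter than 1/2
hits <- pmax(a, b - a, 1 - b) < 0.5

# Cumulative empirical probability after each iteration
running_prob <- cumsum(hits) / seq_len(n_sim)
tail(running_prob, 1)  # should be in the neighbourhood of 0.25
```

Checking the longest piece against one half is equivalent to testing the three triangle inequalities separately, because the three lengths always sum to 1.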

The final step consists of generating the snapshots (1000 of them) depicting the iterations and the cumulative probabilities and encoding these images into a video format.

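# Make sure the output folder exists before saving any frame
dir.create("images", showWarnings = FALSE)
for(i in 1:total) {
  # Keep only the iterations up to the current frame
  sub_df <- df[df$Incrmt <= i, ]
  # qplot() is deprecated in recent ggplot2 releases, so build the plot with ggplot()
  simul_plot <- ggplot(sub_df, aes(x = Incrmt, y = Probs)) + geom_path() + 
    labs(x = "iterations", y = "Probabilities", title = "Monte Carlo Simulation") + ylim(c(0,0.4)) + 
    geom_hline(yintercept = 0.25, colour = "red", linetype = "longdash")
  # ggsave() opens and closes its own graphics device, so no dev.off() is needed
  ggsave(plot = simul_plot, filename = sprintf("images/brokenstick_%02d.png", i), limitsize = FALSE)
}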

ffmpeg is quite a nice tool which can be installed on Linux, Windows and Mac OS. Running the following command in a terminal, from the folder that contains the images directory, produces a video file in the MP4 format:

ffmpeg -r 10 -i images/brokenstick_%02d.png -b:v 20M BrokenStick_video.mp4

-r 10 sets the frame rate (10 frames per second here), -i gives the file name pattern of the snapshots saved by the R loop, and -b:v 20M sets the video bitrate of the output file.
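If you would rather stay inside R, the same command can be launched with system2(). This is a minimal sketch under a few assumptions of mine: the helper build_ffmpeg_args is hypothetical, ffmpeg is expected to be on the PATH, and I add the -y flag (overwrite the output file without asking) so that the call cannot block on a prompt in a non-interactive session.

```r
# Hypothetical helper (illustration only): assemble the ffmpeg arguments
build_ffmpeg_args <- function(pattern, output, fps = 10, bitrate = "20M") {
  c("-y",                      # assumption: overwrite the output without prompting
    "-r", as.character(fps),   # frame rate
    "-i", pattern,             # file name pattern of the snapshots
    "-b:v", bitrate,           # video bitrate
    output)
}

pattern <- "images/brokenstick_%02d.png"
args <- build_ffmpeg_args(pattern, "BrokenStick_video.mp4")

# Only call ffmpeg when it is installed and the first snapshot exists
if (nzchar(Sys.which("ffmpeg")) && file.exists(sprintf(pattern, 1))) {
  status <- system2("ffmpeg", args)
  if (status != 0) warning("ffmpeg exited with status ", status)
} else {
  message("ffmpeg or the snapshots were not found; nothing was encoded.")
}
```

Keeping the argument construction in its own function makes it easy to inspect the command before running it, for example with cat(args).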

The result is the following video

And voila! Done.

When it comes to rendering great animations, either to understand the dynamics behind data or just to impress your manager and/or colleagues, R provides multiple options. The R package I would recommend for animations is animation, which provides functions to save animations as Flash, GIF, HTML pages, PDFs and videos.
If you have enjoyed this post, please drop me some comments/suggestions.

P.S.: the background music is by Nas – The World Is Yours (Instrumental) produced by Pete Rock
