[This article was first published on r-bloggers – Lakers Box Score Breakdown, and kindly contributed to R-bloggers].

Kareem Abdul-Jabbar has sat atop the NBA’s leaderboard of career regular season scoring since taking the top spot from Wilt Chamberlain in 1984. LeBron James, who sits at #3, is the only active player currently in the top 10 and likely needs three more healthy seasons to surpass Kareem.

Bar chart races have become a somewhat controversial data visualization, with detractors decrying them as information overload. But one thing the haters can’t deny is that these charts are attention-grabbing, even captivating. Here’s how to make one using R.

The data needed to create the bar chart race can be found in this Google Sheet. Start by loading the necessary packages and reading in the data (I am using a CSV saved locally with the same data that’s in the Google Sheet referenced above). The code in the rest of this post refers to that data frame as `chart_data`.

```r
library(dplyr)
library(ggplot2)
library(gganimate)
```

The `dplyr` and `ggplot2` packages should be familiar to most R users. The third package, `gganimate`, stitches together several static plots created with `ggplot2` and turns them into an animated plot. Let’s start with how to create each individual static plot.
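The plotting code in this post assumes `chart_data` has one row per player per season, with the columns `YearEnd`, `Player`, `CareerPts`, and `Rank`. As a rough sketch of that shape, here is a made-up stand-in (the player names and point totals are placeholders, not real statistics), with `Rank` derived per season using `dplyr`. Whether the original sheet stores `Rank` directly or computes it this way is an assumption.

```r
library(dplyr)

# Hypothetical stand-in for the real data: placeholder names, made-up totals
toy_data <- data.frame(
  YearEnd   = c(2019, 2019, 2019, 2020, 2020, 2020),
  Player    = c("Player A", "Player B", "Player C",
                "Player A", "Player B", "Player C"),
  CareerPts = c(30000, 28000, 25000, 30000, 31000, 26500)
)

# Rank players within each season, where 1 means the most career points
toy_data <- toy_data %>%
  group_by(YearEnd) %>%
  mutate(Rank = min_rank(desc(CareerPts))) %>%
  ungroup()

toy_data
```

In this toy example, Player B overtakes Player A between 2019 and 2020, which is the kind of rank change the animation is meant to show.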

#### Creating a Static Plot

I’ll walk through a few intermediate steps before showing the more polished version of the chart to demonstrate how `ggplot` allows you to build plots iteratively. We can start by filtering for just one year of data and plotting the top 10 scorers. That can be accomplished using the code below:

```r
chart_data %>%
  filter(YearEnd == 2020) %>%
  ggplot(aes(x = -Rank, y = CareerPts, fill = Player)) +
  geom_tile(aes(y = CareerPts / 2, height = CareerPts),
            width = 0.9) +
  coord_flip()
```

This basic plot uses `geom_tile` rather than `geom_bar`, which works better with the animation we will eventually be using. The way `geom_tile` works is that you specify the center of the tile (i.e. the midpoint of the rectangle, which is the height divided by two, hence `CareerPts / 2`) as well as the width and height. The call to `coord_flip` gives us horizontal bars rather than vertical bars (also note the x and y-axes are now flipped). The reason for specifying `-Rank` as the x aesthetic mapping is so that we get the top-ranking player at the top of the chart rather than the bottom.
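To make the tile geometry concrete, the sketch below builds the same kind of plot on a few made-up totals (placeholder values, not real statistics) and pulls out the edges that `ggplot2` derives from each tile's center, width, and height. It relies on `layer_data`, which returns the computed data for a layer.

```r
library(ggplot2)

# Made-up totals for illustration only (not real statistics)
bars <- data.frame(
  Rank      = 1:3,
  CareerPts = c(30000, 20000, 10000)
)

p <- ggplot(bars, aes(x = -Rank)) +
  geom_tile(aes(y = CareerPts / 2, height = CareerPts), width = 0.9) +
  coord_flip()

# Edges ggplot2 derives from each tile's center and size
edges <- layer_data(p)[, c("x", "xmin", "xmax", "ymin", "ymax")]
edges
```

Each tile should start at zero (`ymin`) and end at the player's total (`ymax`), which is why centering at `CareerPts / 2` reproduces an ordinary bar.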

Next we’ll add the labels for the player names and point totals onto the bars. That can be accomplished with the following code:

```r
chart_data %>%
  filter(YearEnd == 2020) %>%
  ggplot(aes(x = -Rank, y = CareerPts, fill = Player)) +
  geom_tile(aes(y = CareerPts / 2, height = CareerPts),
            width = 0.9) +
  coord_flip() +
  # Add player labels to bars
  geom_text(aes(label = Player), col = "white",
            hjust = "right", nudge_y = -1000) +
  # Add point totals next to bars
  geom_text(aes(label = scales::comma(CareerPts, accuracy = 1)),
            hjust = "left", nudge_y = 1000)
```

The code above adds two calls to `geom_text`, the first of which adds the player labels in white, with the latter adding the point totals. These both inherit their x and y aesthetics from the original call to `ggplot`, which sets their position at the tip of the bars. The `hjust` argument makes the player labels right-justified and the point labels left-justified. The `nudge_y` argument offsets the player labels -1000 along the y-axis (remember our coordinates are flipped, so this is now a horizontal shift), and the point labels +1000. The call to `scales::comma` is for formatting the points labels.
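As a quick check of the label formatting, `scales::comma` can be called on its own. The value below is Kareem Abdul-Jabbar's career regular season total of 38,387 points, and the offsets mirror the `nudge_y` values above; the variable names are only for illustration.

```r
library(scales)

career_pts <- 38387  # Kareem Abdul-Jabbar's career regular season total

# Format with a thousands separator, rounded to whole points
label <- comma(career_pts, accuracy = 1)
label
#> [1] "38,387"

# Where the two labels are anchored along the (flipped) y-axis
name_anchor   <- career_pts - 1000  # right-justified: text ends inside the bar
points_anchor <- career_pts + 1000  # left-justified: text starts past the bar
```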

#### Final Formatting

```r
chart_data %>%
  filter(YearEnd == 2020) %>%
  ggplot(aes(x = -Rank, y = CareerPts, fill = Player)) +
  geom_tile(aes(y = CareerPts / 2, height = CareerPts),
            width = 0.9) +
  geom_text(aes(label = Player), col = "white",
            hjust = "right", nudge_y = -1000) +
  geom_text(aes(label = scales::comma(CareerPts, accuracy = 1)),
            hjust = "left", nudge_y = 1000) +
  # Final formatting
  coord_flip(clip = "off", expand = FALSE) +
  ylab("Career Points") +
  scale_x_discrete("") +
  scale_y_continuous(limits = c(-4000, 49000),
                     labels = scales::comma) +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5, size = 20),
        legend.position = "none",
        panel.grid.minor = element_line(linetype = "dashed"),
        panel.grid.major = element_line(linetype = "dashed"))
```

For the final formatting steps, we add the `clip = "off"` argument to `coord_flip`, which prevents the point labels from getting cut off as in the previous chart. The `expand = FALSE` argument prevents the chart from expanding beyond the specified x and y-limits. The y-axis label is set with `ylab`, and the x-axis (vertical) label is set to blank with `scale_x_discrete`. The y-axis limits are set using `scale_y_continuous`, and the tick labels are given some nicer formatting using `scales::comma`. The final touches are added with `theme_minimal`, which removes the gray chart background, and additional `theme` elements to center the plot title (once one is added, for example with `ggtitle`), remove the legend, and use dashed gridlines.

#### Create Multiple Plots

Now that we have one polished plot created, we need to reproduce that across several years. You can create a visual of this across a few years using `facet_wrap`.

```r
chart_data %>%
  filter(YearEnd >= 2018) %>%
  ggplot(aes(x = -Rank, y = CareerPts, fill = Player)) +
  # ... same layers and formatting as the polished plot above ...
  facet_wrap(~YearEnd)
```

Updating the `filter(YearEnd == 2020)` in the previous code to `YearEnd >= 2018` and adding `+ facet_wrap(~YearEnd)` to the end of that same code produces the following:

You can see that the only difference since 2018 is LeBron James moving from #7 in 2018 to #4 in 2019 and #3 in 2020. These plots are the building blocks for the animation. Once these are all set up, it’s time to bring in the `gganimate` functions.
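The elided layers in the faceted snippet above are the ones from the polished plot. One way to avoid retyping them, offered here as a sketch and not necessarily how the charts in this post were built, is to store them in a list, since `ggplot2` lets you add a list of layers with `+`. The name `race_layers` is hypothetical, and the toy data is made up.

```r
library(ggplot2)

# Hypothetical name: shared layers from the polished plot, stored once
race_layers <- list(
  geom_tile(aes(y = CareerPts / 2, height = CareerPts), width = 0.9),
  geom_text(aes(label = Player), col = "white",
            hjust = "right", nudge_y = -1000),
  geom_text(aes(label = scales::comma(CareerPts, accuracy = 1)),
            hjust = "left", nudge_y = 1000),
  coord_flip(clip = "off", expand = FALSE),
  theme_minimal(),
  theme(legend.position = "none")
)

# Made-up data for illustration only (not real statistics)
toy_data <- data.frame(
  YearEnd   = c(2019, 2019, 2020, 2020),
  Player    = c("Player A", "Player B", "Player A", "Player B"),
  CareerPts = c(30000, 28000, 30000, 31000),
  Rank      = c(1, 2, 2, 1)
)

# The same list can be reused with facet_wrap() or, later, transition_time()
p <- ggplot(toy_data, aes(x = -Rank, y = CareerPts, fill = Player)) +
  race_layers +
  facet_wrap(~YearEnd)
```

Keeping the shared layers in one place means the faceted and animated versions only differ in their last line.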

Now we want to stitch together the plots created in the previous section and animate them using `gganimate`. We replace the `facet_wrap` function with `transition_time(YearEnd)`. Let’s also update the `filter` to go back to 2010 to see how this works across a short but meaningful period of time.

```r
chart_data %>%
  filter(YearEnd >= 2010) %>%
  ggplot(aes(x = -Rank, y = CareerPts, fill = Player)) +
  # ... same layers and formatting as the polished plot above ...
  transition_time(YearEnd) +
  labs(subtitle = "Top 10 Scorers as of {round(frame_time, 0)}") +
  theme(plot.subtitle = element_text(hjust = 0.5, size = 12))
```

The resulting animation should show Kobe Bryant, Dirk Nowitzki, and LeBron James moving up the rankings. A subtitle is also added, which references `frame_time`, a handy variable that you can access when using `transition_time` (try it without wrapping it in `round` to see how `gganimate` iterates through individual frames).
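To get a feel for why the rounding helps, the sketch below approximates frame times as evenly spaced values between the first and last year. This is a base R illustration under that assumption, not `gganimate`'s actual internals, and the frame count of 51 is arbitrary.

```r
# Assumption: frames are evenly spaced between the first and last year
first_year <- 2010
last_year  <- 2020
n_frames   <- 51  # arbitrary count chosen for illustration

frame_times <- seq(first_year, last_year, length.out = n_frames)

# Fractional years are what an unrounded subtitle would display
head(frame_times, 4)

# Rounded values give whole-year labels
subtitles <- paste("Top 10 Scorers as of", round(frame_times, 0))
head(subtitles, 4)
```

Several consecutive frames share the same rounded year, so the subtitle changes once per season while the bars move smoothly in between.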

#### Putting it all together

If everything has worked up to this point, the final steps are to use the full data set, and set some animation parameters so that you can save it in a nice format.

```r
anim <- chart_data %>%
  # Comment out the filter to use the full data set
  # filter(YearEnd >= 2010) %>%
  ggplot(aes(x = -Rank, y = CareerPts, fill = Player)) +
  # ... same layers and formatting as the polished plot above ...
  transition_time(YearEnd) +
  labs(subtitle = "Top 10 Scorers as of {round(frame_time, 0)}") +
  theme(plot.subtitle = element_text(hjust = 0.5, size = 12))

# Render the animation as a gif
animate(anim, renderer = gifski_renderer(),
        end_pause = 50,
        nframes = 5 * (2020 - 1950), fps = 10,
        width = 1080, height = 720, res = 150)
```

The `animate` function allows you to specify the details of the animation. The default `renderer` is the `gifski_renderer`, but you can also choose others like `av_renderer` or `ffmpeg_renderer` if you want to save a video instead of a gif. The `end_pause` parameter repeats the last frame so that the gif doesn’t cycle back to the beginning right away. You set the number of frames and frames per second with `nframes` and `fps` respectively (you may need to tweak these arguments depending on how fast or slow you want the animation). The `width`, `height`, and `res` arguments let you specify device dimensions and resolution, which will determine the size and resolution of the gif in this case. Finally, `anim_save` writes the animation to a file; by default it saves the most recently rendered animation.
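The save step is not shown in the code above, so here is a minimal sketch of how it might look. The helper name `save_race` and the file name `scoring_race.gif` are hypothetical. The sketch also works out how many seconds of frames the `nframes` and `fps` values above imply, without trying to account for how the `end_pause` frames are counted.

```r
# Seconds of frames implied by the settings used above
nframes <- 5 * (2020 - 1950)  # 5 frames for each of the 70 years covered
fps     <- 10
seconds <- nframes / fps
seconds

# Hypothetical helper: render a gganimate object and write it to disk
save_race <- function(anim, file = "scoring_race.gif") {
  rendered <- gganimate::animate(
    anim, renderer = gganimate::gifski_renderer(),
    end_pause = 50, nframes = nframes, fps = fps,
    width = 1080, height = 720, res = 150
  )
  gganimate::anim_save(file, animation = rendered)
}

# Usage, once `anim` has been built as shown above:
# save_race(anim)
```

Passing the rendered object to `anim_save` explicitly avoids relying on whichever animation happened to be rendered last.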

One footnote: I also had a mapping of team colors to make the color scheme a little more meaningful, which I declined to include in this walkthrough (that’s why the colors are different in the gif at the beginning of this post). When all’s said and done, you should have something like this:

Data for these charts was from basketball-reference.com. This is hopefully my first of many posts for R-bloggers.