Where the Beer is

[This article was first published on R – NYC Data Science Academy Blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Since prohibition ended in 1933, the American brewing industry has undergone two massive transformations. The first saw hundreds of regional breweries from across the country, often brewing beer unique their respective regions, become consolidated into a handful of behemoths. During the period of greatest consolidation in the early 1980s the ten largest breweries produced almost 95% of American beer. This naturally led to fewer options and, in the opinion of many, lower quality beer. Fortunately for beer lovers, a renaissance has since swept the American brewing industry. There are now thousands of craft breweries in all fifty states that offer a wide range of styles and are constantly experimenting (Banana Kölsch anyone?).

As someone who enjoys visiting small breweries, I was interested by how this transformation had unfolded both through time and by state. I was able to find data collected by the Alcohol and Tobacco Tax and Trade Bureau that tracked both the number of brewery licenses and the amount of beer produced over the last few decades.  With R and Shiny, I built a dashboard to help visualize this data and hopefully inspire users to support their local brewer.

In addition to the data from the Alcohol and Tobacco Tax and Trade Bureau, I drew upon state population data (used to calculate ‘per resident’ statistics) from the Federal Reserve Bank of St. Louis. Lastly, the ‘Mean ABV’ and ‘Mean Rating’ data comes from a Kaggle.com data set that was originally scraped from BeerAdvocate. Where as the other data comes from highly reputable sources, conclusions based on this data should be taken with a grain of salt. There may be unknown biases, both innocent and nefarious, within the data.

When exploring the data, it is important to consider how the presence of large brewing facilities affects some of the metrics. For example, in 2018 over 90% of beer brewed in Oklahoma was consumed on brewery premises. Whereas less than 1% of corresponding beer was consumed this way in neighboring Texas. This is undoubtably due to the presence of large Anheuser-Busch production facilities in the state which produce high volumes of beer for national sale and dwarf production at small breweries with onsite tap rooms.

I especially enjoyed seeing the explosion of craft breweries reflected in the data. While the Alcohol and Tobacco Tax and Trade Bureau does not collect this data explicitly, various metrics such as ‘barrels of beer consumed on a brewery premise’ serve as good proxies. It is also interesting look at trends in specific states and investigate how they correspond with regulatory changes.


Click here to view the shiny app and explore the data. Or click here to visit my GitHub and check out the code. 


To leave a comment for the author, please follow the link and comment on their blog: R – NYC Data Science Academy Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)