Tidyer BLS data with the blscarpeR package

[This article was first published on Data Science Riot!, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The recent release of the blscrapeR package brings the “tidyverse” into the fold. Inspired by my recent collaboration with Kyle Walker on his excellent tidycensus package, blscrapeR has been optimized for use within the tidyverse as of the current version 3.0.0.

New things you’ll notice right away include:

  • All data now returned as tibbles.

  • dplyr and purrr are now imported packages, along with magrittr and ggplot, which were imported from the start.

  • No need to call any packages other than tidyverse and blscrapeR.

Major internal changes

  • Switched from base R to dplyr in instances where performance could be increased.

  • Standard apply functions replaced with purrr map() functions where performance could be increased.


The BLS: More than Unemployment

The American Time Use Survey is one of the BLS’ more interesting data sets. Below is an API query that compares the time Americans spend watching TV on a daily basis compared to the time spent socializing and communicating.

It should be noted, some familiarity with BLS series id numbers is required here. The BLS Data Finder is a nice tool to find series id numbers.

tbl <- bls_api(c("TUU10101AA01014236", "TUU10101AA01013951")) %>%
    spread(seriesID, value) %>%
    dateCast() %>%
    rename(watching_tv = TUU10101AA01014236, socializing_communicating = TUU10101AA01013951)
## # A tibble: 3 x 7
##    year    period periodName footnotes socializing_communicating watching_tv       date
## *                                               
## 1  2014                          0.71        2.82 2014-01-01
## 2  2015                          0.68        2.78 2015-01-01
## 3  2016                          0.65        2.73 2016-01-01

Unemployment Rates

The main attraction of the BLS are the monthly employment and unemployment data. Below is an API query and plot of three of the major BLS unemployment rates.

  • U-3: The “official unemployment rate.” Total unemployed, as a percent of the civilian labor force.

  • U-5: Total unemployed, plus discouraged workers, plus all other marginally attached workers, as a percent of the civilian labor force plus all marginally attached workers.

  • U-6: Total unemployed, plus all marginally attached workers, plus total employed part time for economic reasons, as a percent of the civilian labor force plus all marginally attached workers.

tbl <- bls_api(c("LNS14000000", "LNS13327708", "LNS13327709"), registrationKey = "BLS_KEY") %>%
    spread(seriesID, value) %>%
    dateCast() %>%
    rename(u3_unemployment = LNS14000000, u5_unemployment = LNS13327708, u6_unemployment = LNS13327709)

ggplot(data = tbl, aes(x = date)) + 
    geom_line(aes(y = u3_unemployment, color = "U-3 Unemployment")) +
    geom_line(aes(y = u5_unemployment, color = "U-5 Unemployment")) + 
    geom_line(aes(y = u6_unemployment, color = "U-6 Unemployment")) + 
    labs(title = "Monthly Unemployment Rates") + ylab("value") +
    theme(legend.position="top", plot.title = element_text(hjust = 0.5)) 

For more information and examples, please see the package vignettes.

To leave a comment for the author, please follow the link and comment on their blog: Data Science Riot!.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)