PowerQuery Puzzle solved with R

[This article was first published on Numbers around us - Medium, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

# 139–140


PQ_139: content file
PQ_140: content file

I noticed that usually Power Query puzzles are about time. Does it really a case in PQ that time related calculations are hard or tricky?


In this puzzle we have to extract informations about people days off for vacation. Table has person and date when that person was on day of. Our goal is to find start and end date of break but with one important factor. We do not count weekends.

Lets check it out.

Load data and libraries


input = read_excel("PQ_Challenge_139.xlsx", range = "A1:B20") %>% janitor::clean_names()
test  = read_excel("PQ_Challenge_139.xlsx", range = "D1:H7") %>% janitor::clean_names()


result = input %>%
  group_by(name) %>% 
  mutate(lagged = lag(vacation_date, 1),
         diff = as.numeric(difftime(vacation_date, lagged, units = "days")),
         diff = ifelse(is.na(diff), 0, diff),
         vacation_no = cumsum(ifelse(diff != 1, 1, 0)),
         wd = wday(vacation_date, week_start = 1)) %>%
  ungroup() %>%
  filter(!wd %in% c(6, 7))  %>%
  group_by(name, vacation_no) %>%
  summarise(vacation_from_date = min(vacation_date),
            vacation_end_date = max(vacation_date),
            number_of_workdays = as.numeric(n())) %>%
  ungroup() %>%


identical(test, result)
# [1] TRUE


And once again we have warehouse management task. We have opening stock level for first day, and withdrawals with dates and amounts. But manager needs to have values for end of a day.

Let help him out!

Load data and libraries


T1 = read_excel("PQ_Challenge_140.xlsx", range = "A1:D10") %>% janitor::clean_names()
T2 = read_excel("PQ_Challenge_140.xlsx", range = "F1:G7") %>% janitor::clean_names()

test = read_excel("PQ_Challenge_140.xlsx", range = "I1:M10") %>% janitor::clean_names()


combined_data = T1 %>%
  left_join(T2, by = "item")%>%
  arrange(item, date, time) %>%
# when I read data date for time column is in 1890s and I need to fix it
  mutate(time1 = date + hours(hour(time))+ minutes(minute(time)) + seconds(second(time))) %>%
  group_by(item) %>%
  mutate(cum_quantity = cumsum(quantity),
         cum_stock = stock - cum_quantity) %>%
  ungroup() %>%
  group_by(item, date) %>%
  mutate(end_of_day = max(time1) == time1, 
         end_of_day_stock = ifelse(end_of_day, cum_stock, NA)) %>%
  ungroup() %>%


identical(combined_data, test)
# [1] TRUE

We’ve seen some nice puzzles considering time changes and it shows little bit out of the box thinking. Feel free to ask, like and share.
Lets be in touch for article about the memoization.

PowerQuery Puzzle solved with R was originally published in Numbers around us on Medium, where people are continuing the conversation by highlighting and responding to this story.

To leave a comment for the author, please follow the link and comment on their blog: Numbers around us - Medium.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)