Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

# 141–142

Puzzles:

PQ_141: content file
PQ_142: content file

Last time I noted that usually PQ Puzzles are focused on times and dates. And… it confirms this week. Lets count it.

PQ_141

Rolling averages and other rolling calculations are very useful concepts because it can show us main trend cleaned from lower time granulation anomalies. In R we can play with comparison of dates to dates with lagged measurements. It is not hard but can complex task. Every complex task has much bigger chance for mistakes. Thats why I use dedicated package for out today puzzle. We have measurements for months and all we need is to calculate mean average from last 3 and 5 months.

```library(tidyverse)
library(slider)

input = read_excel("Power Query/PQ_Challenge_141.xlsx", range = "A1:C35")
test  = read_excel("Power Query/PQ_Challenge_141.xlsx", range = "E1:I35")```

Transformation

```result = input %>%
group_by(Month) %>%
mutate(
`3 Year MV` = slide_dbl(Defects, mean,
.after = -1,
.before = 3,
.complete = TRUE) %>% round(0),
`5 Year MV` = slide_dbl(Defects, mean,
.after = -1,
.before = 5,
.complete = TRUE) %>% round(0)
) %>%
ungroup()```

Validation

```identical(result, test)
#> [1] TRUE```

PQ_142

In this puzzle we calculate something that HR called “head count”, which means number of people hired or working in certain period of time. In our task we have to create head count for each quarter of hour from 9AM to 9PM, having start and end time for 3 people. It can be tricky in some ways but I will explain it later.

```library(tidyverse)
library(lubridate)

input = read_excel("Power Query/PQ_Challenge_142.xlsx", range = "A1:C4")
%>% janitor::clean_names()
test = read_excel("Power Query/PQ_Challenge_142.xlsx", range = "E1:F49")```

Transformation

```input <- input %>%
mutate(interval = interval(ymd_hms(start_time), ymd_hms(end_time)))

quarter_table <- tibble(
interval = interval(
seq(ymd_hms("1899-12-31 09:00:00"),
ymd_hms("1899-12-31 20:45:00"),
by = "15 mins"),
seq(ymd_hms("1899-12-31 09:14:59"),
ymd_hms("1899-12-31 20:59:59"),
by = "15 mins")
)
)
mutate(
Count = map_dbl(interval, ~sum(int_overlaps(.x, input\$interval))),
Time = paste(format(int_start(interval), "%I:%M:%S %p"),
format(int_end(interval), "%I:%M:%S %p"),
sep = " - ")
) %>%
select(Time, Count)```

Validation

You can validate it by your eyes, because there is one issue. If we want to have disjoint time intervals we need to finish previous one second before the next started. But if we do it that way we have problem with person who ends its job at 19:00:00 because it would look like working only one second in next interval. That is why I cannot validate it with given results, but except one line values are exactly the same.

Thanks for your engagement, and let me know if you have any comments. Stay tuned, because this Thursday article about memoization will be online.

PowerQuery Puzzle solved with R was originally published in Numbers around us on Medium, where people are continuing the conversation by highlighting and responding to this story.