Solving #AdventOfCode day 3 and 4 with R

December 3, 2018
By

(This article was first published on Colin Fay, and kindly contributed to R-bloggers)

Solving the puzzles of Advent of Code with
R.

[Disclaimer] Obviously, this post contains a big spoiler about Advent
of Code, as it gives solutions for solving day 1 and 2.

Advent of Code

Advent of Code is an Advent calendar of small programming puzzles for
a variety of skill sets and skill levels that can be solved in any
programming language you like. About Advent of
Code

Day three, part one

The first part of the challenge for day 3 starts well: we have to read
in R this
file:

readLines("input3.txt", n = 6)
## [1] "#1 @ 483,830: 24x18" "#2 @ 370,498: 21x17" "#3 @ 403,823: 25x21"
## [4] "#4 @ 619,976: 20x15" "#5 @ 123,385: 15x26" "#6 @ 484,592: 11x19"

It’s a description of “spots”, claimed by the elves, inside a big square
piece of fabric (supposed to be for making Santa’s clothes (remember
we’re coding with Christmas theme 😉 ).

It reads:

"#1 @ 483,830: 24x18" 
Id / distance from left / distance from top / width / height 

In other words, the “claim” with ID one is for a spot that starts at 483
inches from the left of the fabric and at 830 inches from the top. This
piece is 24 inches wide and 18 inches tall.

How many square inches of fabric are within two or more claims?

Our challenge is to determine how many squares of fabric are claimed
more than once. In over words, if we take all the squares from all the
elves, how many square inches are claimed by more than one elf?

Let’s start by reading the file:

library(tidyverse)
df <- read.delim("input3.txt", header = FALSE) %>%
  as.tibble() %>%
  print() %>%
  # Clean the column 
  mutate(V1 = str_replace_all(V1, "[#@,:x]", " "),
         V1 = str_replace_all(V1, " {2,}", " "), 
         V1 = str_replace_all(V1, "^ ", "")) %>%
  # Separate the column
  separate(V1, into = c("id", "left", "top", "wide", "tall")) %>% 
  # Make sure everything is numeric
  mutate_all(as.numeric) %>%
  # Create the right and down element
  mutate(
    right = left + wide, 
    down = top + tall 
  ) 
## # A tibble: 1,375 x 1
##    V1                  
##                   
##  1 #1 @ 483,830: 24x18 
##  2 #2 @ 370,498: 21x17 
##  3 #3 @ 403,823: 25x21 
##  4 #4 @ 619,976: 20x15 
##  5 #5 @ 123,385: 15x26 
##  6 #6 @ 484,592: 11x19 
##  7 #7 @ 394,960: 28x14 
##  8 #8 @ 730,592: 26x20 
##  9 #9 @ 975,963: 16x26 
## 10 #10 @ 452,496: 18x18
## # ... with 1,365 more rows
cat("Cleaned version:\n")
## Cleaned version:
df
## # A tibble: 1,375 x 7
##       id  left   top  wide  tall right  down
##          
##  1     1   483   830    24    18   507   848
##  2     2   370   498    21    17   391   515
##  3     3   403   823    25    21   428   844
##  4     4   619   976    20    15   639   991
##  5     5   123   385    15    26   138   411
##  6     6   484   592    11    19   495   611
##  7     7   394   960    28    14   422   974
##  8     8   730   592    26    20   756   612
##  9     9   975   963    16    26   991   989
## 10    10   452   496    18    18   470   514
## # ... with 1,365 more rows

So now that we have that, how can we get all the squares claimed
more than once? To do this, we’ll go row-wise on this data.frame. With
each row, we’ll start by create vectors which correspond to all the
claimed squares. To do that, we’ll create two vectors: row and columns,
use expand.grid to create all the possible combination (i.e all the
locations on the fabric) for this claim.

For example, imagine a claim that goes from square 1 to 2 and from line
3 to 4, we’ll create a vector 1:2 and 3:4, and when combining that we’ll
have the id for all claimed squares: 1×3, 1×4, 2×3 and 2×4. We’ll do
this for all the claims, then count the number of occurrence for each
newly created id.

res <- map_df(
  1:nrow(df), 
  function(x){
    res <- list(
     col = (df$left[x] + 1):(df$left[x]+df$wide[x]),
     row = (df$top[x] + 1):(df$top[x]+df$tall[x])
    )
    expand.grid(res$col, res$row) %>%
      mutate(claim_id = df$id[x]) %>%
      unite(col = "idx", c("Var1", "Var2"), sep = "x")
  }
) %>% as.tibble()
res
## # A tibble: 509,129 x 2
##    idx     claim_id
##          
##  1 484x831        1
##  2 485x831        1
##  3 486x831        1
##  4 487x831        1
##  5 488x831        1
##  6 489x831        1
##  7 490x831        1
##  8 491x831        1
##  9 492x831        1
## 10 493x831        1
## # ... with 509,119 more rows
res %>% 
  count(idx) %>%
  filter(n > 1) %>%
  count()  
## # A tibble: 1 x 1
##       nn
##    
## 1 119551

Day three, part two

What is the ID of the only claim that doesn’t overlap?

The second part is a little bit different: we need to identify the claim
which doesn’t overlap with any other claim. In other words, the claim
that contains only square inches which are claimed once.

So how to do that ?

We’ll start from our previous table and count the number of square by
claim:

n_of_claims_by_id <- res %>% 
  count(claim_id)
n_of_claims_by_id
## # A tibble: 1,375 x 2
##    claim_id     n
##        
##  1        1   432
##  2        2   357
##  3        3   525
##  4        4   300
##  5        5   390
##  6        6   209
##  7        7   392
##  8        8   520
##  9        9   416
## 10       10   324
## # ... with 1,365 more rows

Now we’ll count the number of time each square is claimed:

n_of_claims_by_inch <- res %>%
  count(idx)
n_of_claims_by_inch
## # A tibble: 350,158 x 2
##    idx          n
##        
##  1 1000x793     1
##  2 1000x794     1
##  3 1000x795     1
##  4 1000x796     1
##  5 1000x797     1
##  6 1000x798     1
##  7 1000x799     1
##  8 1000x800     1
##  9 1000x801     1
## 10 1000x802     1
## # ... with 350,148 more rows

Let’s add this info to our result object:

res <- res %>%
  left_join(n_of_claims_by_inch, by = "idx") 
res
## # A tibble: 509,129 x 3
##    idx     claim_id     n
##           
##  1 484x831        1     3
##  2 485x831        1     3
##  3 486x831        1     2
##  4 487x831        1     2
##  5 488x831        1     2
##  6 489x831        1     2
##  7 490x831        1     2
##  8 491x831        1     2
##  9 492x831        1     1
## 10 493x831        1     1
## # ... with 509,119 more rows

Here, we can read that square at 484x831 has been claimed three times
(n), and notably by the ID #1. So let’s filter on the square that
has only been claimed once:

res <- res %>%
  filter(n == 1)
res
## # A tibble: 230,607 x 3
##    idx     claim_id     n
##           
##  1 492x831        1     1
##  2 493x831        1     1
##  3 494x831        1     1
##  4 495x831        1     1
##  5 496x831        1     1
##  6 497x831        1     1
##  7 498x831        1     1
##  8 499x831        1     1
##  9 500x831        1     1
## 10 501x831        1     1
## # ... with 230,597 more rows

Now for the final trick: a claim with no overlap would have, in our
result, the same number of occurrence in the claim_id column from
res and in the idx from n_of_claims_by_inch

res %>%
  count(claim_id) %>%
  left_join(n_of_claims_by_id, by = "claim_id") %>%
  filter(nn == n)
## # A tibble: 1 x 3
##   claim_id    nn     n
##        
## 1     1124   299   299

So here it is: the claim_id 1124 is the one we are looking for!

Day four, part one

This one was also fun to do: we have another weird data file, that we
need to parse. This data is said to record the date and time, all days
between 00 and 00:59 at a manufacturing lab, when the guard is awake of
asleep.

library(lubridate)
## 
## Attaching package: 'lubridate'

## The following object is masked from 'package:base':
## 
##     date
day4 <- read.delim("input4.txt", header = FALSE) %>% as.tibble()
day4
## # A tibble: 1,129 x 1
##    V1                                         
##                                          
##  1 [1518-08-21 00:39] wakes up                
##  2 [1518-08-11 00:56] wakes up                
##  3 [1518-10-10 23:52] Guard #2707 begins shift
##  4 [1518-10-14 00:55] wakes up                
##  5 [1518-07-19 00:42] wakes up                
##  6 [1518-06-16 00:41] falls asleep            
##  7 [1518-03-24 00:57] wakes up                
##  8 [1518-05-06 00:39] wakes up                
##  9 [1518-08-27 00:00] Guard #3323 begins shift
## 10 [1518-03-22 23:50] Guard #2753 begins shift
## # ... with 1,119 more rows
time_minute <- day4 %>%
  # Create a hour / event column
  separate(V1, into = c("hour", "event"), sep = "] ") %>% 
  mutate(hour = gsub("\\[", "", hour), 
         hour = ymd_hm(hour)) %>%
  # Order the hour column
  arrange(hour) %>%
  # Create a column with the guard ID id present
  mutate(guard_id = str_extract(event, "#([0-9]*)"), 
         guard_id = str_replace(guard_id, "#", ""), 
         no_id = is.na(guard_id), 
         record = 1:n()
         ) %>%
  # Add the guard ID everywhere
  fill(guard_id) %>%
  # Filter to keep only the event rows
  filter(no_id) %>% 
  select(-no_id) %>%
  # Spread for event and hour
  spread(event, hour) %>%
  # Fill the wakes up column so that there is no NA
  fill(`wakes up`, .direction = "up") %>%
  select(-record) %>%
  # Remove the line with NA in the fall asleep column
  drop_na(`falls asleep`) %>%
  # Get one minutes less, as the wakes up minute is considered to be an
  # awaken minute
  mutate(`wakes up` = `wakes up` - minutes(1),
  # Create a list column with the sequence of minutes the guard is asleep
         minutes_asleep = map2(`falls asleep`, `wakes up`, 
                               seq, by = "min"),
  # Compute how many time the guard is asleep
         sleepy_time = `wakes up` - `falls asleep`) 
time_minute
## # A tibble: 420 x 5
##    guard_id `falls asleep`      `wakes up`          minutes_asleep
##                                            
##  1 1213     1518-02-16 00:08:00 1518-02-16 00:22:00    
##  2 1213     1518-02-16 00:44:00 1518-02-16 00:50:00     
##  3 1213     1518-02-16 00:56:00 1518-02-16 00:58:00     
##  4 1213     1518-02-18 00:13:00 1518-02-18 00:24:00    
##  5 1213     1518-03-29 00:02:00 1518-03-29 00:10:00     
##  6 1213     1518-03-29 00:16:00 1518-03-29 00:47:00    
##  7 1213     1518-04-27 00:03:00 1518-04-27 00:19:00    
##  8 1213     1518-04-27 00:24:00 1518-04-27 00:41:00    
##  9 1213     1518-04-27 00:47:00 1518-04-27 00:58:00    
## 10 1213     1518-05-06 00:02:00 1518-05-06 00:38:00    
## # ... with 410 more rows, and 1 more variable: sleepy_time 

Find the guard that has the most minutes asleep. What minute does that
guard spend asleep the most?

So, our first task is to know which guard is asleep the most and on
which minute of the hour. Let’s first compute who this guard is:

which_guard <- time_minute %>%
  group_by(guard_id) %>%
  summarise(s = sum(sleepy_time)) %>%
  top_n(1, s) 

Now we know who he is, let’s see at which minute of the hour he’s the
most asleep:

time_minute %>%
  filter(guard_id == which_guard$guard_id) %>%
  unnest(minutes_asleep) %>%
  print() %>%
  # Extract the minute part
  mutate(minutes_asleep = minute(minutes_asleep)) %>%
  count(minutes_asleep, sort = TRUE)%>%
  top_n(1)
## # A tibble: 502 x 5
##    guard_id `falls asleep`      `wakes up`          sleepy_time
##                                    

Day four, part one

The second part was quite straightforward once you’ve got the previous
answer. > Of all guards, which guard is most frequently asleep on the
same minute?

time_minute %>%
  unnest(minutes_asleep) %>%
  mutate(minutes_asleep = minute(minutes_asleep)) %>%
  count(guard_id, minutes_asleep, sort = TRUE) %>% 
  top_n(1)
## Selecting by n

## # A tibble: 1 x 3
##   guard_id minutes_asleep     n
##                 
## 1 1213                 19    16

To leave a comment for the author, please follow the link and comment on their blog: Colin Fay.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)