Solving #AdventOfCode day 5 and 6 with R

December 3, 2018
By Colin Fay

Solving the puzzles of Advent of Code with R.

[Disclaimer] Obviously, this post contains a big spoiler about Advent of Code, as it gives solutions for the days quoted in the title.

Advent of Code

Advent of Code is an Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like. — About Advent of Code

Day five, part one

The first part of the challenge for day 5 starts well: we have to read into R a file with A LOT of letters 🙂

vec <- readLines("input5.txt")
substr(vec, 1, 30)
## [1] "MmMmMxcRrCmMAaEeiIMmopZzPHxEeH"

This character vector is a polymer, and it describes chemical reactions.

The polymer is formed by smaller units which, when triggered, react
with each other such that two adjacent units of the same type and
opposite polarity are destroyed. Units’ types are represented by
letters; units’ polarity is represented by capitalization. For
instance, r and R are units with the same type but opposite polarity,
whereas r and s are entirely different types and do not react.

So the first question was:

How many units remain after fully reacting the polymer you scanned?

Let’s use a regex to do that 🙂 First, we’ll create a regex that matches any adjacent pair of the same letter in opposite cases (like “aA” or “Aa”):

regex <- paste0(paste0(letters, LETTERS), "|", paste0(LETTERS, letters), collapse = "|")
regex
## [1] "aA|Aa|bB|Bb|cC|Cc|dD|Dd|eE|Ee|fF|Ff|gG|Gg|hH|Hh|iI|Ii|jJ|Jj|kK|Kk|lL|Ll|mM|Mm|nN|Nn|oO|Oo|pP|Pp|qQ|Qq|rR|Rr|sS|Ss|tT|Tt|uU|Uu|vV|Vv|wW|Ww|xX|Xx|yY|Yy|zZ|Zz"
# Check that the regex is there
grepl(regex, vec)
## [1] TRUE
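
As a quick illustration with two small strings from the puzzle statement, the regex only matches a letter sitting next to its own opposite-case counterpart:

grepl(regex, c("abBA", "abAB"))
## [1]  TRUE FALSE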

So, we’ll need to run a loop that removes this regex until there is nothing left to remove. The idea is to record the number of characters before the gsub(), perform the substitution, and then check whether the number of characters has changed (i.e., if it is the same, there is nothing left to gsub()).

continue <- TRUE
while (continue) {
  old_size <- nchar(vec)
  vec <- gsub(regex, "", vec) 
  continue <- nchar(vec) != old_size
}

# Check that the regex is not there anymore
grepl(regex, vec)
## [1] FALSE
nchar(vec)
## [1] 9288

Day five, part two

For part 2, we first need to remove, one by one, each pair of unit types (for example “a”/“A”), and react the polymer without that pair. Then, we need to find:

What is the length of the shortest polymer you can produce by removing
all units of exactly one type and fully reacting the result?

Let’s start by putting our last code in a function:

react <- function(vec, 
                   regex =  paste0(paste0(letters, LETTERS), "|", 
                                  paste0(LETTERS, letters), collapse = "|")){
  continue <- TRUE
  while (continue) {
    old_size <- nchar(vec)
    vec <- gsub(regex, "", vec) 
    continue <- nchar(vec) != old_size
  }
  vec
}

Let’s try with the examples from the website:

nchar(react("dbcCCBcCcD"))
## [1] 6
nchar(react("daAcCaCAcCcaDA"))
## [1] 8
nchar(react("dabAaBAaDA"))
## [1] 4
nchar(react("abAcCaCBAcCcaA"))
## [1] 6
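
And, for completeness, the full example polymer from the puzzle statement reduces to the expected 10 units:

nchar(react("dabAcCaCBAcCcaDA"))
## [1] 10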

Now we’ll combine pattern removal and reaction:

clean_and_react <- function(vec, pattern){
  react( gsub(pattern, "", vec) )
}

clean_and_react("dabAcCaCBAcCcaDA", "a|A") 
## [1] "dbCBcD"

Then, a function that returns a tibble with the removed pattern and the resulting number of characters:

library(tidyverse)
clean_and_react_and_count <- function(vec, pattern){
  tibble(
    pattern = pattern, 
    nchars = nchar(clean_and_react(vec, pattern))
  )
}
clean_and_react_and_count("dabAcCaCBAcCcaDA", "a|A") 
## # A tibble: 1 x 2
##   pattern nchars
##   <chr>    <int>
## 1 a|A          6

As this will take some time, let’s use {furrr} to parallelise the computation:

library(furrr)
## Loading required package: future
plan(multiprocess)
res <- future_map_dfr(paste0(LETTERS,"|", letters), ~ clean_and_react_and_count(vec, .x))
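
Note that since {furrr} mirrors the {purrr} API, the same call runs sequentially with purrr::map_dfr() if you’d rather skip the parallelism (a sketch, same result):

# Sequential version with purrr (already loaded with the tidyverse)
res <- map_dfr(paste0(LETTERS, "|", letters), ~ clean_and_react_and_count(vec, .x))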

What’s the best solution?

top_n(res, -1)
## Selecting by nchars

## # A tibble: 1 x 2
##   pattern nchars
##   <chr>    <int>
## 1 F|f       5844

Day six, part one

Now we’re working with distance calculations. We’ve been provided a bunch of coordinates. Once we have put these on a grid, we need to fill the “empty cells” with the reference to the closest coordinate from our input, by computing the shortest Manhattan distance (the distance between two points being |x1 − x2| + |y1 − y2|).

day6 <- read_csv("input6.txt", col_names = c("V1", "V2"))
## Parsed with column specification:
## cols(
##   V1 = col_integer(),
##   V2 = col_integer()
## )
day6$id <- as.character(1:50)
day6
## # A tibble: 50 x 3
##       V1    V2 id   
##    <int> <int> <chr>
##  1   315   342 1    
##  2    59   106 2    
##  3    44   207 3    
##  4    52    81 4    
##  5   139   207 5    
##  6    93   135 6    
##  7   152   187 7    
##  8   271    47 8    
##  9   223   342 9    
## 10    50   255 10   
## # ... with 40 more rows

First of all, let’s build a table of all the “empty cells” we mentioned before:

all_comb <- expand.grid(
  min(day6$V1):max(day6$V1), 
  min(day6$V2):max(day6$V2)
) %>% as_tibble()
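
A quick note: because the inputs to expand.grid() are unnamed, the resulting columns get the default names Var1 and Var2, which is why those names show up later in the code:

names(all_comb)
## [1] "Var1" "Var2"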

The Manhattan distance function (computed one axis at a time; the two axes are summed where it is used):

manat_dist <- function(x, y){
  abs(x - y)
}
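
The full Manhattan distance between two points is then the sum of this helper over both axes; for instance, between the first two coordinates of the input:

# |315 - 59| + |342 - 106| = 256 + 236
manat_dist(315, 59) + manat_dist(342, 106)
## [1] 492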

A function to find the closest ID, given an x and a y:

closest_id <- function(x, y, df = day6){
  df %>%
    mutate(dist = manat_dist(x, V1) + manat_dist(y, V2) ) %>%
    top_n(-1, dist) %>%
    pull(id)
}
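
As a quick sanity check (assuming the input coordinates are all distinct), the cell sitting exactly on the first coordinate should be closest to id 1:

closest_id(315, 342)
## [1] "1"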

# Apply it to all our combinations; cells equidistant from
# several coordinates get NA
cl <- future_pmap_chr(all_comb, function(...){
  x <- closest_id(..1, ..2)
  if (length(x) > 1) {
    NA
  } else {
    x
  }
}) 

And now, time to answer the puzzle:

What is the size of the largest area that isn’t infinite?

An infinite area is defined by the fact that at least one of its cells is on the edge of the grid (i.e., its coordinates equal the min or max of V1 or V2).

# Add the closest id and the grid borders as columns,
# so that we can check them positionally below
all_comb <- all_comb %>% 
  mutate(
    closest = cl, 
    max1 = max(Var1),
    max2 = max(Var2),
    min1 = min(Var1),
    min2 = min(Var2)
  )

# Check whether each cell is on the border of the grid
# (positionally: ..1 = Var1, ..2 = Var2, ..4 = max1, ..5 = max2, ..6 = min1, ..7 = min2)
is_border <- pmap_lgl(all_comb, function(...){
  ..1 == ..4 | ..1 == ..6 | ..2 == ..5 | ..2 == ..7
})

all_comb %>% 
  mutate(is_border = is_border) %>%
  group_by(closest) %>%
  mutate(is_bord = any(is_border)) %>%
  filter(!is_bord) %>% 
  count(closest, sort = TRUE) %>% 
  ungroup() %>%
  top_n(1)
## Selecting by n

## # A tibble: 1 x 2
##   closest     n
##   <chr>   <int>
## 1 13       4290

Day six, part two

Now, we need to compute the distance from each point on the grid to each coordinate, and to answer this question:

What is the size of the region containing all locations which have a
total distance to all given coordinates of less than 10000?

In other words, for each cell we compute the distance to every coordinate, and keep only the cells whose total Manhattan distance is less than 10,000:

# Create a function to compute all distances
all_dist <- function(x, y, df = day6){
  df %>%
    mutate(dist = manat_dist(x, V1) + manat_dist(y, V2) )
}
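
For a single cell, the total distance is just the sum of the dist column it returns; for instance (a sketch, the actual value depends on the full input):

# Total Manhattan distance from cell (315, 342) to all 50 coordinates
sum(all_dist(315, 342)$dist)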

# And use it on the all_comb table
future_pmap(all_comb, function(...){
  all_dist(..1, ..2)
}) %>% 
  map_int(~sum(.x$dist)) %>%
  keep(~ .x < 10000) %>% 
  length()
## [1] 37318
