Solving #AdventOfCode day 5 and 6 with R
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Solving the puzzles of Advent of Code with R.
[Disclaimer] Obviously, this post contains a big spoiler about Advent of Code, as it gives solutions for solving days quoted in the title.
Advent of Code
Advent of Code is an Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like. About Advent of Code
Day five, part one
The first part of the challenge for day 3 starts well: we have to read in R a file with A LOT of letters 🙂
vec <- readLines("input5.txt") substr(vec, 1, 30) ## [1] "MmMmMxcRrCmMAaEeiIMmopZzPHxEeH"
This character vector is a polymer, and it describes chemical reactions.
The polymer is formed by smaller units which, when triggered, react with each other such that two adjacent units of the same type and opposite polarity are destroyed. Units’ types are represented by letters; units’ polarity is represented by capitalization. For instance, r and R are units with the same type but opposite polarity, whereas r and s are entirely different types and do not react.
So the first question was:
How many units remain after fully reacting the polymer you scanned?
Let’s use a regex to do that 🙂 First, we’ll create a regex that describes any possible combination of upper/lower letters:
regex <- paste0(paste0(letters, LETTERS), "|", paste0(LETTERS, letters), collapse = "|") regex ## [1] "aA|Aa|bB|Bb|cC|Cc|dD|Dd|eE|Ee|fF|Ff|gG|Gg|hH|Hh|iI|Ii|jJ|Jj|kK|Kk|lL|Ll|mM|Mm|nN|Nn|oO|Oo|pP|Pp|qQ|Qq|rR|Rr|sS|Ss|tT|Tt|uU|Uu|vV|Vv|wW|Ww|xX|Xx|yY|Yy|zZ|Zz" # Check that the regex is there grepl(regex, vec) ## [1] TRUE
So, we’ll need to run a loop that removes this regex until there is
nothing to remove anymore. The idea is to get the number of characters
before the gsub
, perform the sub
, and when the removal is done,
count if the number of characters from before the subtraction is
different from the number of characters after (i.e, if it is the same,
there is nothing to gsub
anymore).
continue <- TRUE while (continue) { old_size <- nchar(vec) vec <- gsub(regex, "", vec) continue <- nchar(vec) != old_size } # Check that the regex is not there anymore grepl(regex, vec) ## [1] FALSE nchar(vec) ## [1] 9288
Day five, part two
With part 2, we need to try to first remove, one by one, the couples of
units
(for example “aA”), and react the polymer without each couple.
Then, we need to find:
What is the length of the shortest polymer you can produce by removing all units of exactly one type and fully reacting the result?
Let’s start by putting our last code in a function:
react <- function(vec, regex = paste0(paste0(letters, LETTERS), "|", paste0(LETTERS, letters), collapse = "|")){ continue <- TRUE while (continue) { old_size <- nchar(vec) vec <- gsub(regex, "", vec) continue <- nchar(vec) != old_size } vec }
Let’s try with the examples from the website:
nchar(react("dbcCCBcCcD")) ## [1] 6 nchar(react("daAcCaCAcCcaDA")) ## [1] 8 nchar(react("dabAaBAaDA")) ## [1] 4 nchar(react("abAcCaCBAcCcaA")) ## [1] 6
Now we’ll combine pattern removal and reaction:
clean_and_react <- function(vec, pattern){ react( gsub(pattern, "", vec) ) } clean_and_react("dabAcCaCBAcCcaDA", "a|A") ## [1] "dbCBcD"
Then, a function to get a tibble with: extracted pattern, and number of characters:
library(tidyverse) clean_and_react_and_count <- function(vec, pattern){ tibble( pattern = pattern, nchars = nchar(clean_and_react(vec, pattern)) ) } clean_and_react_and_count("dabAcCaCBAcCcaDA", "a|A") ## # A tibble: 1 x 2 ## pattern nchars ## <chr> <int> ## 1 a|A 6
As it should take some time, let’s use {furrr}
to do our calculation:
library(furrr) ## Loading required package: future plan(multiprocess) res <- future_map_dfr(paste0(LETTERS,"|", letters), ~ clean_and_react_and_count(vec, .x))
What’s the best solution?
top_n(res, -1) ## Selecting by nchars ## # A tibble: 1 x 2 ## pattern nchars ## <chr> <int> ## 1 F|f 5844
Day six, part one
Now we’re working with distance calculation. We’ve been provided a bunch of coordinates. Once we have put these on a grid, we need to fill the “empty cells” with the reference to the closest coordinate from our input, by calculating the shortest manhattan distance.
day6 <- read_csv("input6.txt", col_names = c("V1", "V2")) ## Parsed with column specification: ## cols( ## V1 = col_integer(), ## V2 = col_integer() ## ) day6$id <- as.character(1:50) day6 ## # A tibble: 50 x 3 ## V1 V2 id ## <int> <int> <chr> ## 1 315 342 1 ## 2 59 106 2 ## 3 44 207 3 ## 4 52 81 4 ## 5 139 207 5 ## 6 93 135 6 ## 7 152 187 7 ## 8 271 47 8 ## 9 223 342 9 ## 10 50 255 10 ## # ... with 40 more rows
First of all, let’s get a list of all the “empty cells” we mentioned before:
all_comb <- expand.grid( min(day6$V1):max(day6$V1), min(day6$V2):max(day6$V2) ) %>% as_tibble()
The Manhattan distance function:
manat_dist <- function(x, y){ abs(x - y) }
A function to find the closer ID, given an x and a y:
closest_id <- function(x, y, df = day6){ df %>% mutate(dist = manat_dist(x, V1) + manat_dist(y, V2) ) %>% top_n(-1, dist) %>% pull(id) } # Apply it on all our combination cl <- future_pmap_chr(all_comb, function(...){ x <- closest_id(..1, ..2) if (length(x) > 1) { NA } else { x } })
And now, time to answer the puzzle:
What is the size of the largest area that isn’t infinite?
An infinite area is defined by the fact that at least one of its element is on the edge of the grid (hence equal to the min or max of V1 or V2).
all_comb <- all_comb %>% mutate( closest = cl, max1 = max(Var1), max2 = max(Var2), min1 = min(Var1), min2 = min(Var2), ) # Get if each row are on the border is_border <- pmap_lgl(all_comb, function(...){ ..1 == ..4 | ..1 == ..6 | ..2 == ..5 | ..2 == ..7 }) all_comb %>% mutate(is_border = is_border) %>% group_by(closest) %>% mutate(is_bord = any(is_border)) %>% filter(!is_bord) %>% count(closest, sort = TRUE) %>% ungroup() %>% top_n(1) ## Selecting by n ## # A tibble: 1 x 2 ## closest n ## <chr> <int> ## 1 13 4290
Day six, part two
Now, we need to compute the distance of each point on the grid to each coordinates, and to answer this question:
What is the size of the region containing all locations which have a total distance to all given coordinates of less than 10000?
In other word, we get the distance to each coordinate on each cell, and keep only the one with a total Manhattan distance which is less than 1000:
# Create a function to compute all distances all_dist <- function(x, y, df = day6){ df %>% mutate(dist = manat_dist(x, V1) + manat_dist(y, V2) ) } # And use it on the all_comb table future_pmap(all_comb, function(...){ all_dist(..1, ..2) }) %>% map_int(~sum(.x$dist)) %>% keep(~ .x < 10000) %>% length() ## [1] 37318
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.