# Solving #AdventOfCode day 5 and 6 with R

**Colin Fay**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Solving the puzzles of Advent of Code with

R.

[Disclaimer] Obviously, this post contains a big spoiler about Advent

of Code, as it gives solutions for solving days quoted in the title.

## Advent of Code

Advent of Code is an Advent calendar of small programming puzzles for

a variety of skill sets and skill levels that can be solved in any

programming language you like. About Advent of

Code

## Day five, part one

The first part of the challenge for day 3 starts well: we have to read

in R a file with A LOT of letters 🙂

```
vec <- readLines("input5.txt")
substr(vec, 1, 30)
```

```
## [1] "MmMmMxcRrCmMAaEeiIMmopZzPHxEeH"
```

This character vector is a polymer, and it describes chemical reactions.

The polymer is formed by smaller units which, when triggered, react

with each other such that two adjacent units of the same type and

opposite polarity are destroyed. Units’ types are represented by

letters; units’ polarity is represented by capitalization. For

instance, r and R are units with the same type but opposite polarity,

whereas r and s are entirely different types and do not react.

So the first question was:

How many units remain after fully reacting the polymer you scanned?

Let’s use a regex to do that 🙂 First, we’ll create a regex that

describes any possible combination of upper/lower

letters:

```
regex <- paste0(paste0(letters, LETTERS), "|", paste0(LETTERS, letters), collapse = "|")
regex
```

```
## [1] "aA|Aa|bB|Bb|cC|Cc|dD|Dd|eE|Ee|fF|Ff|gG|Gg|hH|Hh|iI|Ii|jJ|Jj|kK|Kk|lL|Ll|mM|Mm|nN|Nn|oO|Oo|pP|Pp|qQ|Qq|rR|Rr|sS|Ss|tT|Tt|uU|Uu|vV|Vv|wW|Ww|xX|Xx|yY|Yy|zZ|Zz"
```

```
# Check that the regex is there
grepl(regex, vec)
```

```
## [1] TRUE
```

So, we’ll need to run a loop that removes this regex until there is

nothing to remove anymore. The idea is to get the number of characters

before the `gsub`

, perform the `sub`

, and when the removal is done,

count if the number of characters from before the subtraction is

different from the number of characters after (i.e, if it is the same,

there is nothing to `gsub`

anymore).

```
continue <- TRUE
while (continue) {
old_size <- nchar(vec)
vec <- gsub(regex, "", vec)
continue <- nchar(vec) != old_size
}
# Check that the regex is not there anymore
grepl(regex, vec)
```

```
## [1] FALSE
```

```
nchar(vec)
```

```
## [1] 9288
```

## Day five, part two

With part 2, we need to try to first remove, one by one, the couples of

`units`

(for example “aA”), and react the polymer without each couple.

Then, we need to find:

What is the length of the shortest polymer you can produce by removing

all units of exactly one type and fully reacting the result?

Let’s start by putting our last code in a function:

```
react <- function(vec,
regex = paste0(paste0(letters, LETTERS), "|",
paste0(LETTERS, letters), collapse = "|")){
continue <- TRUE
while (continue) {
old_size <- nchar(vec)
vec <- gsub(regex, "", vec)
continue <- nchar(vec) != old_size
}
vec
}
```

Let’s try with the examples from the website:

```
nchar(react("dbcCCBcCcD"))
```

```
## [1] 6
```

```
nchar(react("daAcCaCAcCcaDA"))
```

```
## [1] 8
```

```
nchar(react("dabAaBAaDA"))
```

```
## [1] 4
```

```
nchar(react("abAcCaCBAcCcaA"))
```

```
## [1] 6
```

Now we’ll combine pattern removal and reaction:

```
clean_and_react <- function(vec, pattern){
react( gsub(pattern, "", vec) )
}
clean_and_react("dabAcCaCBAcCcaDA", "a|A")
```

```
## [1] "dbCBcD"
```

Then, a function to get a tibble with: extracted pattern, and number of

characters:

```
library(tidyverse)
clean_and_react_and_count <- function(vec, pattern){
tibble(
pattern = pattern,
nchars = nchar(clean_and_react(vec, pattern))
)
}
clean_and_react_and_count("dabAcCaCBAcCcaDA", "a|A")
```

```
## # A tibble: 1 x 2
## pattern nchars
##
```
## 1 a|A 6

As it should take some time, let’s use `{furrr}`

to do our calculation:

```
library(furrr)
```

```
## Loading required package: future
```

```
plan(multiprocess)
res <- future_map_dfr(paste0(LETTERS,"|", letters), ~ clean_and_react_and_count(vec, .x))
```

What’s the best solution?

```
top_n(res, -1)
```

```
## Selecting by nchars
## # A tibble: 1 x 2
## pattern nchars
##
```
## 1 F|f 5844

## Day six, part one

Now we’re working with distance calculation. We’ve been provided a bunch

of coordinates. Once we have put these on a grid, we need to fill the

“empty cells” with the reference to the closest coordinate from our

input, by calculating the shortest manhattan

distance.

```
day6 <- read_csv("input6.txt", col_names = c("V1", "V2"))
```

```
## Parsed with column specification:
## cols(
## V1 = col_integer(),
## V2 = col_integer()
## )
```

```
day6$id <- as.character(1:50)
day6
```

```
## # A tibble: 50 x 3
## V1 V2 id
##
```
## 1 315 342 1
## 2 59 106 2
## 3 44 207 3
## 4 52 81 4
## 5 139 207 5
## 6 93 135 6
## 7 152 187 7
## 8 271 47 8
## 9 223 342 9
## 10 50 255 10
## # ... with 40 more rows

First of all, let’s get a list of all the “empty cells” we mentioned

before:

```
all_comb <- expand.grid(
min(day6$V1):max(day6$V1),
min(day6$V2):max(day6$V2)
) %>% as_tibble()
```

The Manhattan distance function:

```
manat_dist <- function(x, y){
abs(x - y)
}
```

A function to find the closer ID, given an x and a y:

```
closest_id <- function(x, y, df = day6){
df %>%
mutate(dist = manat_dist(x, V1) + manat_dist(y, V2) ) %>%
top_n(-1, dist) %>%
pull(id)
}
# Apply it on all our combination
cl <- future_pmap_chr(all_comb, function(...){
x <- closest_id(..1, ..2)
if (length(x) > 1) {
NA
} else {
x
}
})
```

And now, time to answer the puzzle:

What is the size of the largest area that isn’t infinite?

An infinite area is defined by the fact that at least one of its element

is on the edge of the grid (hence equal to the min or max of V1 or V2).

```
all_comb <- all_comb %>%
mutate(
closest = cl,
max1 = max(Var1),
max2 = max(Var2),
min1 = min(Var1),
min2 = min(Var2),
)
# Get if each row are on the border
is_border <- pmap_lgl(all_comb, function(...){
..1 == ..4 | ..1 == ..6 | ..2 == ..5 | ..2 == ..7
})
all_comb %>%
mutate(is_border = is_border) %>%
group_by(closest) %>%
mutate(is_bord = any(is_border)) %>%
filter(!is_bord) %>%
count(closest, sort = TRUE) %>%
ungroup() %>%
top_n(1)
```

```
## Selecting by n
## # A tibble: 1 x 2
## closest n
##
```
## 1 13 4290

## Day six, part two

Now, we need to compute the distance of each point on the grid to each

coordinates, and to answer this question:

What is the size of the region containing all locations which have a

total distance to all given coordinates of less than 10000?

In other word, we get the distance to each coordinate on each cell, and

keep only the one with a total Manhattan distance which is less than

1000:

```
# Create a function to compute all distances
all_dist <- function(x, y, df = day6){
df %>%
mutate(dist = manat_dist(x, V1) + manat_dist(y, V2) )
}
# And use it on the all_comb table
future_pmap(all_comb, function(...){
all_dist(..1, ..2)
}) %>%
map_int(~sum(.x$dist)) %>%
keep(~ .x < 10000) %>%
length()
```

```
## [1] 37318
```

**leave a comment**for the author, please follow the link and comment on their blog:

**Colin Fay**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.