**Colin Fay**, and kindly contributed to R-bloggers)

Solving the first four puzzles of Advent of

Code with R.

[Disclaimer] Obviously, this post contains a big spoiler about Advent

of Code, as it gives solutions for solving day 1 and 2.

## Advent of Code

Advent of Code is an Advent calendar of small programming puzzles for

a variety of skill sets and skill levels that can be solved in any

programming language you like. About Advent of

Code

## Day one, part one

Let’s read the inputs for day one:

```
inputs <- read.csv("input.txt", header = FALSE)[, 1]
```

The story behind this is, to sum up: you’ve got a frequency of X, and a

series of moves to apply to this frequency.

So, here’s the challenge: we are starting at 0, then each element of our

input vector is a “shift”, so the idea is to do `0 + v[1] + v[2] + ... +`

. We need to get the value of the last element.

v[n]

```
head(inputs, 10)
```

```
## [1] 11 14 10 -8 -13 -2 8 14 -11 -4
```

### Function factory

[Disclaimer bis] I’ve been looking for an opportunity to write a blog

post about this for a while. It seems like I finally found one 😉

Here’s a way to do it, using closures, functional programming, and

vectorisation. We’re going to create a function factory, which is a

family of high order functions that takes a vector and returns a

function See there for more info.

```
with_start_at <- function(start){
start <- start
function(x){
start <<- start + x
start
}
}
```

Let’s test against the example from AoC.

```
tail( vapply(c(1, 1, 1), with_start_at(0), numeric(1)), 1)
```

```
## [1] 3
```

```
tail( vapply(c(1, 1, -2), with_start_at(0), numeric(1)), 1)
```

```
## [1] 0
```

```
tail( vapply(c(-1, -2, -3), with_start_at(0), numeric(1)), 1)
```

```
## [1] -6
```

And finally, let’s get the answer to the question:

```
tail( vapply(inputs, with_start_at(0), numeric(1)), 1)
```

```
## [1] 402
```

### How does it work

Basically, each function creates its own environment. When you create a

function `Y`

inside another function `X`

, the function `Y`

, when

returned, carries along the **environment in which it has been
created**. In other word, when I create a function

`Y`

with`with_start_at`

, the `start`

object, even if in the parent environmentat the time of the function creation, is still accessible to the

`Y`

function via the

`<<-`

operator.In a conceptual sense, it’s like defining a `start`

inside your global

environment and changing it on each iteration of the `vapply`

. The big

difference being that you don’t mess with the global environment:

everything happening inside the function stays inside the function.

### Benchmark

Let’s benchmark a for loop against our functional solution:

```
bench::mark(
iterations = 100,
check = FALSE,
`for` = {
start <- 0
for (i in seq_along(inputs)){
start <- start + inputs[i]
}
start},
functional = {
tail( vapply(inputs, with_start_at(0), numeric(1)), 1)
}
)
```

```
##
##
## expression min mean median max
## ----------- --------- --------- --------- -------
## for 1.54ms 2.01ms 1.75ms 5.28ms
## functional 623.91µs 803.32µs 735.35µs 1.45ms
```

### Side note

Once I had completed this, I looked for other ways to do it, and

stumbled upon a solution with `cumsum`

, which is basically what my

solution does, but faster and more idiomatic (so yes, better). The only

difference being (on a more general sense), that the `cumsum`

does only

one task (totally valid for our case), but on the long run is less

flexibile: the functional solution would allow us to start at any entry

point, and also to apply any kind of transformation to our

`start_object`

.

## Day one, part two

Now, the second challenge is: given our list of frequencies and starting

at 0, which frequency is encountered twice first? Note: our list of

moves can be repeated as many time as possible until we find our answer.

On this one, I decided to go for the “brute force” solution – the idea

is to: + start from the vector of moves + apply the `with_start_at`

+

test if there is any duplicated + if there are, extract the duplicated,

get the first element, and `break`

+ if there is none, multiply `i`

by

10 and rep the original move `i`

times + repeat until I have an answer

```
i <- 1
while(TRUE){
alls <- sapply( rep(inputs, i), with_start_at(0) )
dup <- duplicated(alls)
if ( any(dup)){
x <- head( alls[dup], 1)
break
} else {
i <- i * 10
}
}
x
```

```
## [1] 481
```

## Day Two

### Day Two, part one

Day two is a string manipulation

challenge.

```
inputs <- read.csv("input2.txt", header = FALSE, stringsAsFactors = FALSE)[, 1]
head(inputs, 10)
```

```
## [1] "qcsnyvpigkxmrdawlfdefotxbh" "qcsnyvligkymrdawljujfotxbh"
## [3] "qmsnyvpigkzmrnawzjuefotxbh" "qosnyvpigkzmrnawljuefouxbh"
## [5] "qcsnhlpigkzmrtawljuefotxbh" "qcsnyvpigkzmrdapljuyfotxih"
## [7] "qcsnbvpiokzmrdawljuerotxbh" "qcfnyvmigkzmrdawljuefotdbh"
## [9] "qcsnynpigkzmrdawljuefptxbp" "qcsgyapigkzmrdawljuafotxbh"
```

Here is the challenge: get the number of words with a letter repeated

twice, and the number of words with a letter repeated three time, and

multiply these two words.

Let’s use `{purrr}`

and `{dplyr}`

to do

that.

```
library(tidyverse)
```

```
## ── Attaching packages ────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0 ✔ purrr 0.2.5
## ✔ tibble 1.4.2 ✔ dplyr 0.7.8
## ✔ tidyr 0.8.1 ✔ stringr 1.3.1
## ✔ readr 1.1.1 ✔ forcats 0.3.0
## ── Conflicts ───────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
```

```
# Create a splitter and counter
split_and_count <- compose(
table,
flatten_chr,
partial(strsplit, split = "")
)
# Split them all
map(inputs, split_and_count) %>%
map_df(function(x){
tibble(
two = any(x == 2),
three = any(x == 3)
)
}) %>%
# Get the result
summarise(
res = sum(two) * sum(three)
)
```

```
## # A tibble: 1 x 1
## res
##
```
## 1 6422

### Day Two, part two

The second part of the challenge was to detect, from the inputs, the two

“words” with just one different character, and extract from these two

the common pattern.

Good news: I’ve been working on `{tidystringdist}`

, a tidy framework

for the `{stringdist}`

package, which can be used to compute exactly

that. We’ll use the Levenshtein

distance, which (to

sum up) compute the number of permutations necessary to move from one

string to another. As we are looking for two words with a permutation of

one character, the Levenshtein distance should be of one.

```
library(tidystringdist)
res <- tidy_comb_all(inputs) %>%
tidy_stringdist(method = "lv") %>%
filter(lv == 1)
res
```

```
## # A tibble: 1 x 3
## V1 V2 lv
##
```
## 1 qcslyvphgkrmddawljuefotxbh qcslyvphgkrmrdawljuefotxbh 1

Now we’ve got that, we are looking for the common string. Here’s a quick

and dirty way to do this:

```
splt <- compose(
flatten_chr,
partial(strsplit, split = "")
)
v1 <- splt(res$V1)
v2 <- splt(res$V2)
v1[v1 == v2] %>% paste(collapse = "")
```

```
## [1] "qcslyvphgkrmdawljuefotxbh"
```

**leave a comment**for the author, please follow the link and comment on their blog:

**Colin Fay**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...