Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

There’s no time to lose, so here comes another Advent of Code puzzle solved using R. Day 5 challenge, here we come! What are we expected to do?

The polymer is formed by smaller units which, when triggered, react with each other such that two adjacent units of the same type and opposite polarity are destroyed. Units’ types are represented by letters; units’ polarity is represented by capitalization. For instance, r and R are units with the same type but opposite polarity, whereas r and s are entirely different types and do not react.

For example: In aA, a and A react, leaving nothing behind. In abBA, bB destroys itself, leaving aA. As above, this then destroys itself, leaving nothing. In abAB, no two adjacent units are of the same type, and so nothing happens. In aabAAB, even though aa and AA are of the same type, their polarities match, and so nothing happens. Now, consider a larger example, dabAcCaCBAcCcaDA:

dabAcCaCBAcCcaDA The first ‘cC’ is removed.
dabAaCBAcCcaDA This creates ‘Aa’, which is removed.
dabCBAcCcaDA Either ‘cC’ or ‘Cc’ are removed (the result is the same).
dabCBAcaDA No further actions can be taken.

After all possible reactions, the resulting polymer contains 10 units. How many units remain after fully reacting the polymer you scanned?

OK, this one actually doesn’t sound too bad. Basically, we need to keep removing 2-letter combinations of same lowercase and capital letters until the letter sequence stops changing in length.

Let’s have a look at the data:

library(tidyverse)

raw_input <- read_delim('day5-raw-input.txt', delim = '\t', col_names = FALSE) %>%
as.character()

glimpse(raw_input)
##  chr "nVvNOoHhiJjlLSuUvVsHhIpiIPIhHgqNVvnQGaAbBiFfbBFQqKaAkfsNxXnpPrSIikKsBcCbdDaMDdmAkKEebBNnRpPGgxXyYJjrRSvKkoOlLrR"| __truncated__

As expected, just a string of letters. How long?

# original length
nchar(raw_input)
## [1] 50000

50000-letters long! I wish I knew that was the right length from the beginning – this puzzle took me way longer than necessary to solve because I didn’t copy a complete sequence into the .txt file!

Anyway, instead of coming up with a fancy regex pattern I hard-coded all possible combinations of letters that we want to remove from the sequence. Not elegant but very effective.

pattern <- c('Aa|aA|Bb|bB|Cc|cC|Dd|dD|Ee|eE|Ff|fF|Gg|gG|Hh|hH|Ii|iI|Jj|jJ|Kk|kK|Ll|lL|Mm|mM|Nn|nN|Oo|oO|Pp|pP|Qq|qQ|Rr|rR|Ss|sS|Tt|tT|Uu|uU|Vv|vV|Ww|wW|Xx|xX|Yy|yY|Zz|zZ')

Let’s test on the first example if this pattern works:

## test the example

## dabAcCaCBAcCcaDA  The first 'cC' is removed.
## dabAaCBAcCcaDA    This creates 'Aa', which is removed.
## dabCBAcCcaDA      Either 'cC' or 'Cc' are removed (the result is the same).

str_remove_all(pattern) %>% nchar()
## [1] 10

Yes! We get the right answer after 2 rounds of ‘removals’! Now, we don’t know how many rounds need to be applied to the puzzle dataset in order to reach the final solution, so how do we go about it? It’s a perfect case for a repeat loop. In such loop we keep repeating a specified operation until a condition is met. In our case, we want to keep applying the removal of ‘reactive units’ until the length of the letter sequence stops changing. And once it happens, we want to know how long is the final (shortest) sequence. This will be our solution to the puzzle:

# now, put it in the repeat loop
# final solution

input <- raw_input

repeat {
output <- str_remove_all(input, pattern)
if (nchar(input) == nchar(output)) {
print(nchar(output));
break
} else input <- output;
}
## [1] 11194

Now, that’s what I call an ELEGANT solution!