R Solutions for Excel Puzzles

[This article was first published on Numbers around us - Medium, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Week 46 — Puzzles no. 323–328

First time after few weeks I decided to slightly change my way of publishing this format of content. I used to publish daily, then every 2 or 3 days, but I realized that summary of all puzzles weekly will be my way.

Host of our puzzles is publishing 7 puzzles a week: 5 for Excel (Mon-Fri) and 2 for Power Query (Sat-Sun). My plan is to show my solutions in two posts according to original tools. Excel puzzle series will contain 5 parts and PQ series — 2 parts. Of course, as I had little break with it today ther will be more puzzles to get on straight path.

Puzzles

Author: ExcelBI

Puzzle 323: content file
Puzzle 324: content file
Puzzle 325: content file
Puzzle 326: content file
Puzzle 327: content file
Puzzle 328: content file

Lets dive into solutions!

Puzzle 323

Goal of puzzle was to find all palindromic substrings in numbers represented like strings.

Loading data and libraries:

library(tidyverse)
library(readxl)
library(stringi)

input = read_excel(“Substring Palindromes.xlsx”, range = “A1:A8”)
test = read_excel(“Substring Palindromes.xlsx”, range = “B1:B8”)

Approach 1: tidyverse

is_palindrome <- function(s) {
 s == stri_reverse(s)
} 

generate_substrings <- function(s) {
 n <- nchar(s)
 positions <- crossing(start = 1:(n-2), end = 3:n) %>%
 filter(end > start, end — start >= 2)
 substrings <- pmap_chr(positions, ~ substr(s, ..1, ..2))
 palindromic_substrings <- substrings[map_lgl(substrings, is_palindrome)]
 palindromic_substrings
}

result = input %>%
 rowwise() %>%
 mutate(
 substrings = map(Numbers, generate_substrings),
 final = paste(substrings, collapse = “, “)) %>%
 select(Palindrome = final) %>%
 mutate(Palindrome = if_else(Palindrome == “”, NA, Palindrome)) %>%
 ungroup()

Approach 2: data.table

library(data.table)

inputDT = setDT(input)

is_palindrome <- function(s) {
 s == stringi::stri_reverse(s)
}

generate_substrings_dt <- function(s) {
 n <- nchar(s)
 positions <- CJ(start = 1:(n-2), end = 3:n)[end > start & end — start >= 2]
 substrings <- mapply(function(start, end) substr(s, start, end), positions$start, positions$end)
 palindromic_substrings <- substrings[vapply(substrings, is_palindrome, logical(1))]
 if (length(palindromic_substrings) == 0) {
 return(NA_character_)
 }
 palindromic_substrings
}

result_dt <- inputDT[, .(Palindrome = {
 substrings <- lapply(Numbers, generate_substrings)
 final <- sapply(substrings, function(x) paste(x, collapse = “, “))
 if (final == “”) NA_character_ else final
}), by = 1:nrow(inputDT)]

Approach 3: base R

is_palindrome <- function(s) {
 s == stringi::stri_reverse(s)
} 

generate_substrings_base <- function(s) {
 n <- nchar(s)
 substrings <- NULL
 for (start in 1:(n — 2)) {
 for (end in (start + 2):n) {
 substrings <- c(substrings, substr(s, start, end))
 }
 }
 palindromic_substrings <- substrings[sapply(substrings, is_palindrome)]
 palindromic_substrings
} 

generate_palindromes_for_row <- function(numbers) {
 substrings <- lapply(numbers, generate_substrings_base)
 final <- sapply(substrings, function(x) paste(x, collapse = “, “))
 if (final == “”) return(NA) else return(final)
}

input <- as.data.frame(input) # Ensure ‘input’ is a data frame
input$Palindrome <- apply(input, 1, function(row) generate_palindromes_for_row(row[“Numbers”]))

Validation:

identical(test, result)
#> [1] TRUE

identical(test$Palindrome, result_dt$Palindrome)
#> [1] TRUE

identical(test$Palindrome, input$Palindrome)
#> [1] TRUE

Puzzle 324

Goal of this puzzle was to find subjects in two columns (like two classes next to each other) that are in distance of maximum one slot up or down in column.

Load data and libraries:

library(tidyverse)
library(readxl)
library(data.table)

input = read_excel(“Plus Minus 1 Row.xlsx”, range = “A1:B20”)
test = read_excel(“Plus Minus 1 Row.xlsx”, range = “C1:C6”)

Approach 1: tidyverse

r1 = input %>%
  select(s = 1) %>%
  mutate(n = row_number()) 

r2 = input %>%
  select(s = 2) %>%
  mutate(n = row_number())

result = r1 %>%
  left_join(r2, by =c("s")) %>%
  mutate(diff = abs(n.x - n.y)) %>%
  filter(diff <= 1) %>%
  select(s)

Approach 2: data.table

inputDT = setDT(input)
r1_dt <- inputDT[, .(s = .SD[[1]], n1 = .I)]
r2_dt <- inputDT[, .(s = .SD[[2]], n2 = .I)]
result_dt <- r1_dt[r2_dt, on = .(s), nomatch = 0][, diff := abs(n1 - n2)]
result_dt <- result_dt[diff <= 1, .(s)]

Approach 3: base R

r1_base <- data.frame(s = input[[1]], n = seq_len(nrow(input)))
r2_base <- data.frame(s = input[[2]], n = seq_len(nrow(input)))
result_base <- merge(r1_base, r2_base, by = "s", all.x = TRUE)
result_base$diff <- abs(result_base$n.x - result_base$n.y)
result_base <- result_base[result_base$diff <= 1, "s", drop = FALSE]

Validation:

identical(sort(test$`Answer Expected`), sort(result$s))
#> [1] TRUE

identical(sort(test$`Answer Expected`), sort(result_dt$s))
#> [1] TRUE

identical(sort(test$`Answer Expected`), sort(result_base$s))
#> [1] TRUE

Puzzle 325

In this puzzle we need to complete given sequence of numbers which like this fence on picture has some spaces and lacks.

Read data and libraries:

library(tidyverse)
library(stringr)
library(readxl)
library(data.table)

input = read_excel("Missing Number in AP_2.xlsx", range = "A1:A7")
test = read_excel("Missing Number in AP_2.xlsx", range = "B1:B7")

Approach 1: tidyverse

find_missing_numbers <- function(input_string) {
  elements <- str_split(input_string, ",\\s*")[[1]]
  numbers <- map(elements, ~ if (str_detect(.x, "\\d+")) as.numeric(.x) else NA_real_)
  
  missing_indices <- which(is.na(numbers))
  
  non_missing_indices <- which(!is.na(numbers))
  if (length(non_missing_indices) >= 2) {
    first_point <- non_missing_indices[1]
    last_point <- non_missing_indices[length(non_missing_indices)]
    common_difference <- (numbers[last_point][[1]] - numbers[first_point][[1]]) / (last_point - first_point)
  } else {
    common_difference <- 0
  }
  
  numbers[missing_indices] <- map(missing_indices, ~ numbers[first_point][[1]] + (.x - first_point) * common_difference)
  missing_numbers_str <- numbers[missing_indices] %>% map_chr(as.character) %>% str_c(collapse = ", ")
  return(missing_numbers_str)
}

result = input %>%
  mutate(missing_numbers = map_chr(AP, find_missing_numbers)) 

Approach 2: base R function

As in some previous articles if big part would look similar in base R and data.frame, I would use only one approach. Function is translated to base R, and I will only call her in data.table expression.

find_missing_numbers_base <- function(input_string) {
  elements <- unlist(strsplit(input_string, ",\\s*"))
  numbers <- sapply(elements, function(x) if (grepl("\\d+", x)) as.numeric(x) else NA_real_)
  
  missing_indices <- which(is.na(numbers))
  non_missing_indices <- which(!is.na(numbers))
  
  if (length(non_missing_indices) >= 2) {
    first_point <- non_missing_indices[1]
    last_point <- non_missing_indices[length(non_missing_indices)]
    common_difference <- (numbers[last_point] - numbers[first_point]) / (last_point - first_point)
  } else {
    common_difference <- 0
  }
  
  numbers[missing_indices] <- sapply(missing_indices, function(i) numbers[first_point] + (i - first_point) * common_difference)
  missing_numbers_str <- paste(numbers[missing_indices], collapse = ", ")
  return(missing_numbers_str)
}

inputDT = setDT(input)
resultDT <- inputDT[, .(missing_numbers = sapply(AP, find_missing_numbers))]

Validation

identical(test$`Answer Expected`, result$missing_numbers)  
#> [1] TRUE

identical(test$`Answer Expected`, resultDT$missing_numbers)
#> [1] TRUE

Puzzle 326

Those beautiful spiral takes into mind Fibonacci numbers. And yes indeed, this puzzle is related to Fibonacci numbers. From given numbers we have to leave only “non-Fibonacci” numbers.

Load data and libraries

library(tidyverse)
library(readxl)
library(data.table)

input = read_excel("Non Fibonacci Numbers.xlsx", range = "A1:A10")
test = read_excel("Non Fibonacci Numbers.xlsx", range = "B1:B5")

As function which I wrote is already in base R and it is probably the best approach I will cite function here and in approaches I will show only calling it in three syntaxes.

generate_fibonacci_to_limit<- function(){
  fib <- c(1, 2)
  while (tail(fib, 1) < 1e7) {
    fib <- c(fib, sum(tail(fib, 2)))
  }
  return(fib)
}

fib = generate_fibonacci_to_limit()

Approach 1: tidyverse

result = input %>% 
  mutate(is_fibonacci = ifelse(Numbers %in% fib, "Yes", "No")) %>%
  filter(is_fibonacci == "No") %>%
  select(Numbers)

Approach 2: data.table

inputDT = setDT(input)

result_dt <- inputDT[, .(Numbers, is_fibonacci = ifelse(Numbers %in% fib, "Yes", "No"))]
result_dt <- result_dt[is_fibonacci == "No", .(Numbers)]

Approach 3: base R

input$is_fibonacci <- ifelse(input$Numbers %in% fib, "Yes", "No")
result_base <- input[input$is_fibonacci == "No", "Numbers", drop = FALSE]

Validation

identical(result$Numbers, test$`Answer Expected`)
#> TRUE

identical(result_dt$Numbers, test$`Answer Expected`)
#> TRUE

identical(result_base$Numbers, test$`Answer Expected`)
#> TRUE

Puzzle 327

In puzzle #327 we need to find students with top 3 scores. It is not the same as 3 top students, because there could be more than one student that score max.

Loading data and libraries

library(tidyverse)
library(readxl)
library(data.table)

input = read_excel("Highest Marks Names Subjects.xlsx", range = "A1:E10")
test = read_excel("Highest Marks Names Subjects.xlsx", range = "G1:I7")

Approach 1: tidyverse

result = input %>%
  pivot_longer(-c(Names), names_to = "Subjects", values_to = "Marks") %>%
  mutate(rank = dense_rank(desc(Marks))) %>%
  filter(rank <= 3) %>%
  arrange(desc(Marks), Names, Subjects) %>%
  select(-rank)

Approach 2: data.table

 inputDT = setDT(input)
result_dt <- melt(inputDT, id.vars = "Names", variable.name = "Subjects", value.name = "Marks")
result_dt <- result_dt[, rank := frank(-Marks, ties.method = "dense")][
  rank <= 3
][
  order(-Marks, Names, Subjects)
][, -"rank", with = FALSE]

Approach 3: base R

result_base <- reshape(input, varying = list(names(input)[-1]), v.names = "Marks",
                       timevar = "Subjects", times = names(input)[-1], direction = "long")
result_base <- result_base[order(result_base$Marks, decreasing = TRUE), ]

unique_marks_base <- unique(result_base$Marks)
result_base$rank <- match(result_base$Marks, sort(unique_marks_base, decreasing = TRUE))

result_base <- subset(result_base, rank <= 3)
result_base <- result_base[order(-result_base$Marks, result_base$Names, result_base$Subjects), ]
result_base <- result_base[, c("Names", "Subjects", "Marks")]

Validation

Validation as ussual (by identical function) would need much more code to unify structure. That’s why I promise… I checked visually and all looks the same. :-)

Puzzle 328

In this puzzle we need to find duplicated numbers in string and remove, but only first occurance of duplication.

Read data and libraries

library(tidyverse)
library(readxl)
library(data.table)

input = read_excel("Remove Minimum.xlsx", range = "A1:A7")
test = read_excel("Remove Minimum.xlsx", range = "B1:B7")

Aproach 1: tidyverse

remove_last_min <- function(x) {
  nums <- str_split(x, ",\\s*") %>%
    map(~as.numeric(.x)) %>%
    .[[1]]
  
  min_num <- min(nums)
  last_min_index <- max(which(nums == min_num))
  modified_nums <- nums[-last_min_index]
  
  if (length(modified_nums) == 0) {
    NA_character_
  } else {
    str_c(modified_nums, collapse = ", ")
  }
}

result = input %>%
  mutate(result = map_chr(String, remove_last_min))

Approach 2: base function, data.table call

remove_last_min_base <- function(x) {
  nums <- as.numeric(unlist(strsplit(x, ",\\s*")))
  
  min_num <- min(nums)
  last_min_index <- max(which(nums == min_num))
  modified_nums <- nums[-last_min_index]
  
  if (length(modified_nums) == 0) {
    return(NA_character_)
  } else {
    return(paste(modified_nums, collapse = ", "))
  }
}

inputDT = setDT(input)
result_dt <- inputDT[, .(result = sapply(String, remove_last_min_base))]

Validation

identical(result$result, test$`Answer Expected`)
#> [1] TRUE

identical(result_dt$result, test$`Answer Expected`) 
#> [1] TRUE

Conclusion

This is the first time using this approach, so let me know what do you think went good or wrong.
Look for Power Query Puzzle article as well.

Publishing plan:

Mon: Excel Puzzles from previous week
Tue: Power Query Puzzle from last weekend
Thu: General R article (currently small series about how to improve, optimize and change functions)


R Solutions for Excel Puzzles was originally published in Numbers around us on Medium, where people are continuing the conversation by highlighting and responding to this story.

To leave a comment for the author, please follow the link and comment on their blog: Numbers around us - Medium.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)