R Solutions for Excel Puzzles
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Week 46 — Puzzles no. 323–328
First time after few weeks I decided to slightly change my way of publishing this format of content. I used to publish daily, then every 2 or 3 days, but I realized that summary of all puzzles weekly will be my way.
Host of our puzzles is publishing 7 puzzles a week: 5 for Excel (Mon-Fri) and 2 for Power Query (Sat-Sun). My plan is to show my solutions in two posts according to original tools. Excel puzzle series will contain 5 parts and PQ series — 2 parts. Of course, as I had little break with it today ther will be more puzzles to get on straight path.
Puzzles
Author: ExcelBI
Puzzle 323: content file
Puzzle 324: content file
Puzzle 325: content file
Puzzle 326: content file
Puzzle 327: content file
Puzzle 328: content file
Lets dive into solutions!
Puzzle 323
Goal of puzzle was to find all palindromic substrings in numbers represented like strings.
Loading data and libraries:
library(tidyverse) library(readxl) library(stringi) input = read_excel(“Substring Palindromes.xlsx”, range = “A1:A8”) test = read_excel(“Substring Palindromes.xlsx”, range = “B1:B8”)
Approach 1: tidyverse
is_palindrome <- function(s) { s == stri_reverse(s) } generate_substrings <- function(s) { n <- nchar(s) positions <- crossing(start = 1:(n-2), end = 3:n) %>% filter(end > start, end — start >= 2) substrings <- pmap_chr(positions, ~ substr(s, ..1, ..2)) palindromic_substrings <- substrings[map_lgl(substrings, is_palindrome)] palindromic_substrings } result = input %>% rowwise() %>% mutate( substrings = map(Numbers, generate_substrings), final = paste(substrings, collapse = “, “)) %>% select(Palindrome = final) %>% mutate(Palindrome = if_else(Palindrome == “”, NA, Palindrome)) %>% ungroup()
Approach 2: data.table
library(data.table) inputDT = setDT(input) is_palindrome <- function(s) { s == stringi::stri_reverse(s) } generate_substrings_dt <- function(s) { n <- nchar(s) positions <- CJ(start = 1:(n-2), end = 3:n)[end > start & end — start >= 2] substrings <- mapply(function(start, end) substr(s, start, end), positions$start, positions$end) palindromic_substrings <- substrings[vapply(substrings, is_palindrome, logical(1))] if (length(palindromic_substrings) == 0) { return(NA_character_) } palindromic_substrings } result_dt <- inputDT[, .(Palindrome = { substrings <- lapply(Numbers, generate_substrings) final <- sapply(substrings, function(x) paste(x, collapse = “, “)) if (final == “”) NA_character_ else final }), by = 1:nrow(inputDT)]
Approach 3: base R
is_palindrome <- function(s) { s == stringi::stri_reverse(s) } generate_substrings_base <- function(s) { n <- nchar(s) substrings <- NULL for (start in 1:(n — 2)) { for (end in (start + 2):n) { substrings <- c(substrings, substr(s, start, end)) } } palindromic_substrings <- substrings[sapply(substrings, is_palindrome)] palindromic_substrings } generate_palindromes_for_row <- function(numbers) { substrings <- lapply(numbers, generate_substrings_base) final <- sapply(substrings, function(x) paste(x, collapse = “, “)) if (final == “”) return(NA) else return(final) } input <- as.data.frame(input) # Ensure ‘input’ is a data frame input$Palindrome <- apply(input, 1, function(row) generate_palindromes_for_row(row[“Numbers”]))
Validation:
identical(test, result) #> [1] TRUE identical(test$Palindrome, result_dt$Palindrome) #> [1] TRUE identical(test$Palindrome, input$Palindrome) #> [1] TRUE
Puzzle 324
Goal of this puzzle was to find subjects in two columns (like two classes next to each other) that are in distance of maximum one slot up or down in column.
Load data and libraries:
library(tidyverse) library(readxl) library(data.table) input = read_excel(“Plus Minus 1 Row.xlsx”, range = “A1:B20”) test = read_excel(“Plus Minus 1 Row.xlsx”, range = “C1:C6”)
Approach 1: tidyverse
r1 = input %>% select(s = 1) %>% mutate(n = row_number()) r2 = input %>% select(s = 2) %>% mutate(n = row_number()) result = r1 %>% left_join(r2, by =c("s")) %>% mutate(diff = abs(n.x - n.y)) %>% filter(diff <= 1) %>% select(s)
Approach 2: data.table
inputDT = setDT(input) r1_dt <- inputDT[, .(s = .SD[[1]], n1 = .I)] r2_dt <- inputDT[, .(s = .SD[[2]], n2 = .I)] result_dt <- r1_dt[r2_dt, on = .(s), nomatch = 0][, diff := abs(n1 - n2)] result_dt <- result_dt[diff <= 1, .(s)]
Approach 3: base R
r1_base <- data.frame(s = input[[1]], n = seq_len(nrow(input))) r2_base <- data.frame(s = input[[2]], n = seq_len(nrow(input))) result_base <- merge(r1_base, r2_base, by = "s", all.x = TRUE) result_base$diff <- abs(result_base$n.x - result_base$n.y) result_base <- result_base[result_base$diff <= 1, "s", drop = FALSE]
Validation:
identical(sort(test$`Answer Expected`), sort(result$s)) #> [1] TRUE identical(sort(test$`Answer Expected`), sort(result_dt$s)) #> [1] TRUE identical(sort(test$`Answer Expected`), sort(result_base$s)) #> [1] TRUE
Puzzle 325
In this puzzle we need to complete given sequence of numbers which like this fence on picture has some spaces and lacks.
Read data and libraries:
library(tidyverse) library(stringr) library(readxl) library(data.table) input = read_excel("Missing Number in AP_2.xlsx", range = "A1:A7") test = read_excel("Missing Number in AP_2.xlsx", range = "B1:B7")
Approach 1: tidyverse
find_missing_numbers <- function(input_string) { elements <- str_split(input_string, ",\\s*")[[1]] numbers <- map(elements, ~ if (str_detect(.x, "\\d+")) as.numeric(.x) else NA_real_) missing_indices <- which(is.na(numbers)) non_missing_indices <- which(!is.na(numbers)) if (length(non_missing_indices) >= 2) { first_point <- non_missing_indices[1] last_point <- non_missing_indices[length(non_missing_indices)] common_difference <- (numbers[last_point][[1]] - numbers[first_point][[1]]) / (last_point - first_point) } else { common_difference <- 0 } numbers[missing_indices] <- map(missing_indices, ~ numbers[first_point][[1]] + (.x - first_point) * common_difference) missing_numbers_str <- numbers[missing_indices] %>% map_chr(as.character) %>% str_c(collapse = ", ") return(missing_numbers_str) } result = input %>% mutate(missing_numbers = map_chr(AP, find_missing_numbers))
Approach 2: base R function
As in some previous articles if big part would look similar in base R and data.frame, I would use only one approach. Function is translated to base R, and I will only call her in data.table expression.
find_missing_numbers_base <- function(input_string) { elements <- unlist(strsplit(input_string, ",\\s*")) numbers <- sapply(elements, function(x) if (grepl("\\d+", x)) as.numeric(x) else NA_real_) missing_indices <- which(is.na(numbers)) non_missing_indices <- which(!is.na(numbers)) if (length(non_missing_indices) >= 2) { first_point <- non_missing_indices[1] last_point <- non_missing_indices[length(non_missing_indices)] common_difference <- (numbers[last_point] - numbers[first_point]) / (last_point - first_point) } else { common_difference <- 0 } numbers[missing_indices] <- sapply(missing_indices, function(i) numbers[first_point] + (i - first_point) * common_difference) missing_numbers_str <- paste(numbers[missing_indices], collapse = ", ") return(missing_numbers_str) } inputDT = setDT(input) resultDT <- inputDT[, .(missing_numbers = sapply(AP, find_missing_numbers))]
Validation
identical(test$`Answer Expected`, result$missing_numbers) #> [1] TRUE identical(test$`Answer Expected`, resultDT$missing_numbers) #> [1] TRUE
Puzzle 326
Those beautiful spiral takes into mind Fibonacci numbers. And yes indeed, this puzzle is related to Fibonacci numbers. From given numbers we have to leave only “non-Fibonacci” numbers.
Load data and libraries
library(tidyverse) library(readxl) library(data.table) input = read_excel("Non Fibonacci Numbers.xlsx", range = "A1:A10") test = read_excel("Non Fibonacci Numbers.xlsx", range = "B1:B5")
As function which I wrote is already in base R and it is probably the best approach I will cite function here and in approaches I will show only calling it in three syntaxes.
generate_fibonacci_to_limit<- function(){ fib <- c(1, 2) while (tail(fib, 1) < 1e7) { fib <- c(fib, sum(tail(fib, 2))) } return(fib) } fib = generate_fibonacci_to_limit()
Approach 1: tidyverse
result = input %>% mutate(is_fibonacci = ifelse(Numbers %in% fib, "Yes", "No")) %>% filter(is_fibonacci == "No") %>% select(Numbers)
Approach 2: data.table
inputDT = setDT(input) result_dt <- inputDT[, .(Numbers, is_fibonacci = ifelse(Numbers %in% fib, "Yes", "No"))] result_dt <- result_dt[is_fibonacci == "No", .(Numbers)]
Approach 3: base R
input$is_fibonacci <- ifelse(input$Numbers %in% fib, "Yes", "No") result_base <- input[input$is_fibonacci == "No", "Numbers", drop = FALSE]
Validation
identical(result$Numbers, test$`Answer Expected`) #> TRUE identical(result_dt$Numbers, test$`Answer Expected`) #> TRUE identical(result_base$Numbers, test$`Answer Expected`) #> TRUE
Puzzle 327
In puzzle #327 we need to find students with top 3 scores. It is not the same as 3 top students, because there could be more than one student that score max.
Loading data and libraries
library(tidyverse) library(readxl) library(data.table) input = read_excel("Highest Marks Names Subjects.xlsx", range = "A1:E10") test = read_excel("Highest Marks Names Subjects.xlsx", range = "G1:I7")
Approach 1: tidyverse
result = input %>% pivot_longer(-c(Names), names_to = "Subjects", values_to = "Marks") %>% mutate(rank = dense_rank(desc(Marks))) %>% filter(rank <= 3) %>% arrange(desc(Marks), Names, Subjects) %>% select(-rank)
Approach 2: data.table
inputDT = setDT(input) result_dt <- melt(inputDT, id.vars = "Names", variable.name = "Subjects", value.name = "Marks") result_dt <- result_dt[, rank := frank(-Marks, ties.method = "dense")][ rank <= 3 ][ order(-Marks, Names, Subjects) ][, -"rank", with = FALSE]
Approach 3: base R
result_base <- reshape(input, varying = list(names(input)[-1]), v.names = "Marks", timevar = "Subjects", times = names(input)[-1], direction = "long") result_base <- result_base[order(result_base$Marks, decreasing = TRUE), ] unique_marks_base <- unique(result_base$Marks) result_base$rank <- match(result_base$Marks, sort(unique_marks_base, decreasing = TRUE)) result_base <- subset(result_base, rank <= 3) result_base <- result_base[order(-result_base$Marks, result_base$Names, result_base$Subjects), ] result_base <- result_base[, c("Names", "Subjects", "Marks")]
Validation
Validation as ussual (by identical function) would need much more code to unify structure. That’s why I promise… I checked visually and all looks the same. :-)
Puzzle 328
In this puzzle we need to find duplicated numbers in string and remove, but only first occurance of duplication.
Read data and libraries
library(tidyverse) library(readxl) library(data.table) input = read_excel("Remove Minimum.xlsx", range = "A1:A7") test = read_excel("Remove Minimum.xlsx", range = "B1:B7")
Aproach 1: tidyverse
remove_last_min <- function(x) { nums <- str_split(x, ",\\s*") %>% map(~as.numeric(.x)) %>% .[[1]] min_num <- min(nums) last_min_index <- max(which(nums == min_num)) modified_nums <- nums[-last_min_index] if (length(modified_nums) == 0) { NA_character_ } else { str_c(modified_nums, collapse = ", ") } } result = input %>% mutate(result = map_chr(String, remove_last_min))
Approach 2: base function, data.table call
remove_last_min_base <- function(x) { nums <- as.numeric(unlist(strsplit(x, ",\\s*"))) min_num <- min(nums) last_min_index <- max(which(nums == min_num)) modified_nums <- nums[-last_min_index] if (length(modified_nums) == 0) { return(NA_character_) } else { return(paste(modified_nums, collapse = ", ")) } } inputDT = setDT(input) result_dt <- inputDT[, .(result = sapply(String, remove_last_min_base))]
Validation
identical(result$result, test$`Answer Expected`) #> [1] TRUE identical(result_dt$result, test$`Answer Expected`) #> [1] TRUE
Conclusion
This is the first time using this approach, so let me know what do you think went good or wrong.
Look for Power Query Puzzle article as well.
Publishing plan:
Mon: Excel Puzzles from previous week
Tue: Power Query Puzzle from last weekend
Thu: General R article (currently small series about how to improve, optimize and change functions)
R Solutions for Excel Puzzles was originally published in Numbers around us on Medium, where people are continuing the conversation by highlighting and responding to this story.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.