**Higher Order Functions**, and kindly contributed to R-bloggers)

Last week at Thanksgiving, my family drew names from a hat for our annual game

of Secret Santa. Actually, it wasn’t a hat but you know what I mean. (Now that I

think about it, I don’t think I’ve ever seen names drawn from a literal hat

before!) In our family, the rules of Secret Santa are pretty simple:

- The players’ names are put in “a hat”.
- Players randomly draw a name from a hat, become that person’s Secret Santa,

and get them a gift. - If a player draws their own name, they draw again.

Once again this year, somebody asked if we could just use an app or a website to

handle the drawing for Secret Santa. *Or I could write a script to do it* I

thought to myself. The problem nagged at the back of my mind for the past few

days. *You could just shuffle the names… no, no, no. It’s trickier than that.*

In this post, I describe a couple of algorithms for Secret Santa sampling using

R and directed graphs. I use the

DiagrammeR package which creates

graphs from dataframes of nodes and edges, and I liberally use

dplyr verbs to manipulate tables of

edges.

If you would like a more practical way to use R for Secret Santa, including

automating the process of drawing names and *emailing players*, see

this blog post.

## Making a graph, connecting nodes twice

Let’s start with a subset of my family’s game of just five players. I assign

each name a unique ID number.

```
library(DiagrammeR)
library(magrittr)
library(dplyr, warn.conflicts = FALSE)
players <- tibble::tribble(
~ Name, ~ Number,
"Jeremy", 1,
"TJ", 2,
"Jonathan", 3,
"Alex", 4,
"Marissa", 5)
```

We can think of the players as nodes in a directed graph. An edge connecting two

players indicates a “gives-to” (Secret Santa) relationship. Suppose I drew

Marissa’s name. Then the graph will have an edge connecting me to her. In the

code below, I use DiagrammeR to create a graph by combining a node dataframe

(`create_node_df()`

) and an edge dataframe (`create_edge_df()`

).

```
nodes <- create_node_df(
n = nrow(players),
type = players$Name,
label = players$Name)
tj_drew_marissa <- create_edge_df(
from = 2,
to = 5,
rel = "gives-to",
color = "#FF4136",
penwidth = 1)
create_graph(nodes, tj_drew_marissa) %>%
render_graph()
```

Before the game starts, anyone could draw anyone else’s name, so let’s visualize

all possible gives-to relations. We can do this by using `combn(n, 2)`

to

generate all n-choose-2 pairs and creating two edges for each pair.

```
combn(players$Name, 2)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] "Jeremy" "Jeremy" "Jeremy" "Jeremy" "TJ" "TJ" "TJ"
#> [2,] "TJ" "Jonathan" "Alex" "Marissa" "Jonathan" "Alex" "Marissa"
#> [,8] [,9] [,10]
#> [1,] "Jonathan" "Jonathan" "Alex"
#> [2,] "Alex" "Marissa" "Marissa"
# All the edge-manipulating functions in this post take an optional take `...`
# argument for setting the style of edges.
create_all_giving_edges <- function(xs, ...) {
aes_options <- quos(...)
pairs <- combn(seq_along(xs), 2)
# Each column from `combn()` is a pair. We make an edge moving down the column
# and another edge up the column by having each row as a `from` index and a
# `to` index.
from <- c(pairs[1, ], pairs[2, ])
to <- c(pairs[2, ], pairs[1, ])
create_edge_df(from = from, to = to) %>%
mutate(!!! aes_options) %>%
as_tibble()
}
all_possible_edges <- create_all_giving_edges(
players$Name,
rel = "potential-gift",
penwidth = .5,
color = "#CCCCCC90")
create_graph(nodes, all_possible_edges) %>%
render_graph()
```

### A fast, simple solution is a Hamiltonian path

Do you need to organize a gift-giving drawing for a group of people? The easiest

solution is to shuffle the names and have the first name give to the second name,

the second to the third, and so on with last name giving looping back around to

the first name. This solution is equivalent to walking through the graph and

visiting every node just once. Such a path is called a

Hamiltonian path.

Here we find a Hamiltonian path and create a helper function `overwrite_edges()`

to update the edges that fall on the path.

```
overwrite_edges <- function(old_df, new_df) {
old_df %>%
anti_join(new_df, by = c("from", "to")) %>%
bind_rows(new_df)
}
create_hamiltonian_gift_edges <- function(xs, ...) {
loop_from <- sample(seq_along(xs))
# last name gives to first
loop_to <- c(loop_from[-1], loop_from[1])
create_edge_df(from = loop_from, to = loop_to, ...)
}
# For reproducible blogging
set.seed(11282017)
hamiltonian_edges <- create_hamiltonian_gift_edges(
players$Name,
rel = "gives-to",
color = "#FF4136",
penwidth = 1
)
all_possible_edges %>%
overwrite_edges(hamiltonian_edges) %>%
create_graph(nodes, .) %>%
render_graph()
```

As promised, the red paths loop through all nodes exactly once. No one is their

own gift giver :white_check_mark:, and everyone has an incoming red path

:white_check_mark: and an outgoing red path :white_check_mark:. Very nice.

*Actually*… let’s put that checklist into a validation function.

```
# Check for valid gift-giving edges
has_valid_gift_edges <- function(edge_df, indices) {
indices <- sort(unique(indices))
pairs <- edge_df %>% filter(rel == "gives-to")
no_self_loop <- !any(pairs$from == pairs$to)
exhaustive_from <- isTRUE(all.equal(sort(pairs$from), indices))
exhaustive_to <- isTRUE(all.equal(sort(pairs$to), indices))
all(no_self_loop, exhaustive_from, exhaustive_to)
}
has_valid_gift_edges(hamiltonian_edges, all_possible_edges$from)
#> [1] TRUE
```

Despite its elegance, this solution does not simulate drawing names from a

hat! Because each node is visited only once, there is no backtracking, so there

there is no reciprocal gift-giving or other sub-circuits in the graph.

Whether you think this is a bad thing is a matter of preference. Personally, I

would like all remaining pairs to be equally probable at each step of the

drawing. This is not the case when backtracking is not allowed. (For example, if

I draw Marissa, then all of the remaining edges are not equally likely because

P(Marissa draws TJ | TJ draws Marissa) = 0.)

## Okay, do the hat-drawing thing already

Let’s think about what happens when I draw Marissa’s name from a nice big red

Santa hat.

- The edge from TJ to Marissa is fixed. (I drew her name.)
- All other edges from TJ become illegal. (I can’t draw any more names.)
- All other edges onto Marissa become illegal. (No one else can draw her name

either.)

To simulate a single hat-drawing, we randomly select a legal edge, fix it, and

delete all illegal edges. Let’s work through a couple of examples.

First, we need some helper functions.

```
draw_secret_santa_edge <- function(edge_df, ...) {
aes_options <- quos(...)
edge_df %>%
filter(rel != "gives-to") %>%
sample_n(1) %>%
mutate(!!! aes_options)
}
find_illegal_edges <- function(edge_df, edge, ...) {
aes_options <- quos(...)
outgoing <- edge_df %>%
filter(from %in% edge$from)
incoming <- edge_df %>%
filter(to %in% edge$to)
# The one edge that is not illegal is in both
# the incoming and outgoing sets
to_keep <- dplyr::intersect(outgoing, incoming)
outgoing %>%
bind_rows(incoming) %>%
anti_join(to_keep, by = c("from", "to")) %>%
mutate(!!! aes_options)
}
```

Here we draw a single edge (red with fat arrow). All of the other edges that

point to the same node are illegal (navy) as are all of the edges that have the

same origin as the drawn edge.

```
current_pick <- draw_secret_santa_edge(
all_possible_edges,
rel = "gives-to",
color = "#FF4136",
penwidth = 1,
arrowsize = 1)
current_illegal_edges <- all_possible_edges %>%
find_illegal_edges(current_pick, color = "#001f3f", penwidth = .5)
all_possible_edges %>%
overwrite_edges(current_pick) %>%
overwrite_edges(current_illegal_edges) %>%
create_graph(nodes, .) %>%
render_graph(title = "Selected vs. illegal")
```

We delete those illegal edges and leaving us with the following graph.

```
edges_after_pick1 <- all_possible_edges %>%
overwrite_edges(current_pick %>% mutate(arrowsize = NULL)) %>%
anti_join(current_illegal_edges, by = "id")
create_graph(nodes, edges_after_pick1) %>%
render_graph(title = "After one draw")
```

The name has been removed from the hat, and the graph is simpler now!

Let’s do it again. Draw a random legal edge (fat arrow) and identify all the

illegal paths (navy).

```
current_pick <- edges_after_pick1 %>%
draw_secret_santa_edge(
rel = "gives-to",
color = "#FF4136",
penwidth = 1,
arrowsize = 1)
current_illegal_edges <- edges_after_pick1 %>%
find_illegal_edges(edge = current_pick, color = "#001f3f", penwidth = .5)
edges_after_pick1 %>%
overwrite_edges(current_pick) %>%
overwrite_edges(current_illegal_edges) %>%
create_graph(nodes, .) %>%
render_graph(title = "Selected vs. illegal")
```

After deleting illegal edges, the problem simplifies further.

```
edges_after_pick2 <- edges_after_pick1 %>%
overwrite_edges(current_pick %>% mutate(arrowsize = NULL)) %>%
anti_join(current_illegal_edges, by = "id")
create_graph(nodes, edges_after_pick2) %>%
render_graph(title = "After two draws")
```

You can tell where this is going… Loop Town!

To finish up, we are going to repeat this process until there are only

gift-giving edges left. We will control the loop with this helper function

which tells us if there are any free edges remaining.

```
has_free_edge <- function(edge_df) {
edges_left <- edge_df %>% filter(rel != "gives-to") %>% nrow()
edges_left != 0
}
```

In the function below, the while-loop does the same steps as above: Randomly

selecting a free edge and removing illegal edges.

```
draw_edges_from_hat <- function(edge_df, ...) {
aes_options <- quos(...)
raw_edge_df <- edge_df
indices <- unique(c(raw_edge_df$from, raw_edge_df$to))
while (has_free_edge(edge_df)) {
pick <- edge_df %>%
draw_secret_santa_edge(!!! aes_options) %>%
mutate(rel = "gives-to")
illegal_edges <- edge_df %>%
find_illegal_edges(pick)
edge_df <- edge_df %>%
overwrite_edges(pick) %>%
anti_join(illegal_edges, by = "id")
}
if (!has_valid_gift_edges(edge_df, indices)) {
warning(call. = FALSE, "Invalid drawing. Trying again.")
edge_df <- Recall(raw_edge_df, !!! aes_options)
}
edge_df
}
```

After the while-loop, the function checks if it has a valid set of gift edges,

and if it doesn’t, the function calls itself again. This bit is intended to

handle more constrained situations. Such as…

## The nibling gift exchange

I lied above. Secret Santa is not so simple in family. For my generation (with

me, my siblings and our partners), there’s a rule that a player can’t draw their

partner’s name. Similarly, my nieces and nephews (and now also my child

:sparkling_heart:) have

their own gift exchange with an added constraint: A player can’t give their

sibling a gift. The elegant and simple Hamiltonian solution fails under these

constraints unless you write a special shuffling algorithm. Our hat-drawing

approach, however, handles this situation with minimal effort. Let’s work

through an example with my nieces and nephews (and :baby:!). To protect the very

young, I have replaced their names with Pokemon names.

Below we define the children and their families and do some data-wrangling so

that we have columns with the family at the start and end of each node.

```
niblings <- tibble::tribble(
~ Family, ~ Name, ~ Number,
"Water", "Squirtle", 1,
"Water", "Wartortle", 2,
"Electric", "Pikachu", 3,
"Plant", "Bulbasaur", 4,
"Plant", "Ivysaur", 5,
"Plant", "Venusaur", 6,
"Fighting", "Machamp", 7,
"Fighting", "Machoke", 8,
"Normal", "Ratata", 9,
"Normal", "Raticate", 10,
"Psychic", "Mew", 11,
"Psychic", "Mewtwo", 12)
nibling_edges <- create_all_giving_edges(
niblings$Name,
rel = "potential-gift",
penwidth = .5,
color = "#CCCCCC90") %>%
left_join(niblings, by = c("from" = "Number")) %>%
rename(from_fam = Family) %>%
select(-Name) %>%
left_join(niblings, by = c("to" = "Number")) %>%
rename(to_fam = Family) %>%
select(-Name) %>%
select(id, from, to, rel, from_fam, to_fam, everything())
nibling_edges
#> # A tibble: 132 x 8
#> id from to rel from_fam to_fam penwidth color
#>
```
#> 1 1 1 2 potential-gift Water Water 0.5 #CCCCCC90
#> 2 2 1 3 potential-gift Water Electric 0.5 #CCCCCC90
#> 3 3 1 4 potential-gift Water Plant 0.5 #CCCCCC90
#> 4 4 1 5 potential-gift Water Plant 0.5 #CCCCCC90
#> 5 5 1 6 potential-gift Water Plant 0.5 #CCCCCC90
#> 6 6 1 7 potential-gift Water Fighting 0.5 #CCCCCC90
#> 7 7 1 8 potential-gift Water Fighting 0.5 #CCCCCC90
#> 8 8 1 9 potential-gift Water Normal 0.5 #CCCCCC90
#> 9 9 1 10 potential-gift Water Normal 0.5 #CCCCCC90
#> 10 10 1 11 potential-gift Water Psychic 0.5 #CCCCCC90
#> # ... with 122 more rows

In the graph below, we draw an olive edge to connect edge pair of siblings.

These edges are illegal before we even start drawing names.

```
sibling_edges <- nibling_edges %>%
filter(from_fam == to_fam) %>%
mutate(
rel = "sibling",
color = "#3D9970",
penwidth = 1)
# Update edges that represent siblings
nibling_edges <- nibling_edges %>%
overwrite_edges(sibling_edges)
nibling_nodes <- create_node_df(
n = nrow(niblings),
type = niblings$Name,
label = niblings$Name)
nibling_edges %>%
overwrite_edges(sibling_edges) %>%
create_graph(nibling_nodes, .) %>%
render_graph(height = 400)
```

The solution for this trickier, more constrained version of the problem is to

delete the illegal edges beforehand so that they can never be drawn from the

hat. After that, the same algorithm works as before.

```
nibling_edges %>%
filter(rel != "sibling") %>%
draw_edges_from_hat(color = "#FF4136") %>%
create_graph(nibling_nodes, .) %>%
render_graph(height = 500)
#> Warning: Invalid drawing. Trying again.
#> Warning: Invalid drawing. Trying again.
```

Like usual, I’m not sure how to close one of these blog posts. I guess: When a

problem involves relations among individuals, always consider whether there is a

graph hiding in the background. Even the simple process of drawing names from a

hat for Secret Santa describes a graph: a graph of gift-giving relations among

individuals. This graph wasn’t obvious to me until I thought about Hamilitonian

path solution. *Hey, wait a minute, that’s a big gift-giving circle! It’s like
some kind of network… ooooooh.*

**leave a comment**for the author, please follow the link and comment on their blog:

**Higher Order Functions**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...