# Secret Santa is a graph traversal problem

November 28, 2017
By

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Last week at Thanksgiving, my family drew names from a hat for our annual game
of Secret Santa. Actually, it wasn’t a hat but you know what I mean. (Now that I
think about it, I don’t think I’ve ever seen names drawn from a literal hat
before!) In our family, the rules of Secret Santa are pretty simple:

• The players’ names are put in “a hat”.
• Players randomly draw a name from a hat, become that person’s Secret Santa,
• If a player draws their own name, they draw again.

Once again this year, somebody asked if we could just use an app or a website to
handle the drawing for Secret Santa. Or I could write a script to do it I
thought to myself. The problem nagged at the back of my mind for the past few
days. You could just shuffle the names… no, no, no. It’s trickier than that.

In this post, I describe a couple of algorithms for Secret Santa sampling using
R and directed graphs. I use the
DiagrammeR package which creates
graphs from dataframes of nodes and edges, and I liberally use
dplyr verbs to manipulate tables of
edges.

If you would like a more practical way to use R for Secret Santa, including
automating the process of drawing names and emailing players, see
this blog post.

## Making a graph, connecting nodes twice

Let’s start with a subset of my family’s game of just five players. I assign
each name a unique ID number.

library(DiagrammeR)
library(magrittr)
library(dplyr, warn.conflicts = FALSE)

players <- tibble::tribble(
~ Name, ~ Number,
"Jeremy", 1,
"TJ", 2,
"Jonathan", 3,
"Alex", 4,
"Marissa", 5)

We can think of the players as nodes in a directed graph. An edge connecting two
players indicates a “gives-to” (Secret Santa) relationship. Suppose I drew
Marissa’s name. Then the graph will have an edge connecting me to her. In the
code below, I use DiagrammeR to create a graph by combining a node dataframe
(create_node_df()) and an edge dataframe (create_edge_df()).

nodes <- create_node_df(
n = nrow(players),
type = players\$Name,
label = players\$Name)

tj_drew_marissa <- create_edge_df(
from = 2,
to = 5,
rel = "gives-to",
color = "#FF4136",
penwidth = 1)

create_graph(nodes, tj_drew_marissa) %>%
render_graph()

%0

2->5

1

Jeremy

2

TJ

3

Jonathan

4

Alex

5

Marissa

Before the game starts, anyone could draw anyone else’s name, so let’s visualize
all possible gives-to relations. We can do this by using combn(n, 2) to
generate all n-choose-2 pairs and creating two edges for each pair.

combn(players\$Name, 2)
#>      [,1]     [,2]       [,3]     [,4]      [,5]       [,6]   [,7]
#> [1,] "Jeremy" "Jeremy"   "Jeremy" "Jeremy"  "TJ"       "TJ"   "TJ"
#> [2,] "TJ"     "Jonathan" "Alex"   "Marissa" "Jonathan" "Alex" "Marissa"
#>      [,8]       [,9]       [,10]
#> [1,] "Jonathan" "Jonathan" "Alex"
#> [2,] "Alex"     "Marissa"  "Marissa"

# All the edge-manipulating functions in this post take an optional take `...`
# argument for setting the style of edges.
create_all_giving_edges <- function(xs, ...) {
aes_options <- quos(...)
pairs <- combn(seq_along(xs), 2)
# Each column from `combn()` is a pair. We make an edge moving down the column
# and another edge up the column by having each row as a `from` index and a
# `to` index.
from <- c(pairs[1, ], pairs[2, ])
to <- c(pairs[2, ], pairs[1, ])

create_edge_df(from = from, to = to) %>%
mutate(!!! aes_options) %>%
as_tibble()
}

all_possible_edges <- create_all_giving_edges(
players\$Name,
penwidth = .5,
color = "#CCCCCC90")

create_graph(nodes, all_possible_edges) %>%
render_graph()

%0

1->2

1->3

1->4

1->5

2->1

2->3

2->4

2->5

3->1

3->2

3->4

3->5

4->1

4->2

4->3

4->5

5->1

5->2

5->3

5->4

1

Jeremy

2

TJ

3

Jonathan

4

Alex

5

Marissa

### A fast, simple solution is a Hamiltonian path

Do you need to organize a gift-giving drawing for a group of people? The easiest
solution is to shuffle the names and have the first name give to the second name,
the second to the third, and so on with last name giving looping back around to
the first name. This solution is equivalent to walking through the graph and
visiting every node just once. Such a path is called a
Hamiltonian path.

Here we find a Hamiltonian path and create a helper function overwrite_edges()
to update the edges that fall on the path.

overwrite_edges <- function(old_df, new_df) {
old_df %>%
anti_join(new_df, by = c("from", "to")) %>%
bind_rows(new_df)
}

loop_from <- sample(seq_along(xs))
# last name gives to first
loop_to <- c(loop_from[-1], loop_from[1])
create_edge_df(from = loop_from, to = loop_to, ...)
}

# For reproducible blogging
set.seed(11282017)

players\$Name,
rel = "gives-to",
color = "#FF4136",
penwidth = 1
)

all_possible_edges %>%
overwrite_edges(hamiltonian_edges) %>%
create_graph(nodes, .) %>%
render_graph()

%0

1->2

1->3

1->4

1->5

2->1

2->3

2->4

2->5

3->1

3->2

3->4

3->5

4->1

4->2

4->3

4->5

5->1

5->2

5->3

5->4

1

Jeremy

2

TJ

3

Jonathan

4

Alex

5

Marissa

As promised, the red paths loop through all nodes exactly once. No one is their
own gift giver :white_check_mark:, and everyone has an incoming red path
:white_check_mark: and an outgoing red path :white_check_mark:. Very nice.
Actually… let’s put that checklist into a validation function.

# Check for valid gift-giving edges
indices <- sort(unique(indices))
pairs <- edge_df %>% filter(rel == "gives-to")
no_self_loop <- !any(pairs\$from == pairs\$to)
exhaustive_from <- isTRUE(all.equal(sort(pairs\$from), indices))
exhaustive_to <- isTRUE(all.equal(sort(pairs\$to), indices))
all(no_self_loop, exhaustive_from, exhaustive_to)
}

#> [1] TRUE

Despite its elegance, this solution does not simulate drawing names from a
hat! Because each node is visited only once, there is no backtracking, so there
there is no reciprocal gift-giving or other sub-circuits in the graph.

Whether you think this is a bad thing is a matter of preference. Personally, I
would like all remaining pairs to be equally probable at each step of the
drawing. This is not the case when backtracking is not allowed. (For example, if
I draw Marissa, then all of the remaining edges are not equally likely because
P(Marissa draws TJ | TJ draws Marissa) = 0.)

## Okay, do the hat-drawing thing already

Let’s think about what happens when I draw Marissa’s name from a nice big red
Santa hat.

• The edge from TJ to Marissa is fixed. (I drew her name.)
• All other edges from TJ become illegal. (I can’t draw any more names.)
• All other edges onto Marissa become illegal. (No one else can draw her name
either.)

To simulate a single hat-drawing, we randomly select a legal edge, fix it, and
delete all illegal edges. Let’s work through a couple of examples.

First, we need some helper functions.

draw_secret_santa_edge <- function(edge_df, ...) {
aes_options <- quos(...)

edge_df %>%
filter(rel != "gives-to") %>%
sample_n(1) %>%
mutate(!!! aes_options)
}

find_illegal_edges <- function(edge_df, edge, ...) {
aes_options <- quos(...)

outgoing <- edge_df %>%
filter(from %in% edge\$from)

incoming <- edge_df %>%
filter(to %in% edge\$to)

# The one edge that is not illegal is in both
# the incoming and outgoing sets
to_keep <- dplyr::intersect(outgoing, incoming)

outgoing %>%
bind_rows(incoming) %>%
anti_join(to_keep, by = c("from", "to")) %>%
mutate(!!! aes_options)
}

Here we draw a single edge (red with fat arrow). All of the other edges that
point to the same node are illegal (navy) as are all of the edges that have the
same origin as the drawn edge.

current_pick <- draw_secret_santa_edge(
all_possible_edges,
rel = "gives-to",
color = "#FF4136",
penwidth = 1,
arrowsize = 1)

current_illegal_edges <- all_possible_edges %>%
find_illegal_edges(current_pick, color = "#001f3f", penwidth = .5)

all_possible_edges %>%
overwrite_edges(current_pick) %>%
overwrite_edges(current_illegal_edges) %>%
create_graph(nodes, .) %>%
render_graph(title = "Selected vs. illegal")

%0 Selected vs. illegal

1->2

1->3

1->4

1->5

2->1

2->3

2->4

2->5

3->1

3->2

3->4

3->5

4->1

4->2

4->3

4->5

5->1

5->2

5->3

5->4

1

Jeremy

2

TJ

3

Jonathan

4

Alex

5

Marissa

We delete those illegal edges and leaving us with the following graph.

edges_after_pick1 <- all_possible_edges %>%
overwrite_edges(current_pick %>% mutate(arrowsize = NULL)) %>%
anti_join(current_illegal_edges, by = "id")

create_graph(nodes, edges_after_pick1) %>%
render_graph(title = "After one draw")

%0 After one draw

1->2

1->3

1->4

1->5

2->1

3->2

3->4

3->5

4->2

4->3

4->5

5->2

5->3

5->4

1

Jeremy

2

TJ

3

Jonathan

4

Alex

5

Marissa

The name has been removed from the hat, and the graph is simpler now!

Let’s do it again. Draw a random legal edge (fat arrow) and identify all the
illegal paths (navy).

current_pick <- edges_after_pick1 %>%
draw_secret_santa_edge(
rel = "gives-to",
color = "#FF4136",
penwidth = 1,
arrowsize = 1)

current_illegal_edges <- edges_after_pick1 %>%
find_illegal_edges(edge = current_pick, color = "#001f3f", penwidth = .5)

edges_after_pick1 %>%
overwrite_edges(current_pick) %>%
overwrite_edges(current_illegal_edges) %>%
create_graph(nodes, .) %>%
render_graph(title = "Selected vs. illegal")

%0 Selected vs. illegal

1->2

1->3

1->4

1->5

2->1

3->2

3->4

3->5

4->2

4->3

4->5

5->2

5->3

5->4

1

Jeremy

2

TJ

3

Jonathan

4

Alex

5

Marissa

After deleting illegal edges, the problem simplifies further.

edges_after_pick2 <- edges_after_pick1 %>%
overwrite_edges(current_pick %>% mutate(arrowsize = NULL)) %>%
anti_join(current_illegal_edges, by = "id")

create_graph(nodes, edges_after_pick2) %>%
render_graph(title = "After two draws")

%0 After two draws

1->2

1->4

1->5

2->1

3->2

3->4

3->5

4->3

5->2

5->4

1

Jeremy

2

TJ

3

Jonathan

4

Alex

5

Marissa

You can tell where this is going… Loop Town!

To finish up, we are going to repeat this process until there are only
gift-giving edges left. We will control the loop with this helper function
which tells us if there are any free edges remaining.

has_free_edge <- function(edge_df) {
edges_left <- edge_df %>% filter(rel != "gives-to") %>% nrow()
edges_left != 0
}

In the function below, the while-loop does the same steps as above: Randomly
selecting a free edge and removing illegal edges.

draw_edges_from_hat <- function(edge_df, ...) {
aes_options <- quos(...)
raw_edge_df <- edge_df
indices <- unique(c(raw_edge_df\$from, raw_edge_df\$to))

while (has_free_edge(edge_df)) {
pick <- edge_df %>%
draw_secret_santa_edge(!!! aes_options) %>%
mutate(rel = "gives-to")

illegal_edges <- edge_df %>%
find_illegal_edges(pick)

edge_df <- edge_df %>%
overwrite_edges(pick) %>%
anti_join(illegal_edges, by = "id")
}

warning(call. = FALSE, "Invalid drawing. Trying again.")
edge_df <- Recall(raw_edge_df, !!! aes_options)
}

edge_df
}

After the while-loop, the function checks if it has a valid set of gift edges,
and if it doesn’t, the function calls itself again. This bit is intended to
handle more constrained situations. Such as…

I lied above. Secret Santa is not so simple in family. For my generation (with
me, my siblings and our partners), there’s a rule that a player can’t draw their
partner’s name. Similarly, my nieces and nephews (and now also my child
:sparkling_heart:) have
their own gift exchange with an added constraint: A player can’t give their
sibling a gift. The elegant and simple Hamiltonian solution fails under these
constraints unless you write a special shuffling algorithm. Our hat-drawing
approach, however, handles this situation with minimal effort. Let’s work
through an example with my nieces and nephews (and :baby:!). To protect the very
young, I have replaced their names with Pokemon names.

Below we define the children and their families and do some data-wrangling so
that we have columns with the family at the start and end of each node.

niblings <- tibble::tribble(
~ Family, ~ Name, ~ Number,
"Water", "Squirtle", 1,
"Water", "Wartortle", 2,
"Electric", "Pikachu", 3,
"Plant", "Bulbasaur", 4,
"Plant", "Ivysaur", 5,
"Plant", "Venusaur", 6,
"Fighting", "Machamp", 7,
"Fighting", "Machoke", 8,
"Normal", "Ratata", 9,
"Normal", "Raticate", 10,
"Psychic", "Mew", 11,
"Psychic", "Mewtwo", 12)

nibling_edges <- create_all_giving_edges(
niblings\$Name,
penwidth = .5,
color = "#CCCCCC90") %>%
left_join(niblings, by = c("from" = "Number")) %>%
rename(from_fam = Family) %>%
select(-Name) %>%
left_join(niblings, by = c("to" = "Number")) %>%
rename(to_fam = Family) %>%
select(-Name) %>%
select(id, from, to, rel, from_fam, to_fam, everything())
nibling_edges
#> # A tibble: 132 x 8
#>       id  from    to            rel from_fam   to_fam penwidth     color
#>
#>  1     1     1     2 potential-gift    Water    Water      0.5 #CCCCCC90
#>  2     2     1     3 potential-gift    Water Electric      0.5 #CCCCCC90
#>  3     3     1     4 potential-gift    Water    Plant      0.5 #CCCCCC90
#>  4     4     1     5 potential-gift    Water    Plant      0.5 #CCCCCC90
#>  5     5     1     6 potential-gift    Water    Plant      0.5 #CCCCCC90
#>  6     6     1     7 potential-gift    Water Fighting      0.5 #CCCCCC90
#>  7     7     1     8 potential-gift    Water Fighting      0.5 #CCCCCC90
#>  8     8     1     9 potential-gift    Water   Normal      0.5 #CCCCCC90
#>  9     9     1    10 potential-gift    Water   Normal      0.5 #CCCCCC90
#> 10    10     1    11 potential-gift    Water  Psychic      0.5 #CCCCCC90
#> # ... with 122 more rows

In the graph below, we draw an olive edge to connect edge pair of siblings.
These edges are illegal before we even start drawing names.

sibling_edges <- nibling_edges %>%
filter(from_fam == to_fam) %>%
mutate(
rel = "sibling",
color = "#3D9970",
penwidth = 1)

# Update edges that represent siblings
nibling_edges <- nibling_edges %>%
overwrite_edges(sibling_edges)

nibling_nodes <- create_node_df(
n = nrow(niblings),
type = niblings\$Name,
label = niblings\$Name)

nibling_edges %>%
overwrite_edges(sibling_edges) %>%
create_graph(nibling_nodes, .) %>%
render_graph(height = 400)

%0

1->2

1->3

1->4

1->5

1->6

1->7

1->8

1->9

1->10

1->11

1->12

2->1

2->3

2->4

2->5

2->6

2->7

2->8

2->9

2->10

2->11

2->12

3->1

3->2

3->4

3->5

3->6

3->7

3->8

3->9

3->10

3->11

3->12

4->1

4->2

4->3

4->5

4->6

4->7

4->8

4->9

4->10

4->11

4->12

5->1

5->2

5->3

5->4

5->6

5->7

5->8

5->9

5->10

5->11

5->12

6->1

6->2

6->3

6->4

6->5

6->7

6->8

6->9

6->10

6->11

6->12

7->1

7->2

7->3

7->4

7->5

7->6

7->8

7->9

7->10

7->11

7->12

8->1

8->2

8->3

8->4

8->5

8->6

8->7

8->9

8->10

8->11

8->12

9->1

9->2

9->3

9->4

9->5

9->6

9->7

9->8

9->10

9->11

9->12

10->1

10->2

10->3

10->4

10->5

10->6

10->7

10->8

10->9

10->11

10->12

11->1

11->2

11->3

11->4

11->5

11->6

11->7

11->8

11->9

11->10

11->12

12->1

12->2

12->3

12->4

12->5

12->6

12->7

12->8

12->9

12->10

12->11

1

Squirtle

2

Wartortle

3

Pikachu

4

Bulbasaur

5

Ivysaur

6

Venusaur

7

Machamp

8

Machoke

9

Ratata

10

Raticate

11

Mew

12

Mewtwo

The solution for this trickier, more constrained version of the problem is to
delete the illegal edges beforehand so that they can never be drawn from the
hat. After that, the same algorithm works as before.

nibling_edges %>%
filter(rel != "sibling") %>%
draw_edges_from_hat(color = "#FF4136") %>%
create_graph(nibling_nodes, .) %>%
render_graph(height = 500)
#> Warning: Invalid drawing. Trying again.

#> Warning: Invalid drawing. Trying again.

%0

1->4

2->6

3->9

4->3

5->11

6->8

7->1

8->12

9->7

10->2

11->10

12->5

1

Squirtle

2

Wartortle

3

Pikachu

4

Bulbasaur

5

Ivysaur

6

Venusaur

7

Machamp

8

Machoke

9

Ratata

10

Raticate

11

Mew

12

Mewtwo

Like usual, I’m not sure how to close one of these blog posts. I guess: When a
problem involves relations among individuals, always consider whether there is a
graph hiding in the background. Even the simple process of drawing names from a
hat for Secret Santa describes a graph: a graph of gift-giving relations among
individuals. This graph wasn’t obvious to me until I thought about Hamilitonian
path solution. Hey, wait a minute, that’s a big gift-giving circle! It’s like
some kind of network… ooooooh.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.