[This article was first published on R – Statistical Odds & Ends, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

My daughter received the board game First Orchard as a Christmas present and she’s hooked on it so far. In playing the game with her, a few probability/statistics questions came to mind. This post outlines how I answered some of them using simulation in R. All code for this blog post can be found here.

(In my googling I found that Matt Lane has an excellent blog post on this game, answering some of the questions that I was interested in.)

Gameplay

Before we get to the questions, let me give a quick explanation of how the game works (see Reference 1 for a more colorful explanation as well as an applet to play the game online).

• It’s a cooperative game, with all players playing against the board.
• The game starts with 16 fruit: 4 of each color (red, green, blue, yellow), and a raven at one end of a path that is 5 steps long.
• On each player’s turn, the player rolls a 6-sided die.
• If the die comes up red, green, blue or yellow, the player gets to “harvest” a fruit of that color if there are any left to harvest. If all 4 fruits of that color have been harvested, nothing happens.
• If the die shows a fruit basket, the player gets to harvest a fruit of any color.
• If the die shows a raven, the raven moves one step along the path.
• The game ends when either all the fruit are harvested (players win) or when the raven reaches the end of the path (raven wins).

As you can see this is a really simple game (hence it’s “suitable for 2+ years” rating). The only strategy is in choosing what fruit to take if a fruit basket shows up: see Reference 1 for some simulations for different strategies. Intuitively it seems like choosing the color with the most fruit remaining is the best strategy, and that’s what I bake into my code. (Is there a proof for this, and is this still true in more general circumstances described in Reference 1?)

Code for simulating the game

The state of the game can be captured in a numeric vector of length 5. The first 4 numbers refer to the number of fruit left for each color, and the 5th number keeps track of the number of steps the raven has taken so far. I created 3 functions to simulate one game of First Orchard (see full code here):

• SimulateTurn(state, verbose) takes one dice roll and updates the state of the game. For simplicity, if a 1-4 is rolled, a fruit is harvested from that corresponding tree. If 5 is rolled, the raven takes a step. A rolled 6 is taken to mean “fruit basket”, and I remove a fruit from the tree with the most remaining fruits.
• CheckGameState(state, max_raven_steps) checks if the game has ended or not, and if so, who won.
• SimulateGame(fruit_count, max_raven_steps, verbose) runs an entire game of First Orchard: while the game has not ended, run SimulateTurn. Once the game has ended, this function returns (i) who won, (ii) the number of turns taken, (iii) the number of steps the raven took, and (iv) the number of fruit left.

We allow for two game parameters to be defined by the user: the number of fruit of each type at the start of the game (fruit_count, default is 4) and the number of steps the raven must take in order for it to win (max_raven_steps, default is 5). The verbose option for these functions so that the user can see what happened in the game. The code below is an example of the output from SimulateGame:

set.seed(1)
results <- SimulateGame(fruit_count = 2, max_raven_steps = 3,
verbose = TRUE)
# Roll: 1 , State: 1,2,2,2,0
# Roll: 4 , State: 1,2,2,1,0
# Roll: 1 , State: 0,2,2,1,0
# Roll: 2 , State: 0,1,2,1,0
# Roll: 5 , State: 0,1,2,1,1
# Roll: 3 , State: 0,1,1,1,1
# Roll: 6 , State: 0,0,1,1,1
# Roll: 2 , State: 0,0,1,1,1
# Roll: 3 , State: 0,0,0,1,1
# Roll: 3 , State: 0,0,0,1,1
# Roll: 1 , State: 0,0,0,1,1
# Roll: 5 , State: 0,0,0,1,2
# Roll: 5 , State: 0,0,0,1,3
# Raven wins
# # of turns: 13
# # of steps raven took: 3
# # fruit left: 1


Simulation time!

What is the probability of winning the game? How does that change as we vary (i) the number of fruit of each color, and (ii) the number of steps the raven must take in order for the players to lose?

We let the number of fruit of each color vary from 1 to 8, and the number of steps the raven must take from 1 to 8. For each parameter setting, we simulate the game 10,000 times and compute the player win probability. We plot the results as a heatmap.

As one might expect, win probability goes down as the number of fruit increases and as the number of steps the raven must take decreases. For the original game (4 fruit, 5 steps), the win probability is approximately 62%. Sounds reasonable: we would like a win probability >50% so that kids will not get discouraged by too much losing, but not so high that they think the game is trivial.

For the original game (4 fruit, 5 steps), what is the expected number of steps until the game ends? Does this change depending on whether the player or the raven wins?

We simulate the game 100,000 times and keep track of the number of steps taken in each game. The shortest game took 5 steps while the longest took 45 steps, with the modal number of steps being 21 (it also happens to be the mean and median). Here is the histogram for all 100,000 runs:

Here are the histograms split by the outcome:

Games where the raven wins tend to be shorter than those when players win. Maybe that’s not too surprising, since a game where the raven wins needs just 5 steps, while a game where the players win needs at least 16 steps. On average, the game takes 19 steps for raven wins and 22 steps for player wins.

For the original game (4 fruit, 5 steps), given that the raven loses, what is distribution of the number of steps the raven has taken?

Because we programmed SimulateGame to return the number of steps the raven has taken as well, we don’t have to rerun the simulations: we can just use the 100,000 simulations we ran previously and look at the ones that the raven lost. Here is the histogram of steps the raven took in losing games, with the vertical red line representing the mean:

For the original game (4 fruit, 5 steps), given that the raven wins, what is distribution of the number of unharvested fruit?

Again, we can just use the 100,000 simulations we ran previously and look at the ones that the raven won. Here is the histogram along with the mean and median marked out with vertical lines:

The modal number of fruit left in player-losing games is 1: ugh tantalizingly close!

If there was no raven, how many turns would it take to harvest all the fruit?

“No raven” is the same as saying that the raven needs to take an infinite number of steps in order to win. Hence, we can use our existing simulation code with max_raven_steps = Inf to simulate this setting.

The shortest game took 16 turns while the longest game took 63 turns, with 22 and 24 turns being the modal and mean number of turns respectively. (In theory, a game could go on forever.) Here is the histogram:

References:

1. Matt Lane. (2018). Harvesting Wins.