**jacobsimmering.com**, and kindly contributed to R-bloggers)

# Bayesian Search Theory

The US had a pretty big problem on their hands in 1966. Two planes had hit each

other during a in-flight refueling and crashed. Normally, this would be an

unfortunate thing and terrible for the families of those involved in the crash

but otherwise fairly limited in importance. However, in this case, the plane

being refueled was carrying four hydrogen

bombs.

Of the four bombs, only three were found on land. Two had their non-nuclear

payload explode and the third was intact and embedded in a river bank.

As a result, the fourth bomb was thought to be in the sea. A local man reported

that he saw the bomb enter the water and that was basically all of the

information that was available to find the bomb.

The Navy started a search using Bayesian search theory. Bayesian search theory

recognizes that there are two factors that contribute to the probability of

finding a lost item in a given location: the probability that the object is in

a given location and the probability of locating the object given that it is in

the searched location. When looking for objects in difficult places, such as

under water or in rough terrain, it may be very difficult to find an object

even if it exists.

Bayesian search theory starts by by expressing the probability that an object

\( O \) will be found in a location \( (x, y) \) is as by the product \( P(O) \) and

\( P(D | O) \) where \( D \) is the event of detection of the object \( O \).

This makes sense intuitively. For instance, the probability of me finding a

scarp of paper is smaller in my office which is overflowing with scraps of paper

than in my kitchen despite the fact that my office is much more likely to

contain the scrap of paper.

To use Bayesian search theory to find an object, you would calculate the

probability that the object will be found in each location on the basis of the

detection probability and the prior probability. You would then search first

in the location most likely to result in you successfully finding the object.

Assuming you don't find the object in that location, you can update the

probability of finding that object by iteratively updating the probability with

Bayes theorem.

To make this concrete, suppose we have a particle in a 2D box. Let the box be

61 units by 61 units wide and divided into 3721 “cells” of 1 square unit.

Suppose that during each time-step, the particle draws a move for each x and y

from a multivariate normal centered at zero with variance of 1 and covariate of

0. (Although the grid is divided into boxes, the particle is not limited to

discrete values). After \( t \) time steps, the pdf for the location of the

particle is simply \( N(\mathbf{\mu}, t\mathbf{\Sigma}) \) as it is the result of the

summation of a set of independent normal RV. We will transform this to the

the grid instead of using contours. If we let the particle “walk” for 100 time

steps, the resulting pmf/pdf looks like

```
d <- data.frame(x = rep(seq(-30, 30), each = 61), y = rep(seq(-30, 30), times = 61))
d$PrP <- dnorm(d$x, 0, sqrt(100)) * dnorm(d$y, 0, sqrt(100))
ggplot(d, aes(x = x, y = y, z = PrP)) + geom_point(aes(alpha = PrP)) + stat_contour()
```

As you might expect as \( N(0, 100) \) distribution is fairly flat.

Now lets come up with a model for \( P(D | O) \). In the case of the Navy's search,

this was a function of depth in that grid location. For simplicity, I'm going to

code detection probability as a function of distance from a given point. I

placed a detector station at \( (10, 10) \) and in that grid square the detector

has a 97.5% chance of detecting the particle if it is there. The detection

probability decays with distance. For this, I simply modeled as it

\( 0.975 * 0.925^d \) where \( d \) is the shortest linear distance to the detector.

The detection power then looks something like

```
detectionPower <- function(x, y, dx = 10, dy = 5, p0 = 0.975, d = 0.925) {
x2 <- x - dx
y2 <- y - dx
r <- sqrt(x2^2 + y2^2)
power <- p0 * d^r
}
d$PrD <- detectionPower(d$x, d$y)
ggplot(d, aes(x = x, y = y, z = PrD)) + geom_point(aes(alpha = PrD)) + stat_contour(binwidth = 0.1)
```

The product of these two functions is proportional to the probability that a

single search of an grid will find the particle. This is shown below

```
d$valueOfSearch <- d$PrP * d$PrD
ggplot(d, aes(x = x, y = y, z = d$valueOfSearch)) + geom_point(aes(alpha = d$valueOfSearch)) +
stat_contour()
```

Putting all three graphics together

```
nd <- data.frame(x = rep(d$x, 3), y = rep(d$y, 3), value = c(d$PrP, d$PrD, d$valueOfSearch),
metric = gl(3, nrow(d), labels = c("PDF of Object", "Detection Prob", "Value of Search")))
ggplot(nd, aes(x = x, y = y, z = value)) + stat_contour() + facet_grid(. ~ metric) +
scale_x_continuous(limits = c(-30, 30)) + scale_y_continuous(limits = c(-30,
30))
```

it is clear that the optimal areas to search are actually pulled slightly away

from the most probable areas because those areas are less likely to be fruitful

given the reduced values of \( P(D | O) \).

Suppose in a given time unit, 100 areas can be searched using the allotted

budget and the particle is no longer moving. On the first day, it would be

optimal to search the 100 most likely spots

If the object is not found in the searched spots, the probability of finding the

object in that spot, by application of Bayes Theorem, becomes

\[ P(O) = \frac{P(O)_0 (1 - P(D | O))}{(1 - P(O)_0 + P(O)_0 (1 - P(D|O)))} \]

And for the locations not searched, their probability is also revised upwards

(just like when the Monty Hall opens a door) according to

\[ P(O) = \frac{P(O)_0}{(1 - P(O)_0 + P(O)_0 (1 - P(D|O)))} \]

Suppose that we search the first 100 locations, what does the updated search

value look like?

```
bayesUpdate <- function(searched, p0, pD) {
(p0 * (1 - searched * pD))/(1 - p0 + p0 * (1 - pD))
}
d$searched <- rank(-1 * d$valueOfSearch) <= 100
d$newSearchValue <- bayesUpdate(d$searched, d$PrP, d$PrD)
nd <- data.frame(x = rep(d$x, 2), y = rep(d$y, 2), valueOfSearch = c(d$valueOfSearch,
d$newSearchValue), searched = rep(d$searched, 2), search = rep(c("Before Any Searching",
"First Wave"), each = nrow(d)))
nd$searched[nd$searched == FALSE] <- NA
ggplot(nd, aes(x = x, y = y, z = valueOfSearch)) + stat_contour() + facet_grid(. ~
search) + geom_point(aes(color = searched, alpha = valueOfSearch))
```

The searched area becomes less probable than before, however, it does not go to

zero after a single search. Additionally, points with higher prior probabilities

of having the object remain relatively likely depsite having being searched.

Furthermore, the whole distribution moved as a result of not finding the object

in the first 100 locations.

Using these new probabilities, you could update your search path and search the

next 100 locations and so on until either the object is found or the probability

of ever finding it in the search area is nearly zero.

```
searchCount <- rep(0, nrow(d))
probInSearchArea <- numeric(1000)
probFindingInGrid <- numeric(1000)
p0 <- d$PrP * d$PrD
pD <- d$PrD
for (i in 1:1000) {
searchLocations <- rank(-1 * p0) <= 100
searchCount <- searchCount + searchLocations
probInSearchArea[i] <- sum(p0[searchLocations])
probFindingInGrid[i] <- sum(p0)
p0 <- bayesUpdate(searchLocations, p0, pD)
}
nSearches <- data.frame(x = d$x, y = d$y, count = searchCount)
ggplot(nSearches, aes(x = x, y = y, z = count)) + stat_contour() + geom_point(aes(alpha = count))
```

This result makes sense. The areas most distant from the detection center

required the most searching as the areas closer can be more quickly ruled out

due to the greater values of \( P(D|O) \).

Since each grid location is 1 square unit, the sum

of the probabilities for each unit is approximately equal to the integral of

the pdf over the 31 x 31 search area. Exploiting this, we can easily calculate

the probability of finding the object with the sum of the probabilities at

each point. (This approximates the integral fairly well as the grid is 1 unit

square which is fairly small compared to the variance and total grid size).

```
searchValue <- data.frame(searchNumber = 1:1000, marginalvalue = probInSearchArea,
cumulativeValue = probFindingInGrid)
ggplot(searchValue, aes(x = searchNumber)) + geom_line(aes(y = marginalvalue)) +
geom_line(aes(y = probFindingInGrid), lty = 2) + scale_y_continuous("Probability of Finding the Object") +
scale_x_continuous("Number of Unsuccessful Searches")
```

The probability of finding the object in the next search or within the

considered grid decreases as the number of failed searches increases. This is

one of the greatest features of Bayesian search theory: by estimating the

probability of ever finding the object given the search area and previous

efforts, you can determine when searching is no longer economically (or

otherwise) viable.

Additionally, by adjusting resources according to new information, such as the

failure to find the object in a searched sector, Bayesian methods reduce the

time required to find an object, especially when the \( P(D | O) \) is highly

variable. With some assumptions in the modeling, a Bayesian search model found

the end of a random walk in fewer steps than a search model based only on the

pdf of the random walk or a model based on the product of the random walk pdf

and the detection probability. You can see the complete code for these

simulations here.

By combining the relevant information from prior searching and the likelihood

of a successful search given the object is located in the area, Bayesian search

theory provides an method for the optimal deployment of search resources. Now,

off to figure out where I put my iPad.

**leave a comment**for the author, please follow the link and comment on his blog:

**jacobsimmering.com**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...