Bayesian Search Models

March 13, 2014
By

(This article was first published on jacobsimmering.com, and kindly contributed to R-bloggers)

Bayesian Search Theory

The US had a pretty big problem on their hands in 1966. Two planes had hit each
other during a in-flight refueling and crashed. Normally, this would be an
unfortunate thing and terrible for the families of those involved in the crash
but otherwise fairly limited in importance. However, in this case, the plane
being refueled was carrying four hydrogen
bombs
.

Of the four bombs, only three were found on land. Two had their non-nuclear
payload explode and the third was intact and embedded in a river bank.
As a result, the fourth bomb was thought to be in the sea. A local man reported
that he saw the bomb enter the water and that was basically all of the
information that was available to find the bomb.

The Navy started a search using Bayesian search theory. Bayesian search theory
recognizes that there are two factors that contribute to the probability of
finding a lost item in a given location: the probability that the object is in
a given location and the probability of locating the object given that it is in
the searched location. When looking for objects in difficult places, such as
under water or in rough terrain, it may be very difficult to find an object
even if it exists.

Bayesian search theory starts by by expressing the probability that an object
$$O$$ will be found in a location $$(x, y)$$ is as by the product $$P(O)$$ and
$$P(D | O)$$ where $$D$$ is the event of detection of the object $$O$$.

This makes sense intuitively. For instance, the probability of me finding a
scarp of paper is smaller in my office which is overflowing with scraps of paper
than in my kitchen despite the fact that my office is much more likely to
contain the scrap of paper.

To use Bayesian search theory to find an object, you would calculate the
probability that the object will be found in each location on the basis of the
detection probability and the prior probability. You would then search first
in the location most likely to result in you successfully finding the object.
Assuming you don't find the object in that location, you can update the
probability of finding that object by iteratively updating the probability with
Bayes theorem.

To make this concrete, suppose we have a particle in a 2D box. Let the box be
61 units by 61 units wide and divided into 3721 “cells” of 1 square unit.
Suppose that during each time-step, the particle draws a move for each x and y
from a multivariate normal centered at zero with variance of 1 and covariate of
0. (Although the grid is divided into boxes, the particle is not limited to
discrete values). After $$t$$ time steps, the pdf for the location of the
particle is simply $$N(\mathbf{\mu}, t\mathbf{\Sigma})$$ as it is the result of the
summation of a set of independent normal RV. We will transform this to the
the grid instead of using contours. If we let the particle “walk” for 100 time
steps, the resulting pmf/pdf looks like

d <- data.frame(x = rep(seq(-30, 30), each = 61), y = rep(seq(-30, 30), times = 61))
d$PrP <- dnorm(d$x, 0, sqrt(100)) * dnorm(d$y, 0, sqrt(100)) ggplot(d, aes(x = x, y = y, z = PrP)) + geom_point(aes(alpha = PrP)) + stat_contour()  As you might expect as $$N(0, 100)$$ distribution is fairly flat. Now lets come up with a model for $$P(D | O)$$. In the case of the Navy's search, this was a function of depth in that grid location. For simplicity, I'm going to code detection probability as a function of distance from a given point. I placed a detector station at $$(10, 10)$$ and in that grid square the detector has a 97.5% chance of detecting the particle if it is there. The detection probability decays with distance. For this, I simply modeled as it $$0.975 * 0.925^d$$ where $$d$$ is the shortest linear distance to the detector. The detection power then looks something like detectionPower <- function(x, y, dx = 10, dy = 5, p0 = 0.975, d = 0.925) { x2 <- x - dx y2 <- y - dx r <- sqrt(x2^2 + y2^2) power <- p0 * d^r } d$PrD <- detectionPower(d$x, d$y)
ggplot(d, aes(x = x, y = y, z = PrD)) + geom_point(aes(alpha = PrD)) + stat_contour(binwidth = 0.1)


The product of these two functions is proportional to the probability that a
single search of an grid will find the particle. This is shown below

d$valueOfSearch <- d$PrP * d$PrD ggplot(d, aes(x = x, y = y, z = d$valueOfSearch)) + geom_point(aes(alpha = d$valueOfSearch)) + stat_contour()  Putting all three graphics together nd <- data.frame(x = rep(d$x, 3), y = rep(d$y, 3), value = c(d$PrP, d$PrD, d$valueOfSearch),
metric = gl(3, nrow(d), labels = c("PDF of Object", "Detection Prob", "Value of Search")))
ggplot(nd, aes(x = x, y = y, z = value)) + stat_contour() + facet_grid(. ~ metric) +
scale_x_continuous(limits = c(-30, 30)) + scale_y_continuous(limits = c(-30,
30))


it is clear that the optimal areas to search are actually pulled slightly away
from the most probable areas because those areas are less likely to be fruitful
given the reduced values of $$P(D | O)$$.

Suppose in a given time unit, 100 areas can be searched using the allotted
budget and the particle is no longer moving. On the first day, it would be
optimal to search the 100 most likely spots

If the object is not found in the searched spots, the probability of finding the
object in that spot, by application of Bayes Theorem, becomes

$P(O) = \frac{P(O)_0 (1 - P(D | O))}{(1 - P(O)_0 + P(O)_0 (1 - P(D|O)))}$

And for the locations not searched, their probability is also revised upwards
(just like when the Monty Hall opens a door) according to

$P(O) = \frac{P(O)_0}{(1 - P(O)_0 + P(O)_0 (1 - P(D|O)))}$

Suppose that we search the first 100 locations, what does the updated search
value look like?

bayesUpdate <- function(searched, p0, pD) {
(p0 * (1 - searched * pD))/(1 - p0 + p0 * (1 - pD))
}
d$searched <- rank(-1 * d$valueOfSearch) <= 100
d$newSearchValue <- bayesUpdate(d$searched, d$PrP, d$PrD)
nd <- data.frame(x = rep(d$x, 2), y = rep(d$y, 2), valueOfSearch = c(d$valueOfSearch, d$newSearchValue), searched = rep(d$searched, 2), search = rep(c("Before Any Searching", "First Wave"), each = nrow(d))) nd$searched[nd$searched == FALSE] <- NA ggplot(nd, aes(x = x, y = y, z = valueOfSearch)) + stat_contour() + facet_grid(. ~ search) + geom_point(aes(color = searched, alpha = valueOfSearch))  The searched area becomes less probable than before, however, it does not go to zero after a single search. Additionally, points with higher prior probabilities of having the object remain relatively likely depsite having being searched. Furthermore, the whole distribution moved as a result of not finding the object in the first 100 locations. Using these new probabilities, you could update your search path and search the next 100 locations and so on until either the object is found or the probability of ever finding it in the search area is nearly zero. searchCount <- rep(0, nrow(d)) probInSearchArea <- numeric(1000) probFindingInGrid <- numeric(1000) p0 <- d$PrP * d$PrD pD <- d$PrD
for (i in 1:1000) {
searchLocations <- rank(-1 * p0) <= 100
searchCount <- searchCount + searchLocations
probInSearchArea[i] <- sum(p0[searchLocations])
probFindingInGrid[i] <- sum(p0)
p0 <- bayesUpdate(searchLocations, p0, pD)
}
nSearches <- data.frame(x = d$x, y = d$y, count = searchCount)
ggplot(nSearches, aes(x = x, y = y, z = count)) + stat_contour() + geom_point(aes(alpha = count))


This result makes sense. The areas most distant from the detection center
required the most searching as the areas closer can be more quickly ruled out
due to the greater values of $$P(D|O)$$.

Since each grid location is 1 square unit, the sum
of the probabilities for each unit is approximately equal to the integral of
the pdf over the 31 x 31 search area. Exploiting this, we can easily calculate
the probability of finding the object with the sum of the probabilities at
each point. (This approximates the integral fairly well as the grid is 1 unit
square which is fairly small compared to the variance and total grid size).

searchValue <- data.frame(searchNumber = 1:1000, marginalvalue = probInSearchArea,
cumulativeValue = probFindingInGrid)
ggplot(searchValue, aes(x = searchNumber)) + geom_line(aes(y = marginalvalue)) +
geom_line(aes(y = probFindingInGrid), lty = 2) + scale_y_continuous("Probability of Finding the Object") +
scale_x_continuous("Number of Unsuccessful Searches")


The probability of finding the object in the next search or within the
considered grid decreases as the number of failed searches increases. This is
one of the greatest features of Bayesian search theory: by estimating the
probability of ever finding the object given the search area and previous
efforts, you can determine when searching is no longer economically (or
otherwise) viable.

failure to find the object in a searched sector, Bayesian methods reduce the
time required to find an object, especially when the $$P(D | O)$$ is highly
variable. With some assumptions in the modeling, a Bayesian search model found
the end of a random walk in fewer steps than a search model based only on the
pdf of the random walk or a model based on the product of the random walk pdf
and the detection probability. You can see the complete code for these
simulations here.

By combining the relevant information from prior searching and the likelihood
of a successful search given the object is located in the area, Bayesian search
theory provides an method for the optimal deployment of search resources. Now,
off to figure out where I put my iPad.