# Forecasting By Combining Expert Opinion

January 3, 2014

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

by Michael Helbraun

Michael is a member of the Revolution Analytics Sales Support team. In the following post, he shows how to synthesize a probability distribution from the opinions of multiple experts: an excellent way to construct a Bayesian prior.

There are many ways to forecast. Depending on whether you have historical data, and whether it shows trend or seasonality, you might start with a particular technique. When good domain expertise is available, one effective method is to combine expert opinion via Monte Carlo simulation to generate a stochastic forecast. This example combines three people's views of what a number might be, but the same technique could also combine domain expertise with traditional analytic techniques such as time series, regression, or neural networks.

First we grab some estimates from our three experts:
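As a rough illustration of the input this step expects, the sketch below builds a small data frame in base R. The column names `Min`, `MostLikely`, `Max` and `Weighting` match those used in Michael's code; the `Expert` column and all of the numbers are hypothetical placeholders, not the values from the original data file.

```r
# Hypothetical expert estimates; the figures are placeholders, not the
# values from the original data file.
expertOpinion <- data.frame(
  Expert     = c("Expert1", "Expert2", "Expert3"),
  Min        = c(80, 90, 70),
  MostLikely = c(100, 120, 95),
  Max        = c(130, 160, 110),
  Weighting  = c(1, 2, 1)
)

# Each triangular distribution needs Min <= MostLikely <= Max
stopifnot(all(expertOpinion$Min <= expertOpinion$MostLikely),
          all(expertOpinion$MostLikely <= expertOpinion$Max))

# Normalise the weights so they can be used as probabilities
expertOpinion$Prob <- expertOpinion$Weighting / sum(expertOpinion$Weighting)
print(expertOpinion)
```

The weighting column lets you express how much you trust each expert; equal weights treat all opinions the same.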

Next we generate a triangular distribution from each expert's opinion; then, for each trial, we randomly select one of the experts' simulated values, weighted by how much we trust each expert:
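For readers without the triangle package, a triangular distribution can be sampled in base R by inverting its cumulative distribution function. The helper `rtri()` below is a hypothetical stand-in written for this sketch (it is not the `rtriangle()` used in Michael's code), and the parameter values are made up.

```r
# Inverse-CDF sampler for a triangular distribution.
# a = minimum, b = maximum, c = most likely value (the mode).
# rtri() is a hypothetical helper for illustration only.
rtri <- function(n, a, b, c) {
  stopifnot(a <= c, c <= b, a < b)
  u  <- runif(n)
  Fc <- (c - a) / (b - a)          # CDF evaluated at the mode
  ifelse(u < Fc,
         a + sqrt(u * (b - a) * (c - a)),        # left of the mode
         b - sqrt((1 - u) * (b - a) * (b - c)))  # right of the mode
}

set.seed(42)
draws <- rtri(10000, a = 80, b = 130, c = 100)   # made-up parameters
range(draws)   # every draw lies between a and b
mean(draws)    # should be close to the theoretical mean (a + b + c) / 3
```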

The end result – a nicely merged stochastic estimate:
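The merging step can be sketched in base R as a weighted mixture: for each trial, one expert's simulated value is kept, with the chance of choosing an expert proportional to that expert's weighting. This is a simplified stand-in for the `DiscreteDistribution()` approach in Michael's code, and the helper `rtri()`, the weights and the distribution parameters are all hypothetical.

```r
set.seed(1)
trials  <- 1000
weights <- c(1, 2, 1)   # hypothetical expert weightings

# Inverse-CDF triangular sampler (a = min, b = max, c = most likely);
# a hypothetical helper, not the rtriangle() from the triangle package
rtri <- function(n, a, b, c) {
  u  <- runif(n)
  Fc <- (c - a) / (b - a)
  ifelse(u < Fc,
         a + sqrt(u * (b - a) * (c - a)),
         b - sqrt((1 - u) * (b - a) * (b - c)))
}

# One column of simulated values per expert (made-up parameters)
sims <- cbind(Expert1 = rtri(trials, 80, 130, 100),
              Expert2 = rtri(trials, 90, 160, 120),
              Expert3 = rtri(trials, 70, 110, 95))

# For every trial, choose which expert's draw to keep
pick   <- sample(ncol(sims), size = trials, replace = TRUE,
                 prob = weights / sum(weights))
merged <- sims[cbind(seq_len(trials), pick)]

# Summarise the merged forecast as a median with a 90% interval
quantile(merged, c(0.05, 0.5, 0.95))
```

Reporting quantiles rather than a single number keeps the uncertainty in the experts' opinions visible in the final forecast.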

Michael's code (below) uses Revolution's RevoScaleR library. Notice that the rxSetComputeContext() function instructs the computer to set up for parallel computation using the resources on the local machine, and that the rxExec() call inside orderedTri() executes the rtriangle() function in parallel. By changing only the compute context, this same code could run in parallel using all of the resources of an LSF or Hadoop cluster.
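If RevoScaleR is not available, a loosely similar pattern can be sketched with the `parallel` package that ships with R. This is a swapped-in technique, not the RevoScaleR API: `mclapply()` plays the role of `rxExec()`, and the number of cores plays the role of the compute context. The helper `rtri()` and the expert parameters are hypothetical.

```r
library(parallel)

# Inverse-CDF triangular sampler (hypothetical helper, not from the post)
rtri <- function(n, a, b, c) {
  u  <- runif(n)
  Fc <- (c - a) / (b - a)
  ifelse(u < Fc,
         a + sqrt(u * (b - a) * (c - a)),
         b - sqrt((1 - u) * (b - a) * (b - c)))
}

# Hypothetical expert parameters, each given as: min, most likely, max
experts <- list(c(80, 100, 130), c(90, 120, 160), c(70, 95, 110))

# Forking is unavailable on Windows, so fall back to a single core there
cores <- if (.Platform$OS.type == "windows") 1L else 2L

RNGkind("L'Ecuyer-CMRG")   # RNG designed to give workers separate streams
set.seed(2014)

# Simulate each expert's distribution in a separate task
sims <- mclapply(experts,
                 function(p) rtri(1000, a = p[1], b = p[3], c = p[2]),
                 mc.cores = cores)
sapply(sims, length)
```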

```
###############################################################################
##                                                                           ##
##   Revolution R Enterprise - MCS Forecasting, combining expert opinion     ##
##                                                                           ##
###############################################################################

# Clear out memory for a fresh run and load required packages
rm(list = ls())
library(triangle); library(distr); library(ggplot2)

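# inDataFile must hold the path to the Expert Estimates data file;
# it is not defined in the code as published
expertOpinion <- rxImport(inData = inDataFile)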

View(expertOpinion)

# Set simulation parameters
trials <- 1000
rxOptions(numCoresToUse = -1)
rxSetComputeContext("localpar")

# create individual triangular distributions
orderedTri <- function(expertNum, trials) {
  # rxExec returns a list holding one vector of `trials` draws
  revoFcast <- rxExec(FUN = rtriangle,
                      timesToRun = 1, n = trials,
                      a = expertOpinion$Min[expertNum],
                      b = expertOpinion$Max[expertNum],
                      c = expertOpinion$MostLikely[expertNum])
  return(revoFcast)
}

# create distribution for each of our experts
# start from an empty list and append one set of draws per expert
revoFcast <- list()
for (i in seq_len(nrow(expertOpinion))) {
  revoFcast <- c(revoFcast, orderedTri(i, trials))
}

# prepare the results
revoFcast <- data.frame(revoFcast)
names(revoFcast) <- paste("Expert", 1:nrow(expertOpinion), sep="")

# check that the experts' simulated draws are uncorrelated
cor(revoFcast)

# create a combined probability distribution and select a forecast value from the prob weighted dist
combinedDist <- function(trialNum) {
  cDist <- DiscreteDistribution(supp = as.double(revoFcast[trialNum, ]),
                                prob = expertOpinion$Weighting / sum(expertOpinion$Weighting))
  rD <- r(cDist)   # function that generates values from the dist
  return(rD(1))    # generate/select 1 value
}

merged <- rxExec(FUN = combinedDist, trialNum = rxElemArg(c(1:trials)),
                 execObjects = c("revoFcast", "expertOpinion"),
                 packagesToLoad = "distr")  # assumed final argument: workers need distr for DiscreteDistribution()

# add the forecast to our working data set
revoFcast$merged <- unlist(merged)  # rxExec returns a list with one value per trial

# chart the output
View(revoFcast)	# Look at our combined data set

# restructure the data for plotting
histVals <- data.frame(
  Value  = c(revoFcast$Expert1, revoFcast$Expert2, revoFcast$Expert3, revoFcast$merged),
  Source = rep(c("Expert1", "Expert2", "Expert3", "Merged Opinion"), each = trials))

# draw our combined plot
ggplot(histVals, aes(Value, fill = Source)) +
  geom_density(alpha = 0.25) +
  ggtitle("Combined Expert Opinion")
```


Download Expert Estimates, the small data file used to drive Michael's simulation.
