(This article was first published on **Revolutions**, and kindly contributed to R-bloggers)

by Michael Helbraun

*Michael is a member of the Revolution Analytics Sales Support team. In the following post, he shows how to synthesize a probability distribution from the opinions of multiple experts: an excellent way to construct a Bayesian prior.*

There are many ways to forecast. Depending on whether you have historical data, trend, or seasonality, you might start with a particular technique. Given good domain expertise, one effective method is to combine expert opinion via Monte Carlo simulation to generate a stochastic forecast. While this example combines three people’s views of what the number might be, the same technique could also combine domain expertise with traditional analytic techniques such as time series, regression, and neural networks.

First we grab some estimates from our three experts:
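The table of estimates from the original post is not reproduced here. Judging by the column names the code reads (`Min`, `MostLikely`, `Max`, `Weighting`), the input has one row per expert. A minimal sketch of that shape follows; the numbers are made up for illustration and are not the author's data:

```r
# Hypothetical expert estimates: only the column names come from the post's code,
# the values are invented for illustration.
expertOpinion <- data.frame(
  Min        = c(80, 90, 100),
  MostLikely = c(100, 120, 110),
  Max        = c(130, 160, 150),
  Weighting  = c(2, 1, 1)
)

# Each triangle needs Min <= MostLikely <= Max, and weights must be positive
stopifnot(all(expertOpinion$Min <= expertOpinion$MostLikely),
          all(expertOpinion$MostLikely <= expertOpinion$Max),
          all(expertOpinion$Weighting > 0))

# Normalised weights, in the form the simulation uses them
normWeights <- expertOpinion$Weighting / sum(expertOpinion$Weighting)
normWeights
```

With these made-up weights of 2, 1 and 1, the first expert's view is selected half of the time and each of the others a quarter of the time.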

Next we generate a triangular distribution from each expert’s opinion; then, for each trial, we randomly select one of the experts’ values, with probability proportional to that expert’s weighting:
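For readers without RevoScaleR, the two steps can be sketched in base R. This is an illustrative stand-in, not the author's code: `rtri` is a hypothetical helper that samples a triangular distribution by inverting its CDF (in place of `triangle::rtriangle`), `sample()` replaces the `distr` discrete distribution, and the expert values are made up.

```r
set.seed(42)  # reproducible draws

# Hypothetical estimates (invented values; column names follow the post's code)
expertOpinion <- data.frame(
  Min        = c(80, 90, 100),
  MostLikely = c(100, 120, 110),
  Max        = c(130, 160, 150),
  Weighting  = c(2, 1, 1)
)
trials <- 1000

# Draw n values from a triangular distribution by inverting its CDF
# (a = minimum, b = maximum, c = mode; requires a < b and a <= c <= b)
rtri <- function(n, a, b, c) {
  u  <- runif(n)
  fc <- (c - a) / (b - a)  # CDF value at the mode
  ifelse(u < fc,
         a + sqrt(u * (b - a) * (c - a)),
         b - sqrt((1 - u) * (b - a) * (b - c)))
}

# One column of draws per expert: a trials x 3 matrix
draws <- mapply(rtri, n = trials,
                a = expertOpinion$Min,
                b = expertOpinion$Max,
                c = expertOpinion$MostLikely)
colnames(draws) <- paste0("Expert", seq_len(ncol(draws)))

# For each trial, pick one expert with probability proportional to the
# weights and keep that expert's draw for the trial
w      <- expertOpinion$Weighting / sum(expertOpinion$Weighting)
pick   <- sample(ncol(draws), size = trials, replace = TRUE, prob = w)
merged <- draws[cbind(seq_len(trials), pick)]

summary(merged)
```

Indexing `draws` with a two-column matrix of (row, column) pairs selects one value per trial without an explicit loop.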

The end result – a nicely merged stochastic estimate:
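In distributional terms, selecting one expert's draw per trial with probability proportional to the weights amounts to sampling from a mixture of the experts' triangular distributions. As a sketch, write $w_i$ for the raw weight of expert $i$ and $f_i$ for the triangular density with minimum $a_i$, mode $c_i$ and maximum $b_i$ (this notation is introduced here and does not appear in the original post):

$$
f_{\text{merged}}(x) = \sum_{i=1}^{3} \frac{w_i}{\sum_{j} w_j}\, f_i(x),
\qquad
\mathbb{E}[X_{\text{merged}}] = \sum_{i=1}^{3} \frac{w_i}{\sum_{j} w_j} \cdot \frac{a_i + b_i + c_i}{3}.
$$

The second expression uses the fact that a triangular distribution has mean $(a + b + c)/3$, so the mean of the merged forecast is the weighted average of the experts' means.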

*Michael's code (below) uses Revolution's RevoScaleR library. Notice that the rxSetComputeContext() function instructs the computer to set up for parallel computation using the resources of the local machine, and the rxExec() function executes the rtriangle() function in parallel. By just changing the compute context, this same code could run in parallel using all of the resources of an LSF or Hadoop cluster.*
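RevoScaleR is a commercial library, so as a rough open-source analogue of the same idea, the sketch below farms out the triangular draws with `mclapply()` from base R's `parallel` package. This is a swapped-in technique rather than the `rxExec()` API; the parameter values and the `rtri` helper are hypothetical.

```r
library(parallel)

# Reproducible, independent random streams for the worker processes
RNGkind("L'Ecuyer-CMRG")
set.seed(2013)

# Forking is unavailable on Windows, so fall back to a single core there
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Hypothetical triangle parameters for three experts (invented values)
params <- list(c(a = 80,  b = 130, c = 100),
               c(a = 90,  b = 160, c = 120),
               c(a = 100, b = 150, c = 110))

# Inverse-CDF triangular sampler (base R stand-in for triangle::rtriangle)
rtri <- function(n, a, b, c) {
  u <- runif(n)
  ifelse(u < (c - a) / (b - a),
         a + sqrt(u * (b - a) * (c - a)),
         b - sqrt((1 - u) * (b - a) * (b - c)))
}

# One task per expert, spread across the available cores
draws <- mclapply(params,
                  function(p) rtri(1000, p[["a"]], p[["b"]], p[["c"]]),
                  mc.cores = cores)

lengths(draws)  # number of draws returned for each expert
```

Unlike a RevoScaleR compute context, this approach is tied to the local machine; moving to a cluster would mean switching to a different backend.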

    ###############################################################################
    ##                                                                           ##
    ##  Revolution R Enterprise - MCS Forecasting, combining expert opinion     ##
    ##                                                                           ##
    ###############################################################################

    # Clear out memory for a fresh run and load required packages
    # (the rx* functions come from RevoScaleR, which Revolution R Enterprise
    # loads by default)
    rm(list = ls())
    library(triangle); library(distr); library(ggplot2)

    # Read input parameters
    bigDataDir <- "C:/Data/Demos/Datasets"
    bigDataDir <- "C:/..."
    inDataFile <- file.path(bigDataDir, "Expert Estimates.csv")
    expertOpinion <- rxImport(inData = inDataFile)
    View(expertOpinion)

    # Set simulation parameters
    trials <- 1000
    rxOptions(numCoresToUse = -1)
    rxSetComputeContext("localpar")

    # Create the triangular distribution for a single expert
    orderedTri <- function(expertNum, trials) {
      revoFcast <- rxExec(FUN = rtriangle, timesToRun = 1, n = trials,
                          a = expertOpinion$Min[expertNum],
                          b = expertOpinion$Max[expertNum],
                          c = expertOpinion$MostLikely[expertNum],
                          packagesToLoad = "triangle")
      return(revoFcast)
    }

    # Create a distribution for each of our experts,
    # appending each result to an initially empty list
    revoFcast <- list()
    for (i in seq_len(nrow(expertOpinion))) {
      revoFcast <- c(revoFcast, orderedTri(i, trials))
    }

    # Prepare the results: one column of draws per expert
    revoFcast <- data.frame(revoFcast)
    names(revoFcast) <- paste("Expert", 1:nrow(expertOpinion), sep = "")

    # Check that the results are uncorrelated
    cor(revoFcast)

    # Create a combined probability distribution and select a forecast value
    # from the probability-weighted distribution
    combinedDist <- function(trialNum) {
      cDist <- DiscreteDistribution(supp = as.double(revoFcast[trialNum, ]),
                                    prob = expertOpinion$Weighting / sum(expertOpinion$Weighting))
      rD <- r(cDist)   # function that generates values from the distribution
      return(rD(1))    # generate/select 1 value
    }
    merged <- rxExec(FUN = combinedDist, trialNum = rxElemArg(seq_len(trials)),
                     execObjects = c("revoFcast", "expertOpinion"),
                     packagesToLoad = "distr")

    # Add the forecast to our working data set
    # (rxExec returns a list with one value per trial; flatten it to a vector)
    revoFcast$merged <- unlist(merged, use.names = FALSE)

    # Look at our combined data set
    View(revoFcast)

    # Restructure the data for plotting
    histVals <- data.frame(Value = c(revoFcast$Expert1, revoFcast$Expert2,
                                     revoFcast$Expert3, revoFcast$merged),
                           Source = rep(c("Expert1", "Expert2", "Expert3", "Merged Opinion"),
                                        each = trials))

    # Draw our combined plot
    ggplot(histVals, aes(Value, fill = Source)) +
      geom_density(alpha = 0.25) +
      ggtitle("Combined Expert Opinion")

*Download Expert Estimates, the small data file used to drive Michael's simulation.*
