Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

###### The Situation

You are a consultant who has been hired by a business that sells one commodity product. On December 31st the price is $100 per unit. The business owner wants to know what to expect by the end of January. Your client gave you the message: • Prices are based off the the sales the previous day • Roughly 95% of the time, the price will be +/-$10 compared to the day before

With only a few minutes to make the call, how would you decide on what to expect for the end of January?

Simple:
Expected Value = $100 Wouldn’t that make sense? You could give yourself a pat on the back and walk away. However, because you want to be hired again, you want to provide your client options and a range of what to expect. The inner nerd in you screams out, “I want to simulate price changes for the month!” You make the rather large assumption that the daily changes are normally distributed. So you write some code based off of your assumptions and create a basic simulation based off of Brownian Motion. set.seed(5) initialPrice = 100 dailyPlusMinus = 10 dailyDeviation = dailyPlusMinus/2 # assumes 2*sigma roughly approximates 95% range on normal dist. #Note: This isn't the proper way to calculate a standard deviation. We really have no information on how the +/- was calculated, so it's the best you can do. n = 31 #number of days in January N = 500 #number of simulations prices = matrix(ncol=N,nrow=n) for(i in 1:N){ prices[,i] = initialPrice + cumsum(rnorm(n = n,mean = 0,sd = dailyDeviation)) } steps = 1:nrow(prices) yLimits = c(initialPrice-dailyDeviation*n/1.5, initialPrice+dailyDeviation*n/1.5) plot(steps,prices[,1],type='l', ylim=yLimits,xlab='Days', ylab='Daily Price ($)',
main='Simulation of Daily Prices for the Month')
for(i in 2:ncol(prices)){
lines(prices[,i])
}


After plotting the data, you quickly notice that you still don’t quite have the answer you’re looking for, but you do have a great visualization.

You quickly plot a histogram of the simulated end of month prices.

endOfMonthPrices = prices[n,]
hist(endOfMonthPrices,main='Histogram End of Month Price',
xlab='Price ($)',30)  You also visualize your data by looking at a CDF plot. plot(ecdf(endOfMonthPrices), main='CDF of End of Month Price', xlab='Price ($)')


Eyeballing the charts, it appears as if the majority of your data suggests the price will wind up between $50 and$150. You finally decide to pull the hard numbers.

##  1%  5% 10% 25% 33% 50% 66% 75% 90% 95% 99%
##  41  57  66  82  88  99 111 119 133 141 162

decis = c(0.01,0.05,0.10,0.25,0.33,
0.5,0.66,0.75,0.90,0.95,0.99)
deciPrices = round(quantile(endOfMonthPrices,
decis),0)
print(deciPrices)


You check to see if the results make sense:
YES – The 50% decile value is right around $100 YES – The deciles surrounding 50% are roughly equal (i.e. 1% =$41 and 99% = $162 are each roughly$60 away from $99) You understand what you have here, but it may not be as intuitive for your client. Someone could look at it and say, “there’s a 1% chance it will be$41 and a 99% chance it will be $162” which is incorrect. With your level of experience, you understand it’s wise to say: • The most likely price is going to be near$100
• The end of month price almost certainly going to fall between $160 and$40
• Confidence intervals are available upon request

The messaging and communication will 100% be based upon the client. In this case, it’s assuming someone doesn’t have any sort of stats background and just wants a high-level view.

Obviously, the majority of this analysis was completely unnecessary. However, that wouldn’t have been as much fun and you wouldn’t have learned as much! It’s nice to have something like this in your back pocket when you begin to analyze an increasing the number of variables, variance, and outcomes.

Code used in this post is on my GitHub