The following R code models a censored dependent variable (in this case academic aptitude) using traditional least squares, tobit, and Bayesian approaches. As depicted below, the OLS estimates (blue) for censored data are inconsistent and will ...
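As a rough illustration of why OLS and tobit disagree on censored data, here is a minimal base-R sketch. It does not use the post's aptitude data: the simulated sample, the true slope of 2, and the censoring point `upper` are all assumptions made for the example, and the tobit fit is a hand-written likelihood maximized with `optim()` rather than whatever package the original post used.

```r
# Simulated data: every number here is an assumption, not the post's data.
set.seed(42)
n <- 2000
x <- rnorm(n)
y_star <- 1 + 2 * x + rnorm(n)   # latent outcome, true slope = 2
upper <- 2                       # hypothetical right-censoring point
y <- pmin(y_star, upper)         # observed outcome, capped at `upper`
cens <- y >= upper               # TRUE where the observation is censored

# OLS fitted directly to the censored outcome
ols <- lm(y ~ x)

# Negative log-likelihood of a tobit model with right-censoring at `upper`:
# density for uncensored points, upper-tail probability for censored ones.
tobit_nll <- function(par) {
  mu <- par[1] + par[2] * x
  sigma <- exp(par[3])           # log-parameterized to keep sigma positive
  ll_obs <- dnorm(y[!cens], mu[!cens], sigma, log = TRUE)
  ll_cens <- pnorm(upper, mu[cens], sigma, lower.tail = FALSE, log.p = TRUE)
  -(sum(ll_obs) + sum(ll_cens))
}

# Start from the OLS coefficients and log(sigma) = 0
fit <- optim(c(coef(ols), 0), tobit_nll, method = "BFGS")

ols_slope <- unname(coef(ols)[2])
tobit_slope <- unname(fit$par[2])
print(c(ols = ols_slope, tobit = tobit_slope))
```

In this setup roughly a third of the observations are censored, so the OLS slope is pulled toward zero while the tobit estimate should land close to the simulated value of 2.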

posterior = (likelihood × prior) / integrated likelihood

The combination of a prior distribution and a likelihood function is utilized to produce a posterior distribution. Incorporating information from both the prior distribution and the likelihood function leads to a reduction in variance and an improved estimator. As n→...
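The formula can be checked numerically with a small grid approximation. The sketch below is only an illustration under assumed inputs: a Beta(2, 2) prior on a proportion `theta` and hypothetical data of 7 successes in 10 trials, neither of which comes from the original post.

```r
# Grid of candidate parameter values (midpoints of 1000 equal-width bins)
width <- 0.001
theta <- seq(0.0005, 0.9995, by = width)

prior <- dbeta(theta, 2, 2)      # assumed prior: Beta(2, 2)
lik <- dbinom(7, 10, theta)      # assumed data: 7 successes in 10 trials

# Integrated likelihood: the normalizing constant in the denominator
integrated <- sum(lik * prior) * width

# posterior = (likelihood x prior) / integrated likelihood
posterior <- lik * prior / integrated

post_mean <- sum(theta * posterior) * width
post_var <- sum((theta - post_mean)^2 * posterior) * width
prior_var <- (2 * 2) / ((2 + 2)^2 * (2 + 2 + 1))   # variance of Beta(2, 2)

print(c(post_mean = post_mean, post_var = post_var, prior_var = prior_var))
```

Because the Beta prior is conjugate to the binomial likelihood, the exact posterior here is Beta(9, 5), so the grid result can be compared against its mean of 9/14. The posterior variance comes out smaller than the prior variance, which is the variance reduction the paragraph describes.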

MapReduce is a powerful programming framework for efficiently processing very large amounts of data stored in the Hadoop distributed filesystem. But while several programming frameworks for Hadoop exist, few are tuned to the needs of data analysts who typically work in the R environment as opposed to general-purpose languages like Java. That's why the dev team at Revolution Analytics...
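For readers unfamiliar with the programming model itself, the map, shuffle, and reduce steps can be mimicked in a few lines of base R. This is a single-machine toy on made-up input: it does not touch Hadoop and is not the API of the package the post goes on to describe.

```r
# Toy input: three "documents" (invented for illustration)
docs <- c("the quick brown fox", "the lazy dog", "the fox")

# Map: each document emits its words (the keys); each word carries a count of 1
mapped <- lapply(docs, function(d) strsplit(d, " ", fixed = TRUE)[[1]])

# Shuffle: gather the emitted 1s by key
keys <- unlist(mapped)
grouped <- split(rep(1L, length(keys)), keys)

# Reduce: sum the values collected for each key
counts <- vapply(grouped, sum, integer(1))
print(counts)
```

In a real MapReduce job the map and reduce steps run in parallel across the cluster and the framework handles the shuffle; only the shape of the computation carries over from this sketch.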

At last month's useR! 2011 conference at Warwick University, there were two talks on the RevoScaleR package for big data statistics in R. The first was a keynote presentation from Revolution Analytics' Chief Scientist, Lee Edlefsen. Here is the overview of his talk, Scalable Data Analysis in R: For the past several decades the rising tide of technology --...

Salesforce.com has become one of the most successful cloud applications. I am quite astounded by its mega-hit penetration into a myriad of industries. Leading organizations use it not only to implement their customer relationship management systems but also to develop their own applications running on the cloud. But complete absence of meaningful analytical

Today we wish to see how our model would have fared forecasting the past 20 values of GDP. Why? Well, ask yourself this: how can you know where you're going if you don't know where you've been? Once you understand, please proceed with the following post. First recall the trend portion that we have already accounted for:
> t=(1:258)
> t2=t^2
> trendy= 892.656210 +...
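The idea of forecasting the past 20 values can be sketched as a simple holdout test. The example below covers only a quadratic trend in `t` and `t2` and uses a simulated series as a stand-in; the coefficients, the noise level, and the names `gdp_sim`, `train`, and `holdout` are assumptions, and the original post's full model is not reproduced here.

```r
set.seed(1)
n <- 258                         # series length, matching t = 1:258 above
h <- 20                          # number of final observations held out
t <- 1:n
t2 <- t^2

# Simulated stand-in series: NOT real GDP data
gdp_sim <- 900 + 5 * t + 0.1 * t2 + rnorm(n, sd = 50)

train_idx <- 1:(n - h)
test_idx <- (n - h + 1):n

train <- data.frame(y = gdp_sim[train_idx], t = t[train_idx], t2 = t2[train_idx])
holdout <- data.frame(t = t[test_idx], t2 = t2[test_idx])

# Fit the quadratic trend on the first n - h points only
fit <- lm(y ~ t + t2, data = train)

# Forecast the held-out points and compare with what was actually observed
pred <- predict(fit, newdata = holdout)
actual <- gdp_sim[test_idx]
rmse <- sqrt(mean((actual - pred)^2))
print(rmse)
```

Fitting on the first 238 points and scoring on the last 20 gives an honest measure of forecast error, because the model never sees the values it is asked to predict.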

What does beta look like in the out-of-sample period for the portfolios generated to have beta equal to 1? In the comments Ian Priest wonders if the results in “The effect of beta equal 1” are due to a shift in beta from the estimation period to the out-of-sample period. (The current post will make …
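One way to check for such a shift is to compute beta separately in each period. The sketch below uses simulated daily returns, not the post's portfolios: the period length, the volatilities, and the out-of-sample beta of 1.2 are assumptions chosen so that a shift is visible.

```r
set.seed(7)
n <- 250                                   # assumed days per period

# Simulated market returns for the two periods
mkt_est <- rnorm(n, 0, 0.01)               # estimation period
mkt_oos <- rnorm(n, 0, 0.01)               # out-of-sample period

# Hypothetical portfolio: beta 1 when estimated, drifting to 1.2 afterwards
port_est <- 1.0 * mkt_est + rnorm(n, 0, 0.005)
port_oos <- 1.2 * mkt_oos + rnorm(n, 0, 0.005)

# Beta as covariance with the market divided by market variance
est_beta <- function(p, m) cov(p, m) / var(m)

beta_est <- est_beta(port_est, mkt_est)
beta_oos <- est_beta(port_oos, mkt_oos)
print(c(estimation = beta_est, out_of_sample = beta_oos))

# The same number is the slope of a regression of portfolio on market returns
slope_oos <- unname(coef(lm(port_oos ~ mkt_oos))[2])
```

Comparing `beta_est` with `beta_oos` across many portfolios would show whether betas that were constrained to 1 in the estimation period stay there afterwards.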

I am summarizing all of the days together since each talk was short, and I was too exhausted to write a post after each day. Due to the broken-up schedule of the KDD sessions, I group everything together instead of switching back and forth among a dozen different topics. By far the most enjoyable...