I am starting to take part in different competitions on Kaggle and crowdanalytics. The goal of most competitions is to predict a certain outcome given some covariates. It is a lot of fun trying out different methods like random forests, boosted ...

What does a generalized linear model do? R supplies a modeling function called glm() that fits generalized linear models (abbreviated as GLMs). A natural question is what it does and what problem it solves for you. We work some examples and place generalized linear models in context with other techniques. For predicting a categorical ...
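As a rough illustration of what glm() does for a categorical outcome, here is a minimal sketch that fits a logistic regression. It is not taken from the post itself: the choice of the built-in mtcars data and of the variables am and wt is an assumption made only for the example.

```r
# Fit a logistic regression with glm(): transmission type (am: 0 = automatic,
# 1 = manual) as a function of car weight (wt), using the built-in mtcars data.
# The dataset and variables are illustrative assumptions, not from the post.
fit <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))

# Coefficients are reported on the log-odds scale.
coefs <- coef(fit)

# Predicted probabilities of a manual transmission for each car.
p_hat <- predict(fit, type = "response")

# The same probabilities computed by hand: the inverse logit of the
# linear predictor (intercept + slope * wt).
eta <- coefs[["(Intercept)"]] + coefs[["wt"]] * mtcars$wt
p_manual <- 1 / (1 + exp(-eta))

print(coefs)
print(all.equal(unname(p_hat), p_manual))
```

The hand computation shows the problem glm() solves here: it estimates a linear model on the log-odds scale and maps it back to probabilities that stay between 0 and 1.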

Dealing with endogeneity in a binary dependent variable model requires more consideration than the simpler continuous dependent variable case. For some, the best approach to this problem is to use the same methodology as in the continuous case, i.e. two-stage least squares (2SLS). Thus, the equation of interest becomes a linear probability model (LPM). The ...
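To make the idea concrete, the sketch below runs two-stage least squares by hand on a linear probability model. Everything in it is an assumption for illustration: the data are simulated, and the names z (instrument), u (unobserved confounder), x (endogenous regressor) and y (binary outcome) are hypothetical. The true slope of x is set to 0.1 in the simulation.

```r
# Simulated data (an assumption for illustration, not from the post).
set.seed(42)
n <- 20000
z <- rnorm(n)                 # instrument: shifts x, has no direct effect on y
u <- rnorm(n)                 # unobserved confounder
x <- 0.8 * z + u + rnorm(n)   # endogenous regressor, correlated with u

# Binary outcome from a linear probability model with true slope 0.1 on x.
p <- 0.5 + 0.1 * x - 0.15 * u
p <- pmin(pmax(p, 0), 1)      # keep probabilities inside [0, 1]
y <- rbinom(n, size = 1, prob = p)

# Naive LPM: ordinary least squares of y on x, which ignores the confounder.
b_ols <- coef(lm(y ~ x))[["x"]]

# Stage 1: regress the endogenous regressor on the instrument.
first_stage <- lm(x ~ z)
x_hat <- fitted(first_stage)

# Stage 2: regress the binary outcome on the stage-1 fitted values.
b_2sls <- coef(lm(y ~ x_hat))[["x_hat"]]

print(c(naive_lpm = b_ols, two_stage = b_2sls))
```

Running the second stage by hand gives the 2SLS point estimate, but the standard errors reported by that second lm() call are not valid, so a dedicated instrumental-variables routine is the better choice for inference.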

Unfortunately, I was not there, but we can vicariously enjoy it via the presentations that are posted on the conference website. Below is my take on the highlights (in chronological order). Peter Carl and Brian Peterson's “Constructing Strategic Hedge Fund Portfolios” is wonderful from my perspective. Promoting random portfolios is sure to win my heart. …

In case you missed them, here are some articles from June of particular interest to R users. The Environmental Performance Index website uses R to rank countries by measures like environmental health and ecosystem vitality. A log-linear regression in R predicted the gold-winning Olympic 100m sprint time to be 9.68 seconds (it was actually 9.63 seconds). Some R-related talks...

Ricky Ho has created a 6-page PDF reference card on Big Data Machine Learning, with examples implemented in the R language. (A free registration to DZone Refcardz is required to download the PDF.) The examples cover: predictive modeling overview (how to set up test and training sets in R); linear regression (using lm); logistic regression (using glm)...
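For readers who have not downloaded the card, the first two topics might look something like the sketch below. It is not taken from the reference card: the mtcars data, the 70/30 split, and the predictors wt and hp are assumptions chosen only for illustration.

```r
# Split the built-in mtcars data into training and test sets.
# The dataset and the 70/30 ratio are illustrative assumptions.
set.seed(1)
n <- nrow(mtcars)
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train <- mtcars[train_idx, ]
test <- mtcars[-train_idx, ]

# Fit a linear regression on the training rows only.
fit <- lm(mpg ~ wt + hp, data = train)

# Predict fuel economy for the held-out rows and measure the error there.
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$mpg - pred)^2))

print(rmse)
```

Measuring the error on rows the model never saw gives a fairer estimate of predictive performance than the fit on the training data.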

What's the one thing we need to do? Marketing researchers are asked this question frequently whenever they analyze customer satisfaction data. A company wishing to increase sales or limit churn wants to focus only on the most important determinants of those outcomes. Given the limitations imposed by the available customer survey data, this strategic question is transformed quickly into a methodological one concerning how...