490 search results for "evaluation"

Hypothesis-Driven Development Part II

September 8, 2015

This post will evaluate signals based on the rank regression hypotheses covered in the last post. The last time around, …

Logistic Regression in R – Part Two

September 2, 2015

My previous post covered the basics of logistic regression. We must now examine the model to understand how well it fits the data and generalizes to other observations. The evaluation process involves the assessment of three distinct areas – goodness of fit, tests of individual predictors, and validation of predicted values – in order to

How do you know if your model is going to work? Part 1: The problem

September 2, 2015

Authors: John Mount and Nina Zumel. “Essentially, all models are wrong, but some are useful.” (George Box) Here’s a caricature of a data science project: your company or client needs information (usually to make a decision). Your job is to build a model to predict that information. You fit a model, …

Predicting Titanic deaths on Kaggle IV: random forest revisited

August 23, 2015

On July 19th I used randomForest to predict the deaths on the Titanic in the Kaggle competition. Subsequently I found that both bagging and boosting gave better predictions than randomForest. This I found somewhat unsatisfactory, hence I am now revisi...

Is Bayesian A/B Testing Immune to Peeking? Not Exactly

August 20, 2015

Since I joined Stack Exchange as a Data Scientist in June, one of my first projects has been reconsidering the A/B testing system used to evaluate new features and changes to the site. Our current approach relies on computing a p-value to measure our confidence in a new feature. Unfortunately, this leads to a common pitfall in performing A/B...

Evaluating Logistic Regression Models

August 17, 2015

Logistic regression is a technique that is well suited for examining the relationship between a categorical response variable and one or more categorical or continuous predictor variables. The model is generally presented in the following format, where β refers to the parameters and x represents the independent variables: $\log(\text{odds}) = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n$. The log(odds), or log-odds ratio, is defined

How Do You Know if Your Data Has Signal?

August 10, 2015

Image by Liz Sullivan, Creative Commons. Source: Wikimedia. An all too common approach to modeling in data science is to throw all possible variables at a modeling procedure and “let the algorithm sort it out.” This is tempting when you are not sure what are the true causes or predictors of the phenomenon you are …

Predicting Titanic deaths on Kaggle III: Bagging

August 9, 2015

This is the third post on predicting the deaths. The first one used randomForest, the second boosting (gbm). The aim of the third post was to use bagging. In contrast to the former posts, I abandoned dplyr in this post. It gave some now you see now you ...

Sensemaking in R: A Plenitude of Models Makes for Good Storytelling

August 3, 2015

"Sensemaking is a motivated, continuous effort to understand connections (which can be among people, places, and events) in order to anticipate their trajectories and act effectively." - Gary Klein, Brian Moon & Robert Hoffman, Making Sense of Sensema...

Hockey Elbow and Other Response Time Injuries

July 29, 2015

You've heard of tennis elbow. Well, there's a non-sports, performance injury that I like to call hockey elbow. An example of such an "injury" is shown in Figure 1, which appeared in a recent computer performance analysis presentation. It's a reminder of how easy it is to become complacent when doing performance analysis and possibly end up reaching...
