1543 search results for "Regression"

Minimizing Bias in Observational Studies

November 26, 2012

Measuring the effect of a binary treatment on a measured outcome is one of the most common tasks in applied statistics. Examples of these applications abound, like the effect of smoking on health, or the effect of low birth weight on cognitive development. In an ideal world we would like to be able to assign

Read more »

The perks (and quirks) of being a referee

November 25, 2012

The other day I was talking to a friend at work, who was rather annoyed that one of his papers had been rejected by a journal, given the negative comments of the reviewers. This is, of course, part of the game, so you don't really get annoyed just because a paper gets rejected. From what I hear, though, I...

Read more »

Data types, part 3: Factors!

November 21, 2012

In this third part of the data types series, I'll go over an important class that I skipped over so far: factors. Factors are categorical variables that are super useful in summary statistics, plots, and regressions. They basically act like dummy variables t...
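To make the dummy-variable idea in the excerpt concrete, here is a minimal sketch in plain Python (not R, and not code from the linked post). The helper name `dummy_code` and the toy data are assumptions made for illustration; the sketch drops the first sorted level as the reference category, which is one common convention and may differ from what the post describes.

```python
# Hypothetical helper: expand a categorical variable into 0/1 dummy columns.
# The first level (in sorted order) is treated as the reference category
# and gets no column of its own.
def dummy_code(values):
    levels = sorted(set(values))                 # distinct categories
    reference, others = levels[0], levels[1:]    # reference level is dropped
    rows = [[1 if v == level else 0 for level in others] for v in values]
    return reference, others, rows

# Toy data, invented for this example.
colours = ["red", "green", "blue", "green"]
reference, columns, design = dummy_code(colours)

print("reference level:", reference)
print("dummy columns:", columns)
for value, row in zip(colours, design):
    print(value, row)
```

An observation in the reference category is encoded as all zeros, so in a regression each dummy coefficient would be read as a difference from that reference level.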

Read more »

The Hour of Hell of Every Morning – Commute Analysis, April to October 2012

November 19, 2012

Introduction: So a little while ago I quit my job. Well, actually, that sounds really negative. I'm told that when you are discussing large changes in your life, like finding a new career, relationship, or brand of diet soda, it's important to frame things positively. So let me rephrase that - I've left the job I previously held to pursue other directions. Why?...

Read more »

A Shiny new way of communicating Bayesian statistics

November 19, 2012

Bayesian data analysis follows a very simple and general recipe: Specify a model and likelihood, i.e. what process do you think is generating your data? Specify a prior distribution, i.e. quantify what you know about a problem before having seen … Continue reading →

Read more »

The Heteroskedastic Probit Model

November 19, 2012

Specification testing is an important part of econometric practice. However, from what I can see, few researchers perform heteroskedasticity tests after estimating probit/logit models. This is not a trivial point. Heteroskedasticity in these models can represent a major violation of the probit/logit specification, both of which assume homoskedastic errors. Thankfully, tests for heteroskedasticity in these

Read more »

GEE QIC update

November 15, 2012

Here is improved code for calculating QIC from geeglm in geepack in R (original post). Let me know how it works. I haven’t tested it much, but it seems that QIC may select overparameterized models. In the code below, I … Continue reading →

Read more »

Webinar Tomorrow: Big Data Trees and Hadoop Connection in Revolution R Enterprise 6.1

November 14, 2012

Tomorrow at 9AM Pacific, Revolution Analytics VP of Product Development Sue Ranney will introduce two key Big Data features of the new Revolution R Enterprise 6.1. Now you can train classification and regression trees on data sets of unlimited size, quickly and using the resources of multiple processors and clusters. (This white paper describes our implementation of tree models...

Read more »

How the Democrats may have won the House, but lost the seats

November 14, 2012

The 2012 election is over and in the books. A few very close races remain to be officially decided, but for the most part everything has settled down over the last week. By all accounts it was a very good night for the Democrats, with wins in the presidency, senate and state houses. They also performed

Read more »

Trees with the rpart package

November 13, 2012

What are trees? Trees (also called decision trees or recursive partitioning) are a simple yet powerful tool in predictive statistics. The idea is to split the covariable space into many partitions and to fit a constant model of the response variable in each partition. In the case of regression, the mean...
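The partition-and-fit-a-constant idea can be sketched with a single split on one covariate. This is a simplified illustration in plain Python, not the rpart implementation: the function names (`mean`, `sse`, `best_split`) and the toy data are assumptions, and a real tree would apply the same search recursively within each partition and add stopping and pruning rules.

```python
def mean(ys):
    return sum(ys) / len(ys)

def sse(ys):
    # Sum of squared errors around the partition mean (the constant model).
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(x, y):
    """Find the threshold on x that minimises the total squared error
    when each side of the split is predicted by its own mean."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cost, threshold, mean(left), mean(right))
    if best is None:
        raise ValueError("x has no two distinct values to split on")
    return best[1:]

# Toy data, invented for this example: two clearly separated groups.
x = [1, 2, 3, 10, 11, 12]
y = [1.0, 2.0, 3.0, 9.0, 10.0, 11.0]
threshold, left_mean, right_mean = best_split(x, y)
print(threshold, left_mean, right_mean)
```

Predicting a new point then means checking which side of `threshold` it falls on and returning that partition's mean.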

Read more »