1557 search results for "Regression"

Trading with Support Vector Machines (SVM)

November 30, 2012

Finally all the stars have aligned and I can confidently devote some time to back-testing new trading systems, and Support Vector Machines (SVM) are the new “toy” which is going to keep me busy for a while. SVMs are a well-known tool from the area of supervised Machine Learning, and they are used both


Another Way to Access R from Python – PypeR

November 29, 2012

Unlike RPy2, PypeR provides another simple way to access R from Python through pipes (http://www.jstatsoft.org/v35/c02/paper). This handy feature enables data analysts to do the data munging with Python and the statistical analysis with R by passing objects interactively between the two computing systems. Below is a simple demonstration on how to call R within Python


bigglm on your big data set in open source R, it just works – similar as in SAS


In a recent post by Revolution Analytics in which Revolution was benchmarking their closed source generalized linear model approach against SAS, Hadoop and open source R, they seemed to be pointing out that no 'easy' open source R solution exists for building a Poisson regression model on large datasets. This post is about...


OpenScoring: Open Source Scoring of PMML Models via REST

November 27, 2012

The other day I stumbled across an amazing PMML model API called jpmml. It's written in Java and supports PMML 4.1 (and older). Neural network, random forest, regression and tree PMML models can be consumed and used for scoring. I decide...


Minimizing Bias in Observational Studies

November 26, 2012

Measuring the effect of a binary treatment on a measured outcome is one of the most common tasks in applied statistics. Examples of these applications abound, like the effect of smoking on health, or the effect of low birth weight on cognitive development. In an ideal world we would like to be able to assign


The perks (and quirks) of being a referee

November 25, 2012

The other day I was talking to a friend at work, who was rather annoyed that one of his papers had been rejected by a journal, given the negative comments of the reviewers. This is, of course, part of the game, so you don't really get annoyed just because a paper gets rejected. From what I hear, though, I...


Data types, part 3: Factors!

November 21, 2012

In this third part of the data types series, I'll go over an important class that I skipped over so far: factors. Factors are categorical variables that are super useful in summary statistics, plots, and regressions. They basically act like dummy variables t...


The Hour of Hell of Every Morning – Commute Analysis, April to October 2012

November 19, 2012

Introduction: So a little while ago I quit my job. Well, actually, that sounds really negative. I'm told that when you are discussing large changes in your life, like finding a new career, relationship, or brand of diet soda, it's important to frame things positively. So let me rephrase that - I've left the job I previously held to pursue other directions. Why?...


A Shiny new way of communicating Bayesian statistics

November 19, 2012

Bayesian data analysis follows a very simple and general recipe: Specify a model and likelihood, i.e. what process do you think is generating your data? Specify a prior distribution, i.e. quantify what you know about a problem before having seen …


The Heteroskedastic Probit Model

November 19, 2012

Specification testing is an important part of econometric practice. However, from what I can see, few researchers perform heteroskedasticity tests after estimating probit/logit models. This is not a trivial point. Heteroskedasticity in these models can represent a major violation of the probit/logit specification, both of which assume homoskedastic errors. Thankfully, tests for heteroskedasticity in these
