The GitHub page for the APM exercises has been updated with three new files for Chapters 6-8 (the section on regression). The classification section is in progress. Here's one of our fancy-pants graphs:

Recursive partitioning is a fundamental tool in data mining. It helps us explore the structure of a data set while developing easy-to-visualize decision rules for predicting a categorical (classification tree) or continuous (regression tree) outcome. Classification and regression trees can be generated through the rpart package. The post "Classification and Regression Trees using R" appeared first on...
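As a rough illustration of the idea behind a regression tree, the sketch below finds the single best split of one predictor by minimizing the summed squared error of the two resulting groups. It is written in plain Python rather than R, the data are made up, and the helper names `sse` and `best_split` are hypothetical; it is not how rpart is implemented, which also handles multiple predictors, recursion, and pruning.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, cost) for the split of xs that minimizes total SSE of ys."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        # A threshold can only sit between two distinct predictor values.
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


# Made-up data with an obvious jump in the outcome between x = 3 and x = 10.
xs = [1, 2, 3, 10, 11, 12]
ys = [5, 6, 5, 20, 21, 19]
threshold, cost = best_split(xs, ys)
print(threshold, cost)
```

A full tree would apply the same search again within each of the two groups, which is where the "recursive" in recursive partitioning comes from.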

In exercise 61.1 the problem is that the model has bad mixing. The SAS manual demonstrates the poor mixing and then uses a modified distribution to fix the model. In this post the same problem is tackled in R using MCMCpack, RJags, RStan and LaplacesDemon. MCMCpack has considerable mixing problems; RStan seems to do best. Data: To quote the SAS...

Introduction: My statistics education focused a lot on normal linear least-squares regression, and I was even told by a professor in an introductory statistics class that 95% of statistical consulting can be done with knowledge learned up to and including a course in linear regression. Unfortunately, that advice has turned out to vastly underestimate the

Linear models are a very simple statistical technique and are often (if not always) a useful starting point for more complex analysis. It is, however, not so straightforward to understand what a regression coefficient means, even in the simplest case when there are no interactions in the model. If we are not only fishing for

In this post I will try to copy the calculations of SAS's PROC MCMC example 61.5 (Poisson Regression) into the various R solutions. Solutions in Jags, RStan, MCMCpack, and LaplacesDemon are shown. Compared to the first post in this series, rcppbugs and mcmc are not used. Rcppbugs has no Poisson distribution and while I know how to...

I just finished covering a few numerical techniques for solving systems of equations, which can be applied to find best-fit lines through a given set of data points. The four points are arranged into an inconsistent system of four equations in two unknowns: The system can be represented in matrix form: The least-squares solution vector can be … Continue reading...
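To make the least-squares idea concrete, here is a minimal sketch in plain Python that fits a line through four points by solving the normal equations for the two unknowns (intercept and slope). The four points are hypothetical, since the excerpt does not show the ones used in the post, and `least_squares_line` is an illustrative name rather than anything from the original.

```python
# Hypothetical data: the four points from the original post are not shown in the excerpt.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 2.0), (4.0, 4.0)]


def least_squares_line(points):
    """Solve the normal equations (A^T A) c = A^T y for the line y = c0 + c1 * x."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sxx = sum(x * x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    # Here A^T A = [[n, sx], [sx, sxx]] and A^T y = [sy, sxy].
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are identical; the best-fit line is not unique")
    c0 = (sxx * sy - sx * sxy) / det  # intercept
    c1 = (n * sxy - sx * sy) / det    # slope
    return c0, c1


intercept, slope = least_squares_line(points)
# The residuals of a least-squares fit with an intercept sum to zero.
residuals = [y - (intercept + slope * x) for x, y in points]
print(intercept, slope)
```

No line passes through all four of these points, so the system is inconsistent; the normal equations pick the line whose residuals have the smallest sum of squares.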

I'd like to introduce a package that simulates regression models. This includes both single-level and multilevel (i.e. hierarchical or linear mixed) models with up to two levels of nesting. The package produces a unified framework to simulate all types of c...

Users new to the Rcpp family of functionality are often impressed with the performance gains that can be realized, but struggle to see how to approach their own computational problems. Many of the most impressive performance gains are demonstrated with seemingly advanced statistical methods, advanced C++–related constructs, or both. Even when users are able to understand how various demonstrated features operate in isolation, examples...