1536 search results for "Regression"

Error metrics for multi-class problems in R: beyond Accuracy and Kappa

July 6, 2012

The caret package for R provides a variety of error metrics for regression models and 2-class classification models, but only calculates Accuracy and Kappa for multi-class models.  Therefore, I wrote the following function to allow caret:::train t...

A better ‘nls’ (?)

July 5, 2012

Those who do a lot of nonlinear regression will love R's nls function. In most cases it works really well, but mishaps can occur when bad starting values are used for the parameters. One of the most dreaded is the “singular gradient matrix at initial parameter estimates” error, which...

Glmnet_1.8 uploaded to CRAN

July 4, 2012

(by Trevor Hastie) Glmnet_1.8 uploaded to CRAN – This is a major revision, with two additional models included. 1) Multiresponse regression – family="mgaussian". Here we have a matrix of M responses, and we fit a series of linear models in parallel. We use a group-lasso penalty on the set of M coefficients for each variable. This means they are...
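
For context, the group-lasso penalty mentioned in this excerpt can be written out as below. This is a sketch based on the standard multi-response elastic-net formulation, not a quotation from the post. The symbols N (number of observations), p (number of variables), B (the p × M coefficient matrix, with rows β_j), λ (penalty strength), and α (mixing weight) are assumptions introduced here for illustration.

```latex
% y_i is the M-vector of responses and x_i the p-vector of predictors for observation i.
% beta_j is the j-th row of B: the M coefficients belonging to variable j.
\min_{\beta_0,\, B}\;
\frac{1}{2N} \sum_{i=1}^{N} \left\lVert y_i - \beta_0 - B^{\top} x_i \right\rVert_2^2
+ \lambda \left[ \frac{1-\alpha}{2} \lVert B \rVert_F^2
+ \alpha \sum_{j=1}^{p} \lVert \beta_j \rVert_2 \right]
```

With α = 1 only the group-lasso term remains. Because each row norm is not squared, the penalty tends to set all M coefficients of a variable to zero together rather than one response at a time.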

Example of Factor Attribution

July 3, 2012

In the prior post, Factor Attribution 2, I showed how Factor Attribution can be applied to decompose a fund’s returns into Market, Capitalization, and Value factors, the “three-factor model” of Fama and French. Today, I want to show you a different application of Factor Attribution. First, let’s run Factor Attribution on each of the stocks...

Citing R or SAS

July 2, 2012

One of us recently read a colleague's first draft of a paper, in which she had written: "All analyses were done in R 2.14.0." We assume we're preaching to the converted here, when we say that the enormous amount of work that goes into R needs to be re...

My first competition at Kaggle

July 2, 2012

For me, Kaggle is becoming a social network for data scientists, as stackoverflow.com or github.com is for programmers. If you are a data scientist, machine learner, or statistician, you had better have a profile there; otherwise you do not exist. Nevertheless, I won’t bet on a rosy future for data scientists, as journalists suggest (sexy job for next...

Moving beyond hopeless graphics

July 2, 2012

I was at a talk a while ago where the speaker presented tables with 4, 5, 6, even 8 significant digits even though, as is usual, only the first or second digit of each number conveyed any useful information. A graph would be better, but even if you’re too lazy to make a plot, a bit...

Modeling Trick: Masked Variables

July 1, 2012

A primary problem data scientists face again and again is how to properly adapt or treat variables so they are the best possible components of a regression. Some analysts at this point delegate control to a shape-choosing system like neural nets. I feel such a choice gives up far too much statistical rigor, transparency and...

Coefficient Plots in R

June 30, 2012

One popular trend in presenting results is the "coefficient plot," an alternative to the table of regression coefficients. I am seeing this a little more often in political science research and have received a few requests for code, so I …

Rcpp 0.9.13

June 29, 2012

The bug fix in version 0.9.12 of Rcpp turned out to be incomplete, so a new version 0.9.13 is now on CRAN and will get to Debian shortly. The Rcpp::Environment constructor is now properly fixed (using the global environment as a default value). As ...
