1739 search results for "Regression"

Preferential attachment applied to frequency of accessing a variable

May 17, 2013
By
Preferential attachment applied to frequency of accessing a variable

If, when writing code for a function, up to the current point in the code distinct local variables have been accessed for reading times (), will the next read access be from a previously unread local variable and if not what is the likelihood of choosing each of the distinct variables (global variables are ignored

Read more »

Exponential Cache Behavior

May 15, 2013
By
Exponential Cache Behavior

Guerrilla alumnus Gary Little observed certain fixed-point behavior in simulations where disk IO blocks are updated randomly in a fixed size cache. For his python simulation with 10 million entries (corresponding to an allocation of about 400 MB of memory) the following results were obtained: Hit ratio (i.e., occupied) = 0.3676748 Miss ratio...

Read more »

In case you missed it: April 2013 Roundup

May 13, 2013
By

In case you missed them, here are some articles from April of particular interest to R users: A critique of a SAS whitepaper comparing the performance of SAS, R and Mahout. A video presentation from statistician Tess Nesbitt at UpStream, who uses GAM survival models in R for marketing attribution analysis. The April edition of the Revolution Analytics newsletter....

Read more »

Using C libraries in R with rdyncall

May 12, 2013
By
Using C libraries in R with rdyncall

One reason I like using R for data analysis is that R has a great collection of packages that let you easily apply state-of-the-art methods to your problems. But once in a while you find a library that you would like to use that does not have a R wrapper, yet. While the great Rcpp

Read more »

Reproducibility and randomness

May 11, 2013
By
Reproducibility and randomness

With Stéphane Tufféry, we were working this week on a chapter of a book, entitled Statistical Learning in Actuarial Science. The chapter should be based on R functions, and we wanted to reproduce some outputs he previously obtained with SAS. The good thing is that even complex functions (logistic regression, regression trees, etc) produce the same kind of outputs....

Read more »

Trevor Hastie presents glmnet: lasso and elastic-net regularization in R

May 9, 2013
By
Trevor Hastie presents glmnet: lasso and elastic-net regularization in R

by Joseph Rickert Even a casual glance at the R Community Calendar shows an impressive amount of R user group activity throughout the world: 45 events in April and 31 scheduled so far for May. New groups formed last month in Knoxville, Tennessee (The Knoxville R User Group: KRUG) and Sheffield in the UK (The Sheffield R Users). An...

Read more »

Feature Selection 2 – Genetic Boogaloo

May 8, 2013
By
Feature Selection 2 – Genetic Boogaloo

Previously, I talked about genetic algorithms (GA) for feature selection and illustrated the algorithm using a modified version of the GA R package and simulated data. The data were simulated with 200 non-informative predictors and 12 linear effects and three non-linear effects. Quadratic discriminant analysis (QDA) was used to model the data. The last set of...

Read more »

How to Calculate a Partial Correlation Coefficient in R: An Example with Oxidizing Ammonia to Make Nitric Acid

How to Calculate a Partial Correlation Coefficient in R: An Example with Oxidizing Ammonia to Make Nitric Acid

Introduction Today, I will talk about the math behind calculating partial correlation and illustrate the computation in R with an example involving the oxidation of ammonia to make nitric acid using a built-in data set in R called stackloss.  In a separate post, I will also share an R function that I wrote to estimate partial correlation.

Read more »

A pathological glm() problem that doesn’t issue a warning

May 1, 2013
By
A pathological glm() problem that doesn’t issue a warning

I know I have already written a lot about technicalities in logistic regression (see for example: How robust is logistic regression? and Newton-Raphson can compute an average). But I just ran into a simple case where R‘s glm() implementation of logistic regression seems to fail without issuing a warning message. Yes the data is a Related posts:

Read more »

SAS Big Data Analytics Benchmark (Part One)

April 30, 2013
By

by Thomas Dinsmore On April 26, SAS published on its website an undated Technical Paper entitled Big Data Analytics: Benchmarking SAS, R and Mahout. In the paper, the authors (Allison J. Ames, Ralph Abbey and Wayne Thompson) describe a recent project to compare model quality, product completeness and ease of use for two SAS products together with open source...

Read more »