1682 search results for "regression"

Reserving with negative increments in triangles

April 11, 2013

A few months ago, I published a post on negative values in triangles and how to deal with them when using a Poisson regression (the post was published in French). The idea was to use a translation technique: fit a model not on the increments themselves but on the increments shifted by some constant, use that model to make predictions, and then...

In case you missed it: March 2013 Roundup

April 10, 2013

In case you missed them, here are some articles from March of particular interest to R users. Facebook used R to analyze profile photo changes to create a map of same-sex marriage support in the USA. Joe Rickert contrasts random sampling with fitting models directly to large data sets. A presentation by Carlos Somohano summarizes the history, skills and...

Gradient Boosting: Analysis of LendingClub’s Data

April 8, 2013

An old 5.75% CD of mine recently matured, and seeing that those interest rates are gone forever, I figured I’d take a statistical look at LendingClub’s data. Lending Club is the first peer-to-peer lending company to register its offerings as securities with the Securities and Exchange Commission (SEC). Their operational statistics are public and available for download. The latest

Mastering Matrices

April 7, 2013

R has many ways to store information. Most of the time, our data comes in the form of a dataset, which we bring into R as a data.frame object. However, there are times when we want to use matrices as well. This post will show you how matrices can...

Worry about correctness and repeatability, not p-values

April 5, 2013

In data science work you often run into cryptic sentences like the following: Age adjusted death rates per 10,000 person years across incremental thirds of muscular strength were 38.9, 25.9, and 26.6 for all causes; 12.1, 7.6, and 6.6 for cardiovascular disease; and 6.1, 4.9, and 4.2 for cancer (all P < 0.01 for linear...

An Introduction to SAS for R Programmers

April 4, 2013
By Joseph Rickert

Life decisions are usually much too complicated to be attributed to any single cause, but one important reason that I am here at Revolution today is that I ignored suggestions from well-meaning faculty back in graduate school to work more in SAS rather than doing everything in R. There was a heavy emphasis on SAS then:...

a brief on naked statistics

April 2, 2013

Over the last Sunday breakfast I went through Naked Statistics: Stripping the Dread from the Data. The first two pages managed to put me in a prejudiced mood for the rest of the book. To wit: the author starts with some math bashing (like, no one ever bothers to tell us about the uses of

What’s New in Release 6.2: Additional ScaleR Features

April 2, 2013
By Thomas Dinsmore

Revolution R Enterprise Release 6.2 is on track for General Availability on April 22. In previous posts, I've commented on support for open source R 2.15.3 and Stepwise Regression. Today I'll wrap up this series with a summary of some of the other new features supported in this release. Parallel Random Number Generation: For analysts seeking to...

Introducing the healthvis R package – one line D3 graphics with R

April 2, 2013

We have been a little slow on the posting for the last couple of months here at Simply Stats. That’s bad news for the blog, but good news for our research programs! Today I’m announcing the new healthvis R package...

p-values are (possibly biased) estimates of the probability that the null hypothesis is true

March 31, 2013

Last week, I posted about statisticians’ constant battle against the belief that the p-value associated (for example) with a regression coefficient is equal to the probability that the null hypothesis is true, for a null hypothesis that beta is zero or negative. I argued that (despite our long pedagogical practice) there are, in fact, many
