Monthly Archives: July 2012

Health Care Costs – Part 1, "The Problem"

July 5, 2012

The Problem: In the United States, health care costs have been rising for a number of years, even after adjusting for inflation. Like a runaway freight train, this rampant inflation cannot continue indefinitely without crashing. ...

Read more »

New R User Group in Leipzig, Germany

July 5, 2012

Leipzig R Statistical Computing is the sixth local R user group in Germany and has been holding meetings since February. At the next meeting on July 12, member Claudia Beleites will talk about her packages softclassval (for classifier performance measures) and hyperSpec (for hyperspectral data). meetup.com: Leipzig R Statistical Computing

Read more »

Validating email addresses in R

July 5, 2012

I am currently programming an automated report generator in R: participants fill out a questionnaire and receive a nicely formatted PDF with their personality profile. I use knitr, LaTeX, and the sendmailR package. Some participants did not provide valid email addresses, which caused the sendmail function to crash. Therefore I wanted some validation of

Read more »
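The teaser above breaks off before showing how the addresses were validated, so the following is only a minimal sketch of the general idea: check that each address has a plausible shape before handing it to the mailer. It is written in Python rather than R, and the pattern and the function name `looks_like_email` are assumptions made for illustration, not the post's own rule. The pattern is deliberately loose and does not attempt full RFC 5322 compliance.

```python
import re

# Assumed, deliberately simple pattern (not taken from the post):
# one or more non-space, non-@ characters, an @, a domain containing
# at least one dot, and a final label that is not empty.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s.]+")


def looks_like_email(address: str) -> bool:
    """Return True if the whole string has a plausible address shape."""
    return EMAIL_PATTERN.fullmatch(address.strip()) is not None


# Hypothetical questionnaire input: split it so that only plausible
# addresses are ever passed on to the mail-sending step.
submitted = ["anna@example.org", "no-at-sign.example.org", "bob@localhost", ""]
valid = [a for a in submitted if looks_like_email(a)]
invalid = [a for a in submitted if not looks_like_email(a)]
print("send to:", valid)
print("skip:", invalid)
```

A check like this only guards against malformed input; it cannot tell whether a mailbox actually exists, so the sending step may still want its own error handling.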

A tiny RCurl headache ;)

July 4, 2012

As more and more data go online (plus we love Google Drive), we are forced to connect to our data over the net. We mostly do this via RCurl (but we could do this using RGoogleDocs as well). In that case all that is required to get the data into R is the two lines of

Read more »

A new open journal on Data Science

July 4, 2012

Springer has introduced a new open, peer-reviewed journal focused on Data Science: EPJ Data Science. What makes this a Data Science journal is novel uses of statistics, data analysis, computer techniques and public data sources to research a topic in another domain, rather than methodological research. Here are a few examples of the papers you'll find in the journal:...

Read more »

Alternative to Monte Carlo Testing

July 4, 2012

When we backtest a strategy on a portfolio, it is a simple analysis of a single period in time. There are ways to “stress test” a strategy, such as Monte Carlo simulation, random portfolios, or shuffling the returns in a random order. I could never really wrap my head around Monte Carlo and shuffling the returns …
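For readers unfamiliar with the return-shuffling stress test mentioned above, here is a minimal sketch of the idea in Python. It is not the alternative the post goes on to propose, and the return series, the trial count, and the helper names `max_drawdown` and `shuffled_drawdowns` are all hypothetical. Shuffling leaves the total compounded return unchanged, so the sketch looks at a path-dependent statistic, maximum drawdown, instead.

```python
import random


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def shuffled_drawdowns(returns, n_trials=1000, seed=0):
    """Max drawdown of n_trials random reorderings of the same returns."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    results = []
    for _ in range(n_trials):
        sample = list(returns)
        rng.shuffle(sample)
        results.append(max_drawdown(sample))
    return results


# Hypothetical per-period strategy returns, for illustration only.
returns = [0.02, -0.01, 0.03, -0.04, 0.01, 0.02, -0.02, 0.05]
drawdowns = sorted(shuffled_drawdowns(returns))
print("observed max drawdown:", round(max_drawdown(returns), 4))
print("median shuffled max drawdown:", round(drawdowns[len(drawdowns) // 2], 4))
```

Comparing the observed drawdown with the shuffled distribution gives a rough sense of how much the backtest result depends on the particular order in which the returns happened to arrive.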

Read more »

Three Questions about a Matrix of Coefficient Plots

July 4, 2012

It's Independence Day in the U.S., so I am taking the day off, but I received the following request for advice and thought I'd pass it along to my readers. I wonder if you could help – I am trying to create 9 different coefficient plots, which repr...

Read more »

A tutorial on outlier detection techniques

July 4, 2012

by Yanchang Zhao, RDataMining.com. There is an excellent tutorial on outlier detection techniques, presented by Hans-Peter Kriegel et al. at ACM SIGKDD 2010. It presents many popular outlier detection algorithms, most of which were published between the mid-1990s and 2010, …

Read more »

The Higgs boson: 5-sigma and the concept of p-values

July 4, 2012

Why are physicists talking about 5-sigma, and what's it got to do with statistics? In this short post I'll explain what 5-sigma is and why it's not a measure of how certain scientists are that they've found the Higgs boson.
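As a rough illustration of the number behind "5-sigma", the sketch below converts a significance in standard deviations into a one-sided tail probability of the standard normal distribution. It assumes the one-sided convention commonly used in particle physics, and the function name `one_sided_p_value` is chosen here for illustration. The result is the probability of seeing a fluctuation at least this large if there were no signal at all, not the probability that the discovery is wrong.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Upper-tail probability of a standard normal beyond n_sigma."""
    # P(Z >= z) = 0.5 * erfc(z / sqrt(2)) for a standard normal Z.
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


p = one_sided_p_value(5.0)
print(f"p-value at 5 sigma: {p:.3e}")        # about 2.87e-07
print(f"roughly 1 in {round(1.0 / p):,}")    # about 1 in 3.5 million
```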

Read more »