Monthly Archives: July 2012

A tiny RCurl headache ;)

July 4, 2012

As more and more data go online (and we love Google Drive), we are forced to connect to our data over the net. We mostly do this via RCurl, though RGoogleDocs would work as well. In that case, all that is required to get the data into R is the two lines of ...

Read more »

A new open journal on Data Science

July 4, 2012

Springer has introduced a new open, peer-reviewed journal focused on Data Science: EPJ Data Science. What makes this a Data Science journal is its emphasis on novel uses of statistics, data analysis, computer techniques and public data sources to research a topic in another domain, rather than on methodological research. Here are a few examples of the papers you'll find in the journal:...

Read more »

Alternative to Monte Carlo Testing

July 4, 2012

When we backtest a strategy on a portfolio, the result is a simple analysis of a single period in time. There are ways to “stress test” a strategy, such as Monte Carlo simulation, random portfolios, or shuffling the returns into a random order. I could never really wrap my head around Monte Carlo and shuffling the returns …

Read more »
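
For readers unfamiliar with the return-shuffling stress test mentioned in the entry above, here is a minimal sketch in R. It illustrates only the shuffling idea, not the alternative the post goes on to propose. The return series is simulated and purely hypothetical, and `max_drawdown` is an illustrative helper, not a function from any package.

```r
# Hypothetical daily strategy returns; in practice these would come from a backtest.
set.seed(42)
returns <- rnorm(250, mean = 0.0005, sd = 0.01)

# Maximum drawdown of the equity curve implied by a return series.
# The curve starts at 1 so that an initial loss also counts as a drawdown.
max_drawdown <- function(r) {
  equity <- c(1, cumprod(1 + r))
  max(1 - equity / cummax(equity))
}

# Reshuffle the same returns many times. The final equity never changes,
# but the path, and therefore the drawdown, differs from draw to draw.
n_sims <- 1000
shuffled_dd <- replicate(n_sims, max_drawdown(sample(returns)))

observed_dd <- max_drawdown(returns)
observed_dd
quantile(shuffled_dd, c(0.05, 0.50, 0.95))
```

Comparing `observed_dd` with the quantiles of `shuffled_dd` gives a rough sense of whether the backtested drawdown was lucky or unlucky relative to other orderings of the same returns.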

Three Questions about a Matrix of Coefficient Plots

July 4, 2012

It's Independence Day in the U.S., so I am taking the day off, but I received the following request for advice and thought I'd pass it along to my readers. I wonder if you could help – I am trying to create 9 different coefficient plots, which repr...

Read more »

A tutorial on outlier detection techniques

July 4, 2012

By Yanchang Zhao, RDataMining.com. There is an excellent tutorial on outlier detection techniques, presented by Hans-Peter Kriegel et al. at ACM SIGKDD 2010. It presents many popular outlier detection algorithms, most of which were published between the mid-1990s and 2010, …

Read more »

The Higgs boson: 5-sigma and the concept of p-values

July 4, 2012

Why are physicists talking about 5-sigma, and what's it got to do with statistics? In this short post I'll explain what 5-sigma is and why it's not a measure of how certain scientists are that they've found the Higgs boson.

Read more »
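
As a quick numerical companion to the Higgs entry above, the sketch below converts 5 sigma into a p-value in R. It assumes the one-sided tail of a standard normal distribution, which is the convention commonly used in particle physics. The resulting number is the probability of a fluctuation at least this large when there is no signal, not the probability that the discovery is real.

```r
# One-sided tail probability of a standard normal beyond 5 standard deviations.
p_five_sigma <- pnorm(5, lower.tail = FALSE)
p_five_sigma        # about 2.87e-07
1 / p_five_sigma    # roughly 1 in 3.5 million

# The reverse direction: the sigma level that corresponds to a given p-value.
sigma_level <- qnorm(p_five_sigma, lower.tail = FALSE)
sigma_level         # recovers 5
```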

Glmnet_1.8 uploaded to CRAN

July 4, 2012

(By Trevor Hastie) This is a major revision, with two additional models included. 1) Multiresponse regression (family="mgaussian"): here we have a matrix of M responses, and we fit a series of linear models in parallel. We use a group-lasso penalty on the set of M coefficients for each variable. This means they are...

Read more »
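
The multiresponse model in the glmnet entry above can be written down compactly. The following is a sketch of the pure group-lasso case only, with notation assumed here rather than taken from the post: $N$ observations, $p$ variables, response vectors $y_i \in \mathbb{R}^M$, and a $p \times M$ coefficient matrix $B$ whose $j$-th row $\beta_j$ holds the $M$ coefficients of variable $j$.

```latex
\min_{\beta_0,\,B}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl\lVert y_i - \beta_0 - B^{\top}x_i \bigr\rVert_2^2
\;+\;\lambda\sum_{j=1}^{p}\lVert \beta_j \rVert_2
```

Because each row is penalized through its unsquared Euclidean norm, a group-lasso penalty of this kind tends to shrink the whole row $\beta_j$ to zero at once, so a variable is dropped from or kept in all $M$ responses together.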

To the Basics: Bayesian Inference on A Binomial Proportion

July 4, 2012

Think of something observable – countable – that you care about with only one outcome or another. It could be the votes cast in a two-way election in your town, or the free throw shots the center on your favorite...

Read more »
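
To make the binomial-proportion entry above concrete, here is a minimal R sketch of the standard conjugate Beta-binomial update. The data (7 successes in 10 trials) and the uniform Beta(1, 1) prior are assumptions chosen for illustration; the original post may use a different prior or method.

```r
# Hypothetical data: 7 successes in 10 trials (say, made free throws).
successes <- 7
trials <- 10

# Assumed prior: Beta(1, 1), i.e. uniform on [0, 1].
prior_a <- 1
prior_b <- 1

# Conjugate update: the posterior is Beta(a + successes, b + failures).
post_a <- prior_a + successes
post_b <- prior_b + (trials - successes)

# Posterior mean and central 95% credible interval for the proportion.
post_mean <- post_a / (post_a + post_b)
cred_int <- qbeta(c(0.025, 0.975), post_a, post_b)
post_mean
cred_int
```

With these numbers the posterior is Beta(8, 4), whose mean is 8/12, slightly closer to 0.5 than the raw proportion of 0.7 because of the prior.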

Example of Factor Attribution

July 3, 2012

In the prior post, Factor Attribution 2, I showed how Factor Attribution can be applied to decompose a fund’s returns into Market, Capitalization, and Value factors, the “three-factor model” of Fama and French. Today, I want to show you a different application of Factor Attribution. First, let’s run Factor Attribution on each of the stocks...

Read more »
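
For reference, the Fama and French three-factor model named in the factor attribution entry above is usually written as the regression below. The symbols are the conventional ones and are an assumption here, not notation from the post: $R_{i,t}$ is the return of asset $i$, $R_{f,t}$ the risk-free rate, $R_{M,t}$ the market return, $\mathrm{SMB}_t$ the size (capitalization) factor and $\mathrm{HML}_t$ the value factor.

```latex
R_{i,t} - R_{f,t}
  = \alpha_i
  + \beta_{i}^{M}\,(R_{M,t} - R_{f,t})
  + \beta_{i}^{S}\,\mathrm{SMB}_t
  + \beta_{i}^{V}\,\mathrm{HML}_t
  + \varepsilon_{i,t}
```

Factor attribution then amounts to estimating the three betas for each fund or stock and reading them as its exposures to the market, capitalization and value factors.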

RcppBDT 0.2.0

A new release of the RcppBDT package appeared on CRAN earlier today. RcppBDT uses Rcpp, and in particular the nifty Rcpp modules feature of wrapping C++ code for R just by declaring the (class or function) interfaces. It uses this to bring in some useful functions from Boost Date.Time to R so that one can do things like

R> library(RcppBDT)
R> sapply(2012:2016, function(year) +...

Read more »