This blog is about the art of exploratory data analysis, which is also the subject of my new book, Exploring Data in Engineering, the Sciences, and Medicine (http://www.oup.com/us/ExploringData). This art is appropriate in situations where y...

In the first article, we saw a quick-and-dirty method to predict disk space exhaustion when the usage pattern is rigorously linear. We did that by importing our data into R and fitting a linear regression. In this article we will see the problems with that method, and deploy a more robust solution. Besides robustness, we will also see how we can generate...
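The quick-and-dirty linear approach described above can be sketched as follows. This is a minimal illustration in Python rather than the R code the article uses; the function names, the sample readings, and the 100 GB capacity are all hypothetical, and it assumes usage grows along a straight line.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def day_disk_full(days, used_gb, capacity_gb):
    """Day on which the fitted line reaches capacity, or None if usage is not growing."""
    slope, intercept = fit_line(days, used_gb)
    if slope <= 0:
        return None
    return (capacity_gb - intercept) / slope


# Hypothetical readings: day index and gigabytes used on that day.
days = [0, 1, 2, 3, 4]
used_gb = [10.0, 12.0, 14.0, 16.0, 18.0]
full_day = day_disk_full(days, used_gb, capacity_gb=100.0)
print(full_day)
```

A single straight-line extrapolation like this gives a point estimate only, which is one reason to want a more robust method when real usage is noisy.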

Here is a quick analysis of the relationship between SAT score and student retention. The data are from the Integrated Postsecondary Education Data System (IPEDS) and were analyzed using R. This was a quick analysis, and I would be careful about drawing any strong conclusions. The source for running this analysis along with some additional graphics that

Today I decided to begin more with visualizations and less with basic statistical analysis for sabermetrics using R. I'm not really here to teach the ins and outs of regressions and statistical tests, so once I get there, I'm hoping that those who have read this already have a decent understanding of those subjects before implementing them. ...

Two short courses, Introduction to Bayesian Analysis and MCMC, and Hierarchical Modelling of Spatial and Temporal Data, by Alan Gelfand (Duke University, USA) and Sujit Sahu (University of Southampton, UK), are to take place in Southampton on June 7-10 this year. Course 1: Introduction to Bayesian Analysis and MCMC. Date: June 7,

It is ever so easy to make blunders when doing quantitative finance. Very popular with novices is to analyze prices rather than returns. Regression on the prices. When you want returns, you should understand log returns versus simple returns. Here we will be randomly generating our “returns” (with R) and we will act as if …
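The difference between log returns and simple returns mentioned above can be made concrete with a small sketch. It is written in Python rather than the R the post uses, the price series is made up, and the function names are hypothetical. It shows the standard definitions: a simple return is the price ratio minus one, and a log return is the natural log of the price ratio.

```python
import math


def simple_returns(prices):
    """Simple return per period: p[t] / p[t-1] - 1."""
    return [p1 / p0 - 1 for p0, p1 in zip(prices, prices[1:])]


def log_returns(prices):
    """Log return per period: ln(p[t] / p[t-1])."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


# Hypothetical price series: up 10 percent, then down 10 percent.
prices = [100.0, 110.0, 99.0]
simple = simple_returns(prices)
logs = log_returns(prices)

# Log returns add up across time; simple returns do not.
total_log = sum(logs)
total_simple = prices[-1] / prices[0] - 1
print(simple, logs, total_log, total_simple)
```

Summing the log returns recovers the log of the overall price ratio, while summing the simple returns here gives roughly zero even though the price fell by 1 percent overall.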