Numbers are useful (I think we can all agree on that..). If you own a smart phone, you can install this runmeter app. When you run, you can take the smartphone with you and activate this app to collect interesting … Continue reading →
Numbers are useful (I think we can all agree on that..). If you own a smart phone, you can install this runmeter app. When you run, you can take the smartphone with you and activate this app to collect interesting … Continue reading →
The best part about writing a dissertation is finding clever ways to procrastinate. The motivation for this blog comes from one of the more creative ways I’ve found to keep myself from writing. I’ve posted about data mining in the past and this post follows up on those ideas using a topic that is relevant 
The R Graph Gallery has been a popular website for many years now. The number of graphics keeps growing as people send me their code. When browsing the website with a mobile device the experience was frustrating, as too much … Continue reading →
How has the distribution of correlations changed over the last several years? Previously Posts about correlation boxplots explained Data Daily returns of 443 large cap US stocks from 2004 through 2012 were used. The sample correlations — almost 98,000 of them — during each year were created. If we were actually using the correlations, then … Continue reading...
JW Emerson, WA Green, B Schloerke, J Crowley, D Cook, H Hofmann, H Wickham (2013) The Generalized Pairs Plot. Journal of Computational and Graphical Statistics 22(1). Here's a free preprint version. Until this new paper and implementation by Emerson et al., there were no widely available pairs plots that accommodated both numerical and categorical fields.
Benford’s law is nowadays extremely popular (see e.g. http://en.wikipedia.org/…). It is usually claimed that, for a given set data set, changing units does not affect the distribution of the first digit. Thus, it should be related to scale invariant distributions. Heuristically, scale (or unit) invariance means that the density of the measure (or probability function) should be proportional to...
If you're laying down a friendly bet on the March Madness games or just tweaking your fantasy roster, this NCAA Data Visualizer by Rodrigo Zamith will be a boon. Just choose two teams to compare head-to-head, choose an attribute to compare them on. You can look at more than a dozen invividual player attributes (e.g. points scored, assists, 3-point...