What makes a hockey Hall-of-Famer?

August 9, 2011

At the JSM conference last week, I stopped by a great poster by Steve Salaga and Brian Mills, graduate students at University of Michigan's Department of Sport Management. The guys were clearly hockey fans, and had channelled their enthusiasm for a sport into an interesting statistical analysis of game and player data from the NHL. One analysis, based on...


Estimate decay of linkage disequilibrium with distance

August 9, 2011

It is well known that linkage disequilibrium (LD) decays with distance. Several functions have been proposed to estimate such decay. Among the most widely used are the Hill and Weir (1) formula for describing the decay of r² and a formula proposed by Abecasis (2) for describing the decay of D’. I wrote R functions


Forecasting recessions

August 9, 2011

John Hussman has a Recession Warning Composite that I am attempting to replicate/improve. The underlying data seems to be easy enough to get from FRED using the quantmod package in R. I don't quite understand the index Hussman is using for commercial...


The indices understate the carnage

August 9, 2011

The first 6 trading days of August have been bad for the major indices, but how variable is that across portfolios? To answer that, two sets of random portfolios were generated from the constituents of the S&P 500. The trading days are August 1 to 5 and August 8, 2011. The returns of the indices for …


Blog planets are like conferences… (aka R-bloggers.com)

August 8, 2011

Blog planets are websites that aggregate blog feeds around a particular topic or project. The name probably comes from one of the first implementations, the Planet software. These planets are like conferences, rather than journals. Like conferences with...


Installing Rmpi with OpenMPI on Mac OS X Lion

August 8, 2011

For whatever reason, Apple decided not to include OpenMPI in Mac OS X Lion (it was supported in Leopard and Snow Leopard). I found this out the hard way after doing a clean install of Lion. Here are steps to install OpenMPI and get it working with the Rmpi package in R. One benefit of


How ANZ uses R for credit risk analysis

August 8, 2011

At last month's R user group meeting in Melbourne, the theme was "Experiences with using SAS and R in insurance and banking". There, Hong Ooi from ANZ (Australia and New Zealand Banking Group) gave a presentation on "Experiences with using R in credit risk". I didn't get to see the presentation myself, but the slides tell a great story...


FII and DII turnover with effect on Nifty Downloader

August 8, 2011

My thirst for statistics has been increasing. IV had another requirement, which would eventually be useful to me as well. He currently downloads FII and DII buy and sell values, and their impact on Nifty, manually in Excel. He suggested that I try to autom...


Power of running world records

August 8, 2011

Following a few entries on sports here and there, I was wondering what kind of law running records follow with respect to distance. The data are available on Wikipedia, or here for a tidied version. They cover 18 distances, from 100 meters to 100 kilometers. A log-log scale is in order: It is nice
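
As a rough sketch of what fitting such a law on a log-log scale could look like in R, the snippet below regresses log time on log distance. The distances and times here are synthetic, generated from an assumed power law (they are not the actual world records, and `a_true` and `b_true` are arbitrary illustrative values), so it only shows the mechanics:

```r
# Synthetic illustration only: these are NOT real world-record times.
# Times are simulated from an assumed power law: time = a * distance^b.
set.seed(1)
distance <- c(100, 200, 400, 800, 1500, 5000, 10000, 42195, 100000)  # metres
a_true <- 0.05   # hypothetical scale constant
b_true <- 1.1    # hypothetical exponent
time <- a_true * distance^b_true * exp(rnorm(length(distance), sd = 0.02))

# On a log-log scale a power law is a straight line:
#   log(time) = log(a) + b * log(distance)
fit <- lm(log(time) ~ log(distance))

a_hat <- exp(unname(coef(fit)[1]))  # back-transform the intercept
b_hat <- unname(coef(fit)[2])       # the slope estimates the exponent

print(c(a = a_hat, b = b_hat))
```

With real record data, an exponent above 1 would mean that average speed drops as the distance grows.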


Slides from Rocky Mtn SABR Meeting

August 8, 2011

Last Saturday I had the good fortune to present a talk on finding, gathering, and analyzing some sports-related data on the web at the local SABR group meeting. In case you’re not familiar with the “SABR” acronym, it stands for …


Two-Way PERMANOVA (with Vegan-Function adonis) Using Customized Contrasts

August 8, 2011

...say you have a multivariate dataset and a two-way factorial design - you do a PERMANOVA and the aov-table (adonis is using ANOVA or "sum"-contrasts) tells you there is an interaction - how to proceed when you want to go deeper into the ana...


The Open Governing Index: How open is the R project?

August 8, 2011

The Open Governing Index is a new measure, developed by VisionMobile, that rates open-source projects on their governance process. The index has four facets, described thoroughly in the "Open Governance Index" publication, and briefly below. access - These criteria assess the availability of source code, a permissive license, developer support mechanisms, a roadmap, and openness


Win-Vector starts submitting content to r-bloggers.com

August 8, 2011

We have been consistently impressed by and enjoyed the wealth of R wisdom available on the R-bloggers aggregation site. Therefore Win-Vector LLC is granting the right to reformat and redistribute (with attribution and link) our blog’s R content in the R-bloggers site and feeds. We hope to see our R content shared through this network.


(#ESA11) rOpenSci: a collaborative effort to develop R-based tools for facilitating Open Science

August 8, 2011

Our development team would like to announce the launch of rOpenSci. As the title states, this project aims to create R packages to make open science more available to researchers. http://ropensci.org/ What this means is t...


Trading volume forecast for an illiquid stock

August 8, 2011

When dealing with transaction cost analysis, a stock’s volume is assumed to be stable or foreseeable. However, the picture is different when we are dealing with an illiquid stock. It is relatively easy to forecast the volume of a liquid stock, because trading volume has high autocorrelation: the volumes at t and t+1 are correlated. For
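
To make the autocorrelation point concrete, here is a minimal sketch in base R. It assumes, purely for illustration, that the log-volume of a liquid stock behaves like an AR(1) process. The series is simulated (the coefficient 0.8 and the level 10 are arbitrary, not estimates from any real stock), and the variable names are hypothetical:

```r
# Simulated data only: an AR(1) series stands in for the log-volume of a
# liquid stock. The AR coefficient (0.8) and the level (10) are arbitrary.
set.seed(123)
log_volume <- 10 + arima.sim(model = list(ar = 0.8), n = 500)

# Lag-1 autocorrelation: how strongly volume at t relates to volume at t+1.
# The first element of $acf is lag 0 (always 1), so lag 1 is the second.
lag1 <- acf(log_volume, lag.max = 1, plot = FALSE)$acf[2]

# Fit an AR(1) model and produce a one-step-ahead forecast.
fit <- arima(log_volume, order = c(1, 0, 0))
next_log_volume <- as.numeric(predict(fit, n.ahead = 1)$pred)

print(c(lag1_autocorrelation = lag1, forecast = next_log_volume))
```

For an illiquid stock the lag-1 autocorrelation would typically be much weaker, which is why a simple model of this kind is less useful there.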


R at Wikimania

August 8, 2011

Wikimania 2011 came to a close yesterday. For those of you unfamiliar with Wikimania it may be described as UseR for Wikipedia, Wikimedia and MediaWiki all rolled into one. The conference brings together staff, volunteer editors, volunteer developers and users of MediaWiki projects. Of specific interest to R Bloggers readers may be the sessions on…


RghcnV3 2.0

August 7, 2011

Well, version 2.0 is in the can and I’ll be uploading to CRAN over the next couple of days. Let’s go over the highlights. Prior to version 2.0 we had basically 3 kinds of data flowing around the package: V3 14 column format, zoo objects and mts objects. The 14 column format has always been


Meta-analysis

August 7, 2011

Introduction. Effect estimation is an important task in modern research. An example is the identification of risk factors for disease and the qualification of medical treatments. Usually, researchers are interested in estimating the global, common effect. Since actual effects tend to differ across populations, estimates based on a sample from a particular population seldom generalize well.


Usability

August 7, 2011

Usability. I am not an expert in Human-Computer Interaction (HCI) at all. Worse, I typically make the crappiest-looking interfaces. So, that's said. Usability. Wikipedia writes that "usability is the ease of use and learnability of a ...


R popularity – steady growth and New York Times

August 6, 2011

I have just come up with an idea for how to test the Wikipedia search traffic visualisation functions that I wrote about in my previous post. I decided to check whether R is really gaining popularity that fast. ar <- wikiStat("R_(programming_language)", …


Fitting mixture distributions with the R package mixtools


My last two posts have been about mixture models, with examples to illustrate what they are and how they can be useful.  Further discussion and more examples can be found in Chapter 10 of Exploring Data in Engineering, the Sciences, and Medicine.  One important topic I haven’t covered is how to fit mixture models to datasets like the Old Faithful geyser...


Visualising Wikipedia search statistics with R

August 6, 2011

I have been playing with R to parse HTML. After reading about visualising “fantasy football” search traffic with RGoogleTrends at The Log Cabin blog, I decided to write a few functions to do similar things with Wikipedia search statistics. This …


Programmers Should Know R

August 6, 2011

Programmers should definitely know how to use R. I don’t mean they should switch from their current language to R, but they should think of R as a handy tool during development. Again and again I find myself working with Java code like the following.


Number of components in a mixture

August 5, 2011

I got a paper (unavailable online) to referee about testing for the order (i.e. the number of components) of a normal mixture. Although this is an easily stated problem, namely estimating k in a normal mixture with k components, I came to the conclusion that it is a kind of ill-posed problem. Without a clear definition of what a component is,


Upcoming R training classes, live from the experts

August 5, 2011

Revolution Analytics is hosting several hands-on R training classes over the next few months, with in-person instruction from two leading package authors and experts from the R community. Diethelm Würtz from ETH Zurich will give a two-day master class on Portfolio Selection and Optimization in Practice. Prof Würtz leads the Rmetrics project, and will provide in-depth instruction on using...


R as a cure for ‘mindless statistics’?

August 5, 2011

Several years ago Gerd Gigerenzer wrote: “Statistical rituals largely eliminate statistical thinking in the social sciences. Rituals are indispensable for identification with social groups, but they should be the subject rather than the procedure of science. Statistical rituals largely eliminate …


Positive coefficient regression in R

August 5, 2011

Ever have a regression model where the coefficients don't make sense? I've been trying to predict electricity and gas consumption from daily activity schedules but a simple linear regression kept saying that demands should go down the more an activity is performed. Fortunately I found the nnls package and show here how you can use it to...
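
The post refers to the nnls package (its `nnls(A, b)` function solves the non-negative least squares problem). As a dependency-free sketch of the same idea, the snippet below constrains coefficients to be non-negative using base R's `optim` with box constraints. This is a swapped-in technique, not the post's own code. The data are simulated and the activity names (`cooking`, `heating`, `lighting`) are hypothetical:

```r
# Simulated data only: three hypothetical activities and a demand series.
set.seed(42)
n <- 200
X <- cbind(cooking = runif(n), heating = runif(n), lighting = runif(n))
beta_true <- c(2, 0, 1.5)                  # arbitrary non-negative effects
y <- drop(X %*% beta_true) + rnorm(n, sd = 0.1)

# Residual sum of squares for a candidate coefficient vector.
rss <- function(beta) sum((y - X %*% beta)^2)

# Minimise the RSS subject to beta >= 0 (L-BFGS-B supports box constraints).
fit <- optim(par = rep(1, ncol(X)), fn = rss,
             method = "L-BFGS-B", lower = 0)
beta_hat <- setNames(fit$par, colnames(X))

print(round(beta_hat, 3))
```

Unlike an unconstrained `lm` fit, no coefficient here can come out negative, so an activity can never be estimated to reduce demand.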


More on JSM

August 5, 2011

While my time at the 2011 Joint Statistical Meetings was short (I unfortunately missed some presentations I would have liked to attend), it was a great experience. The collection of academics and professionals is very different from the other con...


Image Data from ImageJ to R and Vice Versa

August 5, 2011

In recent years many R packages have been developed to enable image analysis in R. As an alternative, the combination of R with powerful image analysis software such as ImageJ offers many advanced image analysis interfaces and algorithms not yet available in R. Bio7 integrates both applications in a Rich Client Platform based on Eclipse
