Climate datasets in R

August 10, 2011

As an ecologist working on climate change questions, I’ve always found it rather tedious to acquire and process climate data, especially when dealing with large spatiotemporal scales. Although many agencies provide free access to climate data, there is often some overhead (typically one to two days) before the data are made available for download via

Build multiarch R (32 bit and 64 bit) on Debian/Ubuntu

I have the 64 bit version of R compiled from source on my Ubuntu laptop. I recently had a need for R based on 32 bit since a package I

Error : package does not have a name space

August 10, 2011

Here is a frustrating knot I am beginning to unravel. R has a namespace structure to help reduce errors resulting from functions, methods, objects, and classes having the same name in two different packages. Makes sense. Those R objects that are …

In case you missed it: July Roundup

August 10, 2011

In case you missed them, here are some articles from July of particular interest to R users. A simulation in R finds the value (or disadvantage) of drawing an X, J, Q or Z in Scrabble. How to display high-quality graphics on the web using SVG output from R. A review of Paul Murrell's talk about raster image support...

Bayes factors and martingales

August 10, 2011

A surprising paper came out in the last issue of Statistical Science, linking martingales and Bayes factors. In the historical part, the authors (Shafer, Shen, Vereshchagin and Vovk) recall that martingales were popularised by Martin-Löf, who is also influential in the theory of algorithmic randomness. A property of test martingales (i.e., martingales that are non

SNA: Visualising an email box with R

August 10, 2011

Are statistics sexy? Visualising social networks certainly is! I wrote a little function, which makes producing beautiful plots depicting a mailbox with R an extremely easy task. I find visualisations of ‘social graphs’ particularly appealing. They look like flowers. I …

Dump MySQL to CSV using R

August 10, 2011

Based on a related post on one of my favorite python-lists, I remembered that I wrote a similar snippet some time ago. So if you want to dump your whole MySQL database to csv-files you can recycle the following code:

require(RMySQL)
m <- MySQL()
summary(m)
con <- dbConnect(m, dbname
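Since the teaser cuts off before the dump loop, here is a minimal sketch of one way the idea could be completed. It is not the original post's code: the helper name `dump_tables_to_csv`, its arguments, and the connection details in the comments are all hypothetical. The table reader is passed in as a function so the helper can be tried without a live database.

```r
# Hypothetical helper (not from the original post): write each table to its
# own CSV file. `read_table` is any function that maps a table name to a
# data frame, e.g. a thin wrapper around DBI's dbReadTable().
dump_tables_to_csv <- function(tables, read_table, out_dir = ".") {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  paths <- character(0)
  for (tbl in tables) {
    path <- file.path(out_dir, paste0(tbl, ".csv"))
    write.csv(read_table(tbl), file = path, row.names = FALSE)
    paths <- c(paths, path)
  }
  invisible(paths)  # return the written file paths without printing them
}

# With a live MySQL connection it might be called like this (not run here;
# the database name and credentials are placeholders):
#   library(RMySQL)
#   con <- dbConnect(MySQL(), dbname = "mydb", user = "me", password = "secret")
#   dump_tables_to_csv(dbListTables(con),
#                      function(tbl) dbReadTable(con, tbl),
#                      out_dir = "dump")
#   dbDisconnect(con)
```

Passing the reader as a function keeps the file-writing logic independent of the database driver, so the same helper would work with any DBI backend.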

Using the google prediction API from R

August 10, 2011

Google has a "black box" prediction API that they provide for use with creating recommender systems or filtering spam. Furthermore, they provide an R package for interfacing with that API, but try as I might I cannot get it to work under Windows. Here are ...

Plotting molecular properties for (sub)sets

August 10, 2011

For a toxicology paper we are writing up, I need to create a few plots showing how the toxic and non-toxic molecules differ (or not) with respect to a few molecular properties, such as logP or the molecular weight. The rcdk package provides all, of cou...

A 60-second survey for R users

August 10, 2011

I'm doing a little research to validate estimates of the size of the R user community. If you're an R user, please take a minute to complete this three-question survey on R usage at your organization. Thanks in advance. Revolution Analytics: R user base survey

Informational Easing: A Change In F.O.M.C. Expectations

August 10, 2011

Let's analyze the latest FOMC policy move. The FOMC met yesterday and changed up the communications strategy. How so? Well, until yesterday the statement has been saying as of June 22, 2011: "The Committee continues to anticipat...

Scraping web data in R

August 10, 2011

In my last post, I went through a lot of effort to scrape the PMI index off the ISM website. It turns out that was unnecessary effort, as commentator "senne" pointed out that this index is available from FRED, with the symbol NAPM. ...

Using a “pure infographic” to explore differences between information visualization and statistical graphics

August 10, 2011

Our discussion on data visualization continues. On one side are three statisticians: Antony Unwin, Kaiser Fung, and myself. We have been writing about the different goals served by information visualization and statistical graphics. On the other side are graphics experts (sorry for the imprecision, I don’t know exactly what these people do in their day jobs...

Multiple cores in R, revisited

August 10, 2011

The bigmemory package in combination with doMC provides at least a partial solution for sharing a large data set across multiple cores in R. With this solution you can work on the same matrix using several threads. It is also a very scalable solution. ...

Coding, GUIs and Statistical Rituals

August 10, 2011

I was recently inspired to comment on this blog post, asking if R is a cure for ‘mindless statistics’. Anyone who’s familiar with statistics used in applied fields like epidemiology, sociology, and the social sciences generally will be familiar with the idea of a ‘statistical ritual’. Rather than think about the proper statistical approach to every question,

What do you want to see at useR 2012?

August 9, 2011

This year's useR! conference at Warwick University is less than a week away, but planning is already underway for useR! 2012, to be held at Vanderbilt University in Nashville. If you're planning to attend, conference organizer Frank Harrell is looking for your input: The 2012 R User Conference - useR! 2012 - will be held in Nashville Tennessee USA,...

Amazon Machine Image Created With RTextTools Pre-installed

We recently created an AMI for Amazon's EC2 cloud computing service. Users with AWS accounts can access the public AMI by searching ami-817eb8e8. The AMI is based off of Drew Conway's excellent AMI, but with R 2.13 loaded and RTextTools and

What makes a hockey Hall-of-Famer?

August 9, 2011

At the JSM conference last week, I stopped by a great poster by Steve Salaga and Brian Mills, graduate students at University of Michigan's Department of Sport Management. The guys were clearly hockey fans, and had channelled their enthusiasm for a sport into an interesting statistical analysis of game and player data from the NHL. One analysis, based on...

Estimate decay of linkage disequilibrium with distance

August 9, 2011

It is well known that linkage disequilibrium (LD) decays with distance. Several functions have been proposed to estimate such decay. Among the most widely used are the Hill and Weir (1) formula for describing the decay of r² and a formula proposed by Abecasis (2) for describing the decay of D’. I wrote R functions
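As a rough illustration of fitting such a decay curve, the sketch below codes the expectation of r² usually attributed to Hill and Weir, written from memory with C = 4Nc as the population recombination parameter and n as the number of sampled chromosomes; check it against the cited reference before relying on it. The function name `hill_weir_r2`, the synthetic data, and the value of rho are assumptions for illustration and are not the functions from the original post.

```r
# Expected r^2 as a function of C = 4 * N * c and sample size n.
# Formula as commonly quoted for Hill and Weir; verify before real use.
hill_weir_r2 <- function(C, n) {
  ((10 + C) / ((2 + C) * (11 + C))) *
    (1 + ((3 + C) * (12 + 12 * C + C^2)) / (n * (2 + C) * (11 + C)))
}

# Synthetic, noise-free example: rho_true is an assumed per-base-pair value,
# not an estimate from real data.
rho_true <- 0.002
n <- 100
dist_bp <- seq(100, 10000, by = 100)
r2_obs <- hill_weir_r2(rho_true * dist_bp, n)

# Least-squares estimate of rho by one-dimensional minimisation.
# optimize() is used instead of nls() because nls() is documented to
# misbehave on artificial zero-residual data like this.
sse <- function(rho) sum((r2_obs - hill_weir_r2(rho * dist_bp, n))^2)
rho_hat <- optimize(sse, interval = c(1e-5, 0.1), tol = 1e-10)$minimum
```

With real, noisy r² estimates, `nls()` with a sensible starting value for rho would be the more usual choice.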

Forecasting recessions

August 9, 2011

John Hussman has a Recession Warning Composite that I am attempting to replicate/improve. The underlying data seems to be easy enough to get from FRED using the quantmod package in R. I don't quite understand the index Hussman is using for commercial...

The indices understate the carnage

August 9, 2011

The first 6 trading days of August have been bad for the major indices, but how variable is that across portfolios? To answer that, two sets of random portfolios were generated from the constituents of the S&P 500. The trading days are 2011 August 1 — 5 and 8. The returns of the indices for …

Blog planets are like conferences… (aka R-bloggers.com)

August 8, 2011

Blog planets are websites that aggregate blog feeds around a particular topic or project. They are probably named after one of the first implementations, the Planet software. These planets are like conferences, rather than journals. Like conferences with...

Installing Rmpi with OpenMPI on Mac OS X Lion

August 8, 2011

For whatever reason, Apple decided not to include OpenMPI in Mac OS X Lion (it was supported in Leopard and Snow Leopard). I found this out the hard way after doing a clean install of Lion. Here are steps to install OpenMPI and get it working with the Rmpi package in R. One benefit of

How ANZ uses R for credit risk analysis

August 8, 2011

At last month's R user group meeting in Melbourne, the theme was "Experiences with using SAS and R in insurance and banking". There, Hong Ooi from ANZ (Australia and New Zealand Banking Group) gave a presentation on "Experiences with using R in credit risk". I didn't get to see the presentation myself, but the slides tell a great story...

FII and DII turnover with effect on Nifty Downloader

August 8, 2011

My thirst for statistics has been increasing. IV had another requirement, which would eventually be useful to me as well. He currently downloads FII and DII buy and sell values and their impact on Nifty manually in Excel. He suggested that I try and autom...

Power of running world records

August 8, 2011

Following a few entries on sports here and there, I was wondering what kind of law the running records follow with respect to distance. The data are available on Wikipedia, or here for a tidied version. It collects 18 distances, from 100 meters to 100 kilometers. A log-log scale is in order: It is nice
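To make the log-log idea concrete, here is a small sketch of how a power law of the form time = a * distance^b can be estimated by linear regression on the log scale. The data are synthetic and generated from assumed values of a and b, so nothing below reflects the actual world records or the result of the original post; the nine distances are illustrative, not the 18 in the post's dataset.

```r
# Synthetic data only: times are generated from an assumed power law so the
# fitted coefficients can be compared with known values.
a <- 0.05   # assumed prefactor, not estimated from real records
b <- 1.1    # assumed exponent, not estimated from real records
distance_m <- c(100, 200, 400, 800, 1500, 5000, 10000, 42195, 100000)
time_s <- a * distance_m^b

# On a log-log scale the power law becomes a straight line:
#   log(time) = log(a) + b * log(distance)
fit <- lm(log(time_s) ~ log(distance_m))
exponent  <- unname(coef(fit)[2])        # slope estimates b
prefactor <- unname(exp(coef(fit)[1]))   # exp(intercept) estimates a
```

With real record data, plotting `log(time_s)` against `log(distance_m)` and inspecting the residuals would show how well a single power law holds across the whole range.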

Slides from Rocky Mtn SABR Meeting

August 8, 2011

Last Saturday I had the good fortune to present a talk on finding, gathering, and analyzing some sports-related data on the web at the local SABR group meeting. In case you’re not familiar with the “SABR” acronym, it stands for …

Two-Way PERMANOVA (with Vegan-Function adonis) Using Customized Contrasts

August 8, 2011

...say you have a multivariate dataset and a two-way factorial design - you do a PERMANOVA and the aov-table (adonis is using ANOVA or "sum"-contrasts) tells you there is an interaction - how to proceed when you want to go deeper into the ana...

The Open Governing Index: How open is the R project?

August 8, 2011

The Open Governing Index is a new measure developed by VisionMobile that rates open-source projects on their governance process. The index has four facets, described thoroughly in the "Open Governance Index" publication, and briefly below. access - These criteria assess the availability of source code, a permissive license, developer support mechanisms, a roadmap, and openness
