Using OpenMP-ized C code with R

August 11, 2011

What is OpenMP? Basically, it is a standard compiler extension that lets one easily distribute calculations over multiple processors in a shared-memory manner (this is especially important when dealing with large data: a simple separate-process approach usually requires as many copies of the working data as there are threads, and this may easily be overkill even

Test Difference between Two Proportions & Plot Confidence Intervals

August 11, 2011

An illustrative example of testing proportions and presenting the results. The data: the number of indigenous and alien plant species with and without vegetative reproduction. Hypothesis: The proportion of species with vegetative reproduction is differen...
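
A test of this kind can be sketched in a few lines of base R. This is only a minimal sketch, not the post's own code: the counts below are invented for illustration, and the names `veg`, `total` and `res` are hypothetical.

```r
# Hypothetical counts, invented for illustration only (not the post's data)
veg   <- c(indigenous = 60, alien = 30)    # species with vegetative reproduction
total <- c(indigenous = 100, alien = 80)   # all species in each group

# Two-sample test for equality of proportions (stats package, base R)
res <- prop.test(x = veg, n = total)

res$estimate   # the two sample proportions
res$conf.int   # confidence interval for the difference in proportions
res$p.value    # p-value for the null hypothesis of equal proportions
```

The `conf.int` component is the interval one would typically plot when presenting the difference between the two groups.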

ggplot2: Determining the order in which lines are drawn

August 11, 2011

In a time series, I want to plot the values of an interesting cluster versus the background. However, if I'm not careful, ggplot will draw the items in an order determined by their name, so background items will obscure the interesting cluster: Corr...

Build multiarch R (32 bit and 64 bit) on Debian/Ubuntu

August 11, 2011

I have the 64 bit version of R compiled from source on my Ubuntu laptop. I recently needed a 32 bit build of R, since a package I needed to compile and use only works in 32 bit. I thought it was readily available on Ubuntu since both 32 bit and 64 bit...

Plotting Cumulative FII & DII Inflow against Nifty spot index

August 10, 2011

Now that I have the FII and DII Inflow along with the Nifty Index, I tried my hand at charting! The objective is to plot the Cumulative FII, DII and Net Inflow and the Nifty Index for the current year (2011) on a single graph. This post takes inspiration from Deep...

Analysis of Japanese Earthquakes Data

August 10, 2011

The latest version is here. Earthquakes have been drawing much more attention since the Touhoku earthquake occurred on March 11, 2011. Earthquake data are released at Japan Weather Association's Tenki.jp. We can get these variables from the site: Date, Time; Area; Lat...

Climate datasets in R

August 10, 2011

As an ecologist working on climate change questions, I’ve always found it rather tedious to acquire and process climate data, especially when dealing with large spatiotemporal scales. Although many agencies provide free access to climate data, there is often some overhead (typically one to two days) before the data are made available for download via

Error : package does not have a name space

August 10, 2011

Here is a frustrating knot I am beginning to unravel. R has a namespace structure to help reduce errors resulting from functions, methods, objects, classes having the same name from two different packages. Makes sense. Those R objects that are …

In case you missed it: July Roundup

August 10, 2011

In case you missed them, here are some articles from July of particular interest to R users. A simulation in R finds the value (or disadvantage) of drawing an X, J, Q or Z in Scrabble. How to display high-quality graphics on the web using SVG output from R. A review of Paul Murrell's talk about raster image support...

Bayes factors and martingales

August 10, 2011

A surprising paper came out in the last issue of Statistical Science, linking martingales and Bayes factors. In the historical part, the authors (Shafer, Shen, Vereshchagin and Vovk) recall that martingales were popularised by Martin-Löf, who is also influential in the theory of algorithmic randomness. A property of test martingales (i.e., martingales that are non

SNA: Visualising an email box with R

August 10, 2011

Are statistics sexy? Visualising social networks certainly is! I wrote a little function, which makes producing beautiful plots depicting a mailbox with R an extremely easy task. I find visualisations of ‘social graphs’ particularly appealing. They look like flowers. I …

Dump MySQL to CSV using R

August 10, 2011

A related post on one of my favorite Python lists reminded me that I wrote a similar snippet some time ago. So if you want to dump your whole MySQL database to CSV files, you can recycle the following code (mysql2cvs.R):

    require(RMySQL)
    m <- MySQL()
    summary(m)
    con <- dbConnect(m, dbname

Using the google prediction API from R

August 10, 2011

Google has a "black box" prediction API that they provide for creating recommender systems or filtering spam. Furthermore, they provide an R package for interfacing with that API, but try as I might I cannot get it to work under Windows. Here are ...

Plotting molecular properties for (sub)sets

August 10, 2011

For a toxicology paper we are writing up, I need to create a few plots showing how the toxic and non-toxic molecules differ (or not) with respect to a few molecular properties, such as logP or the molecular weight. The rcdk package provides all, of cou...

A 60-second survey for R users

August 10, 2011

I'm doing a little research to validate estimates of the size of the R user community. If you're an R user, please take a minute to complete this three-question survey on R usage at your organization. Thanks in advance. Revolution Analytics: R user base survey

Informational Easing: A Change In F.O.M.C. Expectations

August 10, 2011

Let's analyze the latest FOMC policy move. The FOMC met yesterday and changed up the communications strategy. How so? Well, until yesterday the statement had been saying, as of June 22, 2011: "The Committee continues to anticipat...

Scraping web data in R

August 10, 2011

In my last post, I went through a lot of effort to scrape the PMI index off the ISM website. It turns out that was unnecessary effort, as commentator "senne" pointed out that this index is available from FRED, with the symbol NAPM. ...

Using a “pure infographic” to explore differences between information visualization and statistical graphics

August 10, 2011

Our discussion on data visualization continues. On one side are three statisticians: Antony Unwin, Kaiser Fung, and myself. We have been writing about the different goals served by information visualization and statistical graphics. On the other side are graphics experts (sorry for the imprecision, I don’t know exactly what these people do in their day jobs...

Multiple cores in R, revisited

August 10, 2011

The bigmemory package in combination with doMC provides at least a partial solution for sharing a large data set across multiple cores in R. With this solution you can work on the same matrix using several threads. It is also a very scalable solution. ...

Coding, GUIs and Statistical Rituals

August 10, 2011

I was recently inspired to comment on this blog post, asking if R is a cure for ‘mindless statistics’. Anyone who's familiar with statistics as used in applied fields like epidemiology, sociology, and the social sciences generally will be familiar with the idea of a ‘statistical ritual’. Rather than think about the proper statistical approach to every question,

What do you want to see at useR 2012?

August 9, 2011

This year's useR! conference at Warwick University is less than a week away, but planning is already underway for useR! 2012, to be held at Vanderbilt University in Nashville. If you're planning to attend, conference organizer Frank Harrell is looking for your input: The 2012 R User Conference - useR! 2012 - will be held in Nashville Tennessee USA,...

Amazon Machine Image Created With RTextTools Pre-installed

We recently created an AMI for Amazon's EC2 cloud computing service. Users with AWS accounts can access the public AMI by searching ami-817eb8e8. The AMI is based off of Drew Conway's excellent AMI, but with R 2.13 loaded and RTextTools and

What makes a hockey Hall-of-Famer?

August 9, 2011

At the JSM conference last week, I stopped by a great poster by Steve Salaga and Brian Mills, graduate students at University of Michigan's Department of Sport Management. The guys were clearly hockey fans, and had channelled their enthusiasm for a sport into an interesting statistical analysis of game and player data from the NHL. One analysis, based on...

Estimate decay of linkage disequilibrium with distance

August 9, 2011

It is well known that linkage disequilibrium (LD) decays with distance. Several functions have been proposed to estimate such decay. Among the most widely used are the Hill and Weir (1) formula for describing the decay of r² and a formula proposed by Abecasis (2) for describing the decay of D’. I wrote R functions

Forecasting recessions

August 9, 2011

John Hussman has a Recession Warning Composite that I am attempting to replicate/improve. The underlying data seems to be easy enough to get from FRED using the quantmod package in R. I don't quite understand the index Hussman is using for commercial...

The indices understate the carnage

August 9, 2011

The first 6 trading days of August have been bad for the major indices, but how variable is that across portfolios? To answer that, two sets of random portfolios were generated from the constituents of the S&P 500. The trading days are 2011 August 1 to 5 and 8. The returns of the indices for …

Blog planets are like conferences… (aka R-bloggers.com)

August 8, 2011

Blog planets are websites that aggregate blog feeds around a particular topic or project. They are probably named after one of the first implementations, the Planet software. These planets are like conferences, rather than journals. Like conferences with...

Installing Rmpi with OpenMPI on Mac OS X Lion

August 8, 2011

For whatever reason, Apple decided not to include OpenMPI in Mac OS X Lion (it was supported in Leopard and Snow Leopard). I found this out the hard way after doing a clean install of Lion. Here are steps to install OpenMPI and get it working with the Rmpi package in R. One benefit of
