## Loading Big (ish) Data into R

November 24, 2009

So for the rest of this conversation big data == 2 Gigs. Done. Don’t give me any of this ‘that’s not big, THIS is big’ shit. There now, on with the cool stuff: This week on twitter Vince Buffalo asked about loading a 2 gig comma separated file (csv) into R (OK, he asked about tab

## ESS on Mac OS X

November 24, 2009

One of the search terms that frequently bring people to my site is "install ESS on Mac OS X" or something like that. As it turns out, installing ESS on OS X is really easy, but a Google search does not really bring up good instructions. There are at least two easy options: Use Aquamacs, it comes bundled with...

## NYT: SAS threatened by R

November 23, 2009

The New York Times had an interesting piece yesterday about how SAS is facing several business threats from companies like the recently IBM-acquired SPSS, and from burgeoning interest in open-source software like R.  The NYT ran an entire article about R earlier this year, and this article discusses how SAS has been revamping their technology to work seamlessly with...

## RQuantlib

November 23, 2009

QuantLib is a free library for modeling, trading, and risk management in real life, providing a comprehensive software framework for quantitative finance. It is written in C++, which might be inconvenient for some users. JQuantLib aiming at Java-fans i...

## Memory Management in R: A Few Tips and Tricks

November 23, 2009

This post discusses a few strategies that I have used to manage memory in R. Stack Overflow Tips: Stack Overflow has a thread on Memory Management Tricks. I tend to follow these suggestions: .ls.objects(): There's a nice function (.ls.objects...
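
The excerpt is cut off before it shows the function itself, so the following is only a rough sketch of the general idea (listing objects by how much memory they occupy) using base R. The helper name `list_object_sizes` and the demo environment are assumptions made for illustration, not the function from the Stack Overflow thread.

```r
# Hypothetical helper: report the size in bytes of every object in an
# environment, largest first. Uses only base R / utils.
list_object_sizes <- function(env = globalenv()) {
  nms <- ls(envir = env)
  sizes <- vapply(
    nms,
    function(nm) as.numeric(object.size(get(nm, envir = env))),
    numeric(1)
  )
  sort(sizes, decreasing = TRUE)
}

# Demo in a throwaway environment so the global workspace is left alone.
demo_env <- new.env()
assign("small", 1:10, envir = demo_env)
assign("big", numeric(1e5), envir = demo_env)

sizes <- list_object_sizes(demo_env)
print(sizes)
```

Once the largest objects are known, they can be dropped with `rm()` followed by `gc()` when they are no longer needed.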

## Type II Error

November 22, 2009

In hypothesis testing, a type II error is the failure to reject an invalid null hypothesis. The probability of avoiding a type II error is called the power of the hypothesis test, and is denoted by the quantity 1 - β.
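
To make β and 1 - β concrete, here is a minimal sketch for one simple case: a one-sided z-test of H0: μ = μ0 against H1: μ > μ0 with known standard deviation. The function name `type2_error` and the example numbers are assumptions chosen for illustration; they do not come from the original post.

```r
# Probability of a type II error (beta) for a one-sided z-test.
# delta: true shift mu - mu0, sigma: known standard deviation,
# n: sample size, alpha: significance level.
type2_error <- function(delta, sigma, n, alpha = 0.05) {
  z_crit <- qnorm(1 - alpha)               # rejection threshold on the z scale
  pnorm(z_crit - delta * sqrt(n) / sigma)  # P(fail to reject | true shift delta)
}

beta  <- type2_error(delta = 0.5, sigma = 1, n = 25)
power <- 1 - beta
print(c(beta = beta, power = power))
```

When `delta` is 0 the null hypothesis is actually true, and the function returns 1 - α, the probability of correctly not rejecting it.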

## Some sort of update to ggplot2

November 22, 2009

Jeroen Ooms writes: Here's a first version of a new web application for exploratory graphical analysis. It attempts to implement the layered graphics from the R package ggplot2 in a user-friendly way. This two-minute demo video demonstrates a ...

## new R package : highlight

November 22, 2009

I finally pushed highlight to CRAN, which should be available in a few days. The package uses the information gathered by the parser package to perform syntax highlighting of R code. The main function of the package is highlight, which takes a numb...

## R examine objects tutorial

November 21, 2009

This article is a quick concrete example of how to use the techniques from Survive R to lower the steepness of The R Project for Statistical Computing’s learning curve (so an apology to all readers who are not interested in R). What follows is for people who already use R and want to achieve more control

## My implementation of Berry and Berry’s hierarchical Bayes algorithm for adverse events

November 20, 2009

I've been working on this for quite some time (see here for a little background), so I'm pleased that it looks close to done at least as far as the core algorithm. It uses global variables for now, and I'm sure there are a couple of other bugs lurking, but here it is, after the jump. const.sqrt2pi <-...

## Mapping Biomes

November 20, 2009

Recently (2008) the European Space Agency produced GlobCover (ESA GlobCover Project, led by MEDIAS-France), the highest resolution (300m) global land cover map to date. GlobCover uses 21 primary land cover classes and many more sub-classes. Land cover classification (LCC) schemes divide the earth into biomes. Biomes are the simplest way to classify vegetation which can

## Working on a drug safety project

November 20, 2009

In order to move some of my personal interests along, I have been trying to implement the methodology found in Berry and Berry's article Accounting for Multiplicities in Assessing Drug Safety. This methodology uses the MedDRA hierarchy to improve the p...

## Tactical asset allocation using blotter

November 18, 2009

blotter is an R package that tracks the P&L of your trading systems (or simulations), even if your portfolio spans many security types and/or currencies. This post uses blotter to track a simple two-ETF trading system. The contents of this post b...

## Design of Experiments – Power Calculations

November 18, 2009

Prior to conducting an experiment researchers will often undertake power calculations to determine the sample size required in their work to detect a meaningful scientific effect with sufficient power. In R there are functions to calculate either a minimum sample size for a specific power for a test or the power of a test for
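
As an illustration of the two directions mentioned above, here is a short sketch using `power.t.test()` from the base `stats` package. The effect size, standard deviation, and sample size are made-up assumptions, and the original post may well use different functions or designs.

```r
# Direction 1: minimum sample size per group for a two-sample t-test
# to detect a difference of 0.5 (sd = 1) with 80% power at alpha = 0.05.
n_needed <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)
n_per_group <- ceiling(n_needed$n)

# Direction 2: power achieved when the sample size is fixed at 30 per group.
achieved <- power.t.test(n = 30, delta = 0.5, sd = 1, sig.level = 0.05)
achieved_power <- achieved$power

print(n_per_group)
print(achieved_power)
```

Exactly one of `n`, `delta`, `sd`, `sig.level`, and `power` is left unspecified in each call, and `power.t.test()` solves for that quantity.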

## Confidence we seek…

November 18, 2009

Estimating a proportion at first looks elementary. Hail to asymptotics, right? Well, initially it might seem efficient to use the fact that $\hat{p}$ is approximately $N\left(p, \hat{p}(1-\hat{p})/n\right)$ for large $n$. In other words the classical confidence interval relies on the inversion of Wald’s test. A function to ease the computation is the following (not really needed!): `waldci <- function(x, n, level) { phat <- sum(x)/n; results <- phat + c(-1,1)*qnorm(1-level/2)*sqrt(phat*(1-phat)/n); print(results) }` An exact confidence interval is
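
In the function quoted in the excerpt, `level` is passed straight into `qnorm(1-level/2)`, so it plays the role of the significance level α (for example 0.05), not the confidence level. The variant below is a sketch under a different assumption, namely that the caller supplies the confidence level (for example 0.95). The name `wald_ci`, the argument `conf_level`, and the simulated 0/1 data are all illustrative. For comparison it also calls `binom.test()` from base R, which returns the exact (Clopper-Pearson) interval.

```r
# Wald confidence interval for a proportion.
# x: vector of 0/1 outcomes, conf_level: confidence level such as 0.95.
wald_ci <- function(x, conf_level = 0.95) {
  n <- length(x)
  phat <- mean(x)
  z <- qnorm(1 - (1 - conf_level) / 2)   # two-sided normal quantile
  phat + c(-1, 1) * z * sqrt(phat * (1 - phat) / n)
}

# Illustrative data: 30 successes out of 100 trials.
x <- c(rep(1, 30), rep(0, 70))

ci_wald  <- wald_ci(x, conf_level = 0.95)
ci_exact <- binom.test(sum(x), length(x), conf.level = 0.95)$conf.int

print(ci_wald)
print(ci_exact)
```

The Wald interval is symmetric around the sample proportion, while the exact interval generally is not.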

## Quantitative link strength for APE cophyloplot

November 17, 2009

Just add a third column with link strength to the association matrix: plotCophylo2 <- function (x, y, assoc = assoc, use.edge.length = use.edge.length, space = space, length.line = length.line, gap = gap, type = type, return = return, col = col, show.tip.label = show.tip.label, font = font) { if(ncol(assoc)==2) { assoc <- cbind(assoc,rep(1,nrow(assoc))) } res
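
The only step of the function visible in the excerpt is the defaulting of the third column, which can be sketched on its own in base R. The helper name `add_link_strength` and the tip labels are hypothetical, and nothing here touches the plotting code, which is truncated in the excerpt.

```r
# Hypothetical helper: make sure an association matrix has a third
# column of link strengths, defaulting to 1 for every link.
add_link_strength <- function(assoc, strength = 1) {
  if (ncol(assoc) == 2) {
    assoc <- cbind(assoc, rep(strength, nrow(assoc)))
  }
  assoc
}

# Two-column association matrix of tip labels (made-up names).
assoc <- cbind(c("host_a", "host_b"), c("parasite_x", "parasite_y"))
assoc3 <- add_link_strength(assoc)

# cbind() on a character matrix coerces the strengths to character,
# so convert them back with as.numeric() before using them as weights.
weights <- as.numeric(assoc3[, 3])
print(assoc3)
```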

## swfDevice is nearing completion

November 17, 2009

My new R package, swfDevice, is getting close to its first release. This package enables native R graphics output as swf (flash) files. It also has the ability to create animations with player controls. The main project page is here and the results of the test suite are here. Here are some samples: http://swfdevice.r-forge.r-project.org/swfDevice_test29.swf http://swfdevice.r-forge.r-project.org/swfDevice_test28.swf

## R tip: Extracting median from survfit object

November 17, 2009

A colleague wanted to extract the median value from a survival analysis object, which turned out to be a pain as the value is not stored in the object, but calculated on the fly by a print method. > library(survival) > fit > survfit(fit) Call: survfit(formula = fit) records n.max n.start events median 0.95LCL 0.95UCL ...
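
The excerpt stops before the solution, so the following is only a sketch of two ways to get at the number without reading it off the printed output. It assumes a reasonably recent version of the survival package, in which `summary()` of a `survfit` object returns a `table` component. The `lung` data set and the plain Kaplan-Meier fit are stand-ins, not the colleague's model.

```r
library(survival)

# Stand-in fit: Kaplan-Meier curve for the lung data shipped with survival.
fit <- survfit(Surv(time, status) ~ 1, data = lung)

# Option 1: take the median from the summary table.
med_table <- summary(fit)$table[["median"]]

# Option 2: compute it from the curve itself, as the first time at which
# the estimated survival probability drops to 0.5 or below.
med_manual <- min(fit$time[fit$surv <= 0.5])

print(c(table = med_table, manual = med_manual))
```

The manual version assumes the curve actually reaches 0.5. Otherwise `min()` is applied to an empty vector and returns `Inf` with a warning, whereas the summary table reports `NA`.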

## R functions for Dienes (2008) Understanding Psychology as a Science

November 17, 2009

I recently wrote a review of Understanding psychology as a science: an introduction to scientific and statistical inference by Zoltan Dienes (2008). Dienes' book covers Neyman-Pearson null hypothesis significance testing, Bayesian inference and the lik...

## Seminar: Reproducible Research with R, LaTeX, & Sweave

November 16, 2009

Theresa Scott, instructor of the previously mentioned R workshop and weekly R clinic, is giving a lecture entitled "Reproducible Research with R, LaTeX, & Sweave" in MRB III, room 1220, this Wednesday 11/18 at 1:30. You can see more details about the lecture here. Looks like her slides as well as much more introductory material on R, LaTeX, and Sweave...

## Infomaps using R – Visualizing German unemployment rates by district on a map

November 16, 2009

Lately, David Smith from REvolution Computing set out to challenge the R community with the reproduction of a beautiful choropleth map (= multiple regions map/thematic map) on US unemployment rates he had seen on the Flowing Data blog. Here you can find the impressive results. Being a fan of beautiful visualizations I tried to produce

## R in Action – early thoughts

November 16, 2009

I was invited to review the book R in Action written by Rob Kabacoff. Since I consider the Quick-R website, created by the same smart guy, one of the most valuable resources about R, it is both an honor and a pleasure to have the opportunity to take an...

## The Top Scores for Canabalt, Take 2

November 15, 2009

Introduction: As promised on Thursday, here’s my second pass at a statistical analysis of Canabalt scores. There are some useful results I’ll present right at the start, and then there are some results that are more or less worthless, except that working through my own mistakes helped me to think more clearly about statistical modeling in

## OpenMX

November 15, 2009

Looks promising: http://openmx.psyc.virginia.edu/ Right now it cannot be built from source because there are some incompatibilities between OpenMx and R 2.10.0, but I assume this will be resolved soon. And the development seems to be quite active.