## Predicting Pizza

March 26, 2010
What's the secret to the best pizza in New York? That's what statistical consultant and R user Jared Lander sought to find out by analyzing the rankings of NY pizza joints at MenuPages.com and building a regression model for ratings based on variables like location, price, number of reviews, and pizza-oven type (gas, coal or wood). Here's a scatterplot...

## Summarising data using dot plots

March 26, 2010
A dot plot is a type of display that compares counts, frequencies, totals or other summary measures for a series of categories. The dot plot can be arranged with the categories on either the vertical or the horizontal axis of the display to allow comparison between the different categories as well as comparison within categories where

## BioMart (and biomaRt)

March 26, 2010
I’ve been vaguely aware of BioMart for a few years. Inexplicably, I’ve only recently started to use it. It’s one of the most useful applications I’ve ever used. The concept is simple. You have a set of identifiers that describe a biological object, such as a gene. These are called filters. They have values –

## Finance::YahooQuote 0.23

March 25, 2010
Rule number one in regression testing is to not depend on volatile data. Which I seem to have violated in file t/02simple.t in the Perl package Finance::YahooQuote. Which led the automated Perl test scripts to remind me for a few days now that the f...

## How Misinformed are Tea Party Protesters About Tax Policy?

March 25, 2010
For those of you used to reading about international relations, I apologize for the following brief foray into American politics. It appears that the American Enterprise Institute and David Frum have decided to (abruptly) part ways. Before David left, however, he and his team of interns provided some interesting statistical insight into the

## R plotting fun

March 25, 2010
Not easy to produce cool looking graphs in R, but it can be done. The results of some messing around are above. Here is the code I used:

## Future of Open Source Survey – Results

March 25, 2010
The results of the 2010 Future of Open Source survey were presented at last week's Open Source Business Conference in San Francisco, and here they are in slide format: While I was at the presentation I captured a few additional tidbits that weren't in the slides. The continued growth of open-source generally was a prevalent...

## A von Mises variate…

March 25, 2010
Inspired by an email that followed the previous random generation post, the following question arose: how to draw random variates from the von Mises distribution? First of all let’s check the pdf of the probability rule; it is $f(x \mid \mu, \kappa) = \frac{e^{\kappa \cos(x - \mu)}}{2\pi I_0(\kappa)}$, for $0 \le x < 2\pi$. Ok, I admit that Bessel functions can be a bit frightening, but
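
As a rough illustration of the question above, here is a minimal sketch in Python (not the R code of the original post, whose method is not shown in this excerpt). It assumes the standard von Mises density, proportional to exp(kappa * cos(x - mu)) on a circle of length 2*pi, and uses plain rejection sampling from a uniform proposal, which never needs the Bessel-function normalising constant. The function name `rvonmises` and its arguments are hypothetical.

```python
import math
import random


def rvonmises(n, mu, kappa, rng=None):
    """Draw n von Mises(mu, kappa) variates in [0, 2*pi) by rejection sampling.

    Assumption: density proportional to exp(kappa * cos(x - mu)).
    The proposal is uniform on the circle, so the normalising constant
    (the Bessel function I_0) is never evaluated.
    """
    if kappa < 0:
        raise ValueError("kappa must be non-negative")
    rng = rng or random.Random()
    two_pi = 2.0 * math.pi
    draws = []
    while len(draws) < n:
        x = two_pi * rng.random()  # uniform proposal on [0, 2*pi)
        # The density peaks at x == mu, so this ratio lies in (0, 1].
        accept_prob = math.exp(kappa * (math.cos(x - mu) - 1.0))
        if rng.random() <= accept_prob:
            draws.append(x)
    return draws


# Example: 1000 draws concentrated around mu = 1 radian.
sample = rvonmises(1000, mu=1.0, kappa=4.0, rng=random.Random(42))
```

The acceptance rate falls as kappa grows, so for strongly concentrated distributions a dedicated algorithm such as the one behind Python's `random.vonmisesvariate` is likely a better choice.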

## Create odf, pdf and html report from a single Sweave document

March 25, 2010
A lot of us know about Sweave and LaTeX, and they work very well in creating elegant dynamic reports from R computation. However, sometimes we would also like to produce a word processing document for a colleague or an HTML version of the same report. Now there are tools for producing these like odfWeave. But

## NetLogo & R Extension

March 25, 2010
I'm really a heavy user of R and was one long before starting to do any agent-based models. So the first thing I was looking for in any software package for ABM was some automated link to R (much like spgrass6 for GRASS and R for GIS). I thought Repast Simphony was the way to go, since...

## Matlab and R (getting started)

March 24, 2010
Matlab and R are two popular languages for data analysis and visualization. The similarity between the two languages is high. Both are interpreted languages that run in a shell-like environment (while also allowing you to run scripts or functions written off-line). Both tend to be slow if your code contains many loops but are fast when

## Modified Donchian Band Trend Follower using R, Quantmod, TTR -Part 2: Parameter Sweep Sensitivity over long run

March 24, 2010
Here is a small update to the Donchian Channel type system I displayed in the last post. Fig 1. Sensitivity of Net Combined L/S Gain to parameter n. Using the S&P500 index as a proxy for the market, a simulation was run over the lifetime of the index. No...

## Using vectors to customize placement of tick marks and labels on a plot

March 24, 2010
OK, let's say you want to create a plot and you need an easy way to specify where along the scale your tick marks and labels land, as opposed to having R just decide itself. An easy way to do that is to create a vector and assign the axis characteristi...

## A Demo for the Ratio Estimation in Sampling Survey (Animation)

March 24, 2010
Amber Watkins gave me a suggestion on the animation for the ratio estimation, and I think this is a good topic for my animation package. I’ve finished writing the initial version of the function sample.ratio() for this package, which will appear in version 1.1-2 in a couple of days. As we know, the benefit

## ECG Signal Processing

March 24, 2010
After reading (most of) “The Scientist and Engineer's Guide to Digital Signal Processing” by Steven W. Smith, PhD, I decided to take a second crack at the ECG data. I wrote a set of R functions that implement a windowed (Blackman) sinc low-pass filter. The convolution of the filter kernel with the input signal is conducted
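
To make the filter idea concrete, here is a minimal sketch in Python (not the author's R functions, which are not shown in this excerpt). It builds a Blackman-windowed sinc low-pass kernel in the style usually presented in Smith's book and applies it by direct convolution; how the original post carries out the convolution is cut off above, so the direct method is an assumption. The names `blackman_sinc_kernel` and `convolve`, and the parameters `fc` (cutoff as a fraction of the sampling rate) and `m` (even kernel order), are hypothetical.

```python
import math


def blackman_sinc_kernel(fc, m):
    """Return a low-pass kernel of m + 1 taps, normalised to unit gain at DC.

    fc: cutoff frequency as a fraction of the sampling rate (0 < fc < 0.5).
    m:  even kernel order; a larger m gives a sharper roll-off.
    """
    if not 0.0 < fc < 0.5:
        raise ValueError("fc must lie strictly between 0 and 0.5")
    if m < 2 or m % 2 != 0:
        raise ValueError("m must be an even integer >= 2")
    kernel = []
    for i in range(m + 1):
        k = i - m // 2  # offset from the kernel centre
        if k == 0:
            sinc = 2.0 * math.pi * fc  # limit of sin(2*pi*fc*k) / k as k -> 0
        else:
            sinc = math.sin(2.0 * math.pi * fc * k) / k
        window = (0.42
                  - 0.5 * math.cos(2.0 * math.pi * i / m)
                  + 0.08 * math.cos(4.0 * math.pi * i / m))
        kernel.append(sinc * window)
    total = sum(kernel)
    return [h / total for h in kernel]


def convolve(signal, kernel):
    """Full linear convolution; output has len(signal) + len(kernel) - 1 points."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, x in enumerate(signal):
        for j, h in enumerate(kernel):
            out[i + j] += x * h
    return out


# Example: smooth a noisy-looking test signal with a cutoff at 5% of the sampling rate.
kernel = blackman_sinc_kernel(fc=0.05, m=100)
signal = [math.sin(2 * math.pi * 0.01 * t) + 0.5 * math.sin(2 * math.pi * 0.4 * t)
          for t in range(500)]
filtered = convolve(signal, kernel)
```

Only output samples where the kernel fully overlaps the input (indices `m` through `len(signal) - 1`) are free of edge effects, and the output is delayed by `m / 2` samples relative to the input.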

## Statistical learning with MARS

March 24, 2010
Steve Miller at the InformationManagement blog has been looking at predictive analytics tools for business intelligence applications, and naturally turns to the statistical modeling and prediction capabilities of R. Says Steve: The R Project for Statistical Computing continues to dazzle in the open source world, with exciting new leadership at Revolution Computing promising to align commercial R with business...

## RXQuery

March 24, 2010
I have put up a new version of the RXQuery package, which interfaces to the Zorba XQuery engine. This makes the package compatible with the 1.0.0 release of Zorba for external functions. The package allows one to use XQuery from within R and to use R fu...

## Lessons Learned from EC2

March 24, 2010
A week or so ago I had my first experience using someone else’s cluster on Amazon EC2. EC2 is the Amazon Elastic Compute Cloud. Users set up a virtual computing platform that runs on Amazon’s servers “in the cloud.” Amazon EC2 is not just another cluster. EC2 allows the user to create a disk image containing an operating system...

## Font Families for the R PDF Device

March 24, 2010
Motivated by the excellent R package pgfSweave, I began to notice the font families in my graphs when writing Sweave documents. The default font family for PDF graphs is Helvetica, which is, in most cases (I think), inconsistent with the LaTeX font styles. Some common font families are listed in ?postscript, and we can take

## oro.nifti 0.1.4

March 24, 2010
The latest release of oro.nifti (0.1.4) has been released on CRAN. New features include:

- Added text capability in the (unused) fourth pane in orthographic()
- A vignette is now included (taken from dcemriS4)

## oro.dicom 0.2.5

March 24, 2010
The latest version of oro.dicom (0.2.5) has been released on CRAN. New features include:

- Added "mosaic" capability when creating 3D arrays from DICOM
- dicomTable() now accepts single DICOM file
- Better handling of SequenceItem tags when reading in DIC...