## RcppExamples 0.1.0

March 10, 2010
Version 0.1.0 of RcppExamples, a simple demo package for Rcpp, should appear on CRAN some time tomorrow. As mentioned in the post about release 0.7.8 of Rcpp, Romain and I carved this out of Rcpp itself to provide a cleaner separation of code that impl...

## Puzzle of the week [w10]

March 10, 2010
The puzzle in last Saturday's edition of Le Monde is made of two parts: Given a 10×10 grid, what is the maximum number of nodes one can highlight before creating a parallelogram with one side parallel to one of the axes of the grid? What is the maximum number of nodes one can highlight before

## Stata Fail

March 10, 2010
From a recent mailing from Stata (highlighting by me): Funnily enough, there is a Daniel Rubin, a bio-informatics person here at Stanford.

## In a nls star things might be different than the lm planet…

March 10, 2010
The nls() function has a well-documented (and much-discussed) difference in behavior from lm(). Specifically, you can’t just put an indexed column from a data frame as an input or output of the model. > nls(data ~ c + expFct(data,beta), data = time.data, + start = start.list) Error in parse(text = x) : unexpected

## Clustering the world’s diets

March 10, 2010
Cluster analysis is a useful technique for classifying the members of a group (people, events, measurements, etc.) into "similar" groups. How "similar" is defined depends on the application, but generally involves looking at a number of attributes of the group. For example, we could cluster people by looking at their skin color, hair type, facial features, perhaps even genetic...

## In case you missed it: February roundup

March 10, 2010
In case you missed them, here are some articles from last month of particular interest to R users. We announced the availability on YouTube of "What is R", a 4-part video based on a recent webcast I hosted. We announced a webinar I hosted on REvolution's debugger for R (a recorded replay is now available). We linked Salvio Rodrigues...

## Strategy: what if SPY & VIX are up?

March 10, 2010
Recently, I was busy testing the following strategy: if SPY and VIX daily returns are both positive, then short SPY at the close and hold it for one day. The strategy is dead simple and has one very good feature: it trades the short side. There are not many successful short-side strategies. For testing purposes I used daily Yahoo
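
The rule described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the return series `spy` and `vix` are made-up numbers, not the Yahoo data the post refers to, and transaction costs are ignored.

```r
# Hypothetical daily returns; these numbers are made up for illustration.
spy <- c(0.010, -0.005, 0.004, 0.002, -0.003)
vix <- c(0.020,  0.030, 0.010, -0.010, 0.015)

# Signal on day t: both SPY and VIX closed up.
signal <- spy > 0 & vix > 0

# Short SPY at the close of day t and hold for one day, so the strategy
# return on day t + 1 is the negative of SPY's return on that day.
n <- length(spy)
strategy <- c(0, ifelse(signal[-n], -spy[-1], 0))

strategy
```

Shifting the signal by one day matters here: the position is opened at the close, so it can only earn the following day's return.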

## Using Regular Expressions in R: Case Study in Cleaning a BibTeX Database

March 9, 2010
I recently had to clean up a BibTeX database containing around 1,000 references. One of the clean up tasks was to ensure that page numbers were separated with en-dashes as opposed to hyphens. This post sets out how I used regular expressions in R to co...
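
As a rough sketch of the kind of substitution described, the snippet below rewrites single hyphens in page ranges as the LaTeX en-dash `--`. The sample lines in `bib` and the assumption that pages live in a `pages = {...}` field are illustrative only; the original post may take a different route.

```r
# Hypothetical BibTeX lines; the field layout is an assumption.
bib <- c(
  "  pages = {123-145},",
  "  pages = {10--20},",
  "  title = {A well-known result},"
)

# Only touch lines holding a pages field, so hyphens in titles are left alone.
is_pages <- grepl("^[[:space:]]*pages[[:space:]]*=", bib, ignore.case = TRUE)

# Replace a single hyphen between two digits with the LaTeX en-dash "--".
# Ranges that already use "--" do not match and stay unchanged.
bib[is_pages] <- gsub("([0-9])-([0-9])", "\\1--\\2", bib[is_pages])

bib
```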

## Rcpp 0.7.8

March 9, 2010
Version 0.7.8 of the Rcpp R / C++ interface classes is now on CRAN and in Debian. As of right now, Debian has already built packages for eight more architectures, and CRAN has built the Windows binary. Oh, and cran2deb had Debian packages for 'testing'...

## principal components and image reconstruction

March 9, 2010
Jeff Lewis at UCLA told me he teaches principal components with an image reconstruction example. This got me inspired to try it myself. A snapshot appears below, showing how the image quality improves quickly with a relatively small number of principal components. A full, Sweaved write up is here, making use of the biOps package
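
The idea can be sketched with base R alone. This is a minimal illustration, not the Sweaved write-up: it uses `prcomp()` on a random matrix standing in for a grayscale image (an assumption, since the post works on a real image with the biOps package), and the helper `reconstruct()` is a hypothetical name.

```r
# Stand-in for a grayscale image: a random 20 x 15 matrix (an assumption).
set.seed(1)
img <- matrix(runif(20 * 15), nrow = 20)

# Treat rows as observations and columns as variables.
pca <- prcomp(img, center = TRUE, scale. = FALSE)

# Rebuild the image from the first k principal components.
reconstruct <- function(pca, k) {
  partial <- pca$x[, 1:k, drop = FALSE] %*% t(pca$rotation[, 1:k, drop = FALSE])
  sweep(partial, 2, pca$center, "+")  # add the column means back
}

# Root-mean-square reconstruction error for a few values of k.
ks <- c(1, 5, 15)
rmse <- sapply(ks, function(k) sqrt(mean((img - reconstruct(pca, k))^2)))

rmse
```

The error shrinks as `k` grows and vanishes (up to floating-point noise) once all 15 components are used. A real image typically has far more structure than random noise, so the error falls off much faster there.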

## Learning R by video

March 9, 2010
For those people who prefer to be shown how to do something rather than read the instructions, there are some videos on using R available online. Here are the ones I know about. Please add links to other similar resources in the comments. R videos Learn R Toolkit What is R? from Revolution Analytics R

## Introducing R on video

March 9, 2010
Darren Wraith pointed out to me this site offering a whole series of videos introducing R. (Unfortunately in a Windows environment.) This can be handy when facing students with no R background…

## Getting the basics from readAligned

March 9, 2010
The UCR guide is a little sparse with regard to getting basic information from readAligned.

I'd like to add to the general cookbook. If some bioc people out there can contribute some alignment recipes or fill me in on some more basics, please comment:

alignedReads #how many reads did I attempt to align
#i don't think you can't get this from...

## Cluster analysis of what the world eats

March 9, 2010
Keeping with the theme of the post below, I used a clustering algorithm to group the different countries according to what they eat. I simply played around with the number of clusters until I got something I thought resembled reality, so don't interpre...

## Open Source is Opening Data to Predictive Analytics

March 9, 2010
This article by REvolution Computing CEO Norman Nie is crossposted from the Future of Open Source Forum. The R Project: despite there being over 2 million users of this open-source language for statistical data analysis, you might not have heard of it ... yet. You might have seen this feature in the New York Times last year, and you...

## Chinese versus Japanese editions

March 8, 2010
Last week, I got news from Springer Verlag about possibly two new editions of my books, one in Chinese and one in Japanese. This was both bad news and good news: the bad news was that the Chinese edition was actually a reprint of our original book, Monte Carlo Statistical Methods, by a Chinese publishing company.

## White House taps Edward Tufte to explain the stimulus

March 8, 2010
Edward Tufte, a pioneer of effective data visualization (and a personal hero), has just been appointed by the White House to the Recovery Independent Advisory Panel. This panel advises the Recovery Accountability and Transparency Board, whose job is to track and explain \$787 billion in recovery stimulus funds. Tufte explains: I'm doing this because I like accountability and transparency,...

## Weird dietary habits in the US

March 8, 2010
Using this database of food consumption data the blog Canibais e Reis kindly put together, I calculated all values for which the US was at least 2 standard deviations from the world average. Here are the outliers in standard deviations from the w...
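
A calculation of this kind might look like the sketch below. The data frame `food`, its column names, and all of its values are made up for illustration; the real database from Canibais e Reis is not reproduced here.

```r
# Hypothetical consumption table: rows are countries, columns are foods.
food <- data.frame(
  sugar = c(120, 60, 55, 62, 58, 57, 61, 59),
  rice  = c(10, 12, 9, 11, 10, 13, 8, 11),
  row.names = c("US", "A", "B", "C", "D", "E", "F", "G")
)

# Column-wise z-scores: (value - world mean) / world standard deviation.
z <- scale(food)

# Keep the foods where the US is at least 2 standard deviations out.
us_z <- z["US", ]
outliers <- us_z[abs(us_z) >= 2]

outliers
```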

## Chilean earthquake: impact of the tsunami

March 8, 2010
The National Oceanic and Atmospheric Administration (NOAA) has a page with some interesting information about last week's earthquake in Chile, but what really stood out for me was this chart of the predicted wave heights around the globe resulting from the associated tsunami: Click to enlarge: it's a fascinating chart. Although labelled a forecast, from the explanations on the...

## Example 7.26: probability question

March 8, 2010
Here's a surprising problem, from the xkcd blog. Suppose I choose two (different) real numbers, by any process I choose. Then I select one at random (p= .5) to show Nick. Nick must guess whether the other is smaller or larger. Being right 50% of the ...

## R: Eliminating observed values with zero variance

March 8, 2010
I needed a fast way of eliminating observed values with zero variance from large data sets using the R statistical computing and analysis platform. In other words, I want to find the columns in a data frame that have zero variance. And as fast as possible, because my data sets are large, many, and changing fast....
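
One simple way to do this, offered as a sketch rather than the approach the post settles on, is to flag columns whose values are all identical. The data frame `df` and its column names are hypothetical, and no claim is made that this is the fastest option for very large data.

```r
# Hypothetical data frame; column names and values are illustrative only.
df <- data.frame(a = c(1, 2, 3), b = c(5, 5, 5), c = c(0.1, 0.1, 0.2))

# A column has zero variance when all of its values are identical.
# Comparing unique values avoids floating-point surprises from var(x) == 0.
zero_var <- vapply(df, function(x) length(unique(x)) <= 1, logical(1))

# Keep only the columns that vary.
df_clean <- df[, !zero_var, drop = FALSE]

names(df_clean)
```

Using `drop = FALSE` keeps the result a data frame even when only one column survives.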

