Monthly Archives: August 2011

Quick labels within figures

August 26, 2011

One of the coolest R packages I heard about at the useR! Conference: Toby Dylan Hocking’s directlabels package for putting labels directly next to the relevant curves or point clouds in a figure. I think I first learned about this idea from Andrew Gelman: that a separate legend requires a lot of back-and-forth glances, so

Read more »
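
To make the idea in the excerpt above concrete, here is a minimal base-graphics sketch of direct labelling. It does not use the directlabels package itself; the helper `end_labels()`, the sample curves, and the choice to put each label at the right-hand end of its curve are all assumptions made for illustration.

```r
# Base-graphics sketch of direct labelling; this does NOT use the
# directlabels package. The data and the helper name are hypothetical.
end_labels <- function(x, y_mat) {
  # One label per column, placed at the last x value of each curve
  data.frame(label = colnames(y_mat),
             x = rep(x[length(x)], ncol(y_mat)),
             y = y_mat[nrow(y_mat), ],
             stringsAsFactors = FALSE)
}

x <- seq(0, 10, length.out = 50)
y_mat <- cbind(linear = x, sqrt = 2 * sqrt(x), log = log1p(x))

lab <- end_labels(x, y_mat)

# Leave room on the right for the labels instead of drawing a legend
matplot(x, y_mat, type = "l", lty = 1, col = 1:3,
        xlim = c(0, 12), xlab = "x", ylab = "y")
text(lab$x, lab$y, labels = lab$label, col = 1:3, pos = 4)
```

Because each name sits next to its own curve, the reader does not have to match colours against a separate legend.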

Friday quote: what is the question to which this number is the answer?

August 26, 2011

John Kay muses on interpreting statistical data: Always ask of such data “what is the question to which this number is the answer?”. “Earnings before interest, tax, depreciation and amortisation on a like-for-like basis before allowance for exceptional restructuring costs” is the answer to the question “what is the highest profit number we can present without attracting flat disbelief?”.

Read more »

Le Monde puzzle [#737]

August 26, 2011

The puzzle in the weekend edition of Le Monde this week can be expressed as follows: Consider four integer sequences (x_n), (y_n), (z_n), and (w_n), such that [formula missing] and, if u=(x_n,y_n,z_n,w_n), for i=1,…,4, [formula missing] if u_i is not the maximum of u and [formula missing] otherwise. Find the first return time n (if any) such that x_n=0. Find the value

Read more »

Time series cross-validation: an R example

August 25, 2011

I was recently asked how to implement time series cross-validation in R. Time series people would normally call this “forecast evaluation with a rolling origin” or something similar, but it is the natural and obvious analogue to leave-one-out cross-validation for cross-sectional data, so I prefer to call it “time series cross-validation”. Here is some example

Read more »
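
As a rough illustration of the rolling-origin idea described in the excerpt above (this is not the code from the original post), the sketch below refits a model at each forecast origin and collects the one-step-ahead errors. The built-in `lh` series, the AR(1) model, and the minimum training size `k = 30` are assumptions chosen only to keep the example short and runnable with base R.

```r
# Rolling-origin ("time series cross-validation") sketch using base R only.
# The lh data set, the AR(1) model and k = 30 are illustrative choices.
y <- lh                      # built-in series of 48 observations
n <- length(y)
k <- 30                      # smallest training set size

errors <- numeric(n - k)
for (i in seq_len(n - k)) {
  origin <- k + i - 1                          # last index of the training set
  train  <- ts(y[1:origin], start = start(y), frequency = frequency(y))
  fit    <- arima(train, order = c(1, 0, 0), method = "ML")  # refit each time
  fc     <- predict(fit, n.ahead = 1)$pred[1]  # one-step-ahead forecast
  errors[i] <- y[origin + 1] - fc              # out-of-sample error
}

mae <- mean(abs(errors))     # mean absolute one-step-ahead error
```

Every forecast is scored only against an observation the model has not yet seen, which is what makes the procedure the time series analogue of leave-one-out cross-validation.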

Examples on Clustering with R

August 25, 2011

R code examples on various clustering techniques are available as “Clustering in R” in Chapter 4 of R & Bioconductor Manual by Thomas Girke, UC Riverside. It provides R examples on hierarchical clustering, including tree cutting/coloring and heatmaps, …

Read more »
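
For readers who want a feel for the topics named in the excerpt above, here is a small base-R sketch of hierarchical clustering with tree cutting and a heatmap. It is not taken from the manual; the `USArrests` data set, complete linkage, and the choice of four clusters are assumptions for illustration.

```r
# Hierarchical clustering sketch with base R; the data set (USArrests),
# the linkage method and the number of clusters are illustrative choices.
x  <- scale(USArrests)                # standardise the columns
d  <- dist(x, method = "euclidean")   # pairwise distances between rows
hc <- hclust(d, method = "complete")  # build the tree

groups <- cutree(hc, k = 4)           # cut the tree into 4 clusters
table(groups)                         # cluster sizes

plot(hc, cex = 0.6)                   # dendrogram
rect.hclust(hc, k = 4, border = 2:5)  # outline the 4 clusters in colour

# Heatmap whose rows are ordered by the same tree
heatmap(x, Rowv = as.dendrogram(hc), scale = "none")
```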

Mode vs Mean in Tactical Allocation

August 25, 2011

Let’s take “Modest Modeest for Moving Average” one step further and use it in a basic tactical allocation system using Vanguard funds. THIS IS NOT INVESTMENT ADVICE AND VERY EASILY MIGHT CAUSE LARGE LOSSES. VANGUARD FUNDS IMPOSE EARLY REDEM...

Read more »

Major changes to the forecast package

August 25, 2011

The forecast package for R has undergone a major upgrade, and I’ve given it version number 3 as a result. Some of these changes were suggestions from the forecasting workshop I ran in Switzerland a couple of months ago, and some have been on the drawing board for a long time. Here are the main

Read more »

String functions in R

August 25, 2011

Here's a quick cheat-sheet on string manipulation functions in R, mostly cribbed from Quick-R's list of String Functions with a few additional links.

substr(x, start=n1, stop=n2)
grep(pattern, x, value=FALSE, ignore.case=FALSE, fixed=FALSE)
gsub(pattern, replacement, x, ignore.case=FALSE, fixed=FALSE)
gregexpr(pattern, text, ignore.case=FALSE, perl=FALSE, fixed=FALSE)
strsplit(x, split)
paste(..., sep=" ", collapse=NULL)
sprintf(fmt, ...)
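
A short sketch of how these functions behave in practice may help. It uses base R only; the sample vector `x` and the variable names are made up for illustration.

```r
# Base R only; the sample vector is made up for illustration.
x <- c("apple pie", "banana split", "cherry tart")

# substr: extract characters 1 through 5 of each element
first5 <- substr(x, start = 1, stop = 5)

# grep: indices of matching elements, or the elements themselves (value = TRUE)
idx  <- grep("an", x)
hits <- grep("an", x, value = TRUE)

# gsub: replace every match; fixed = TRUE treats the pattern literally
dashed <- gsub(" ", "-", x, fixed = TRUE)

# gregexpr: start positions of all matches within one string
pos <- gregexpr("a", x[2])[[1]]

# strsplit: returns a list with one character vector per input element
words <- strsplit(x, split = " ")

# paste: sep joins the arguments, collapse joins the resulting elements
joined <- paste(first5, collapse = "|")

# sprintf: C-style formatting, vectorised over its arguments
labels <- sprintf("%d: %s", seq_along(x), x)
```

Note that `grep()` returns positions by default, while `strsplit()` and `gregexpr()` always return lists, so their results usually need `[[1]]` or `sapply()` before further use.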

Read more »

How to access 100M time series in R in under 60 seconds

August 25, 2011

DataMarket, a portal that provides access to more than 14,000 data sets from various public and private sector organizations, has more than 100 million time series available for download and analysis. (Check out this presentation for more info about DataMarket.) And now with the new package rdatamarket, it's trivially easy to import those time series into R for charting,...

Read more »