## High-schoolers celebrate World Statistics Day

October 21, 2011

Rose Hoffmann, AP Statistics teacher at Catholic Memorial High School in Waukesha, WI, sent the following note to the Revolution Analytics team: In August 2010, my husband, who is a statistician, attended the American Statistical convention. Your company gave out the flying monkey with a black cape ... He gave me the monkey since it was my first year...

## Teaching with R: the switch

October 21, 2011

There are several blog posts, websites (and even books) explaining the transition from using another statistical system (e.g. SAS, SPSS, Stata, etc.) to relying on R. Most of that material treats the topic from the point of view of i- … Continue reading →

## ggplot2 for big data

October 21, 2011

(Hadley Wickham, author of ggplot2 and several other R packages, guest blogs today about forthcoming big-data improvements to his R graphics package -- ed.) Hi! I'm Hadley Wickham and I'm guest posting on the Revolutions blog to give you a taste of some of the visualisation work that my research team and I worked on this summer. This work...

## Backtesting Part 4: random strategies

October 21, 2011

Note: This post is NOT financial advice!  This is just a fun way to explore some of the capabilities R has for importing and manipulating data.   In part 2, we found that our 200-day high, hold 100 days strategy yielded average annual return...

## Predictability of stock returns : Using runs.test()

October 21, 2011

The financial market is an interesting place: you find people taking positions (buying/selling) based on their expectations of what security prices will be, and they are rewarded/penalized according to the accuracy of those expectations. The beauty of financia...
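The excerpt stops before any code, so the following is only a minimal sketch of the idea behind a runs test, written in Python rather than the post's R. It assumes the classical Wald-Wolfowitz runs test applied to the signs of returns; the function name `runs_test_z` and the sample series are hypothetical and are not taken from the post.

```python
import math

def runs_test_z(returns):
    """z-score of the Wald-Wolfowitz runs test on the signs of a return series."""
    signs = [r > 0 for r in returns if r != 0]  # zero returns are dropped (assumption)
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both positive and negative returns")
    n = n_pos + n_neg
    # A new run starts whenever the sign changes.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    expected = 2 * n_pos * n_neg / n + 1
    variance = 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n ** 2 * (n - 1))
    if variance == 0:
        raise ValueError("series too short for a runs test")
    return (runs - expected) / math.sqrt(variance)

alternating = [0.01, -0.01] * 5      # sign flips every day: too many runs
clustered = [0.01] * 5 + [-0.01] * 5  # one long up run, one long down run
print(runs_test_z(alternating), runs_test_z(clustered))
```

Under the null hypothesis of randomly ordered signs, the z-score is approximately standard normal for long series; a strongly negative value points to clustering of gains and losses, a strongly positive one to frequent reversals.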

## Volume by Price charts with R – first attempt

October 21, 2011

I stumbled upon this chart in the R Graph Gallery, which got me thinking someone could come up with a Volume by Price chart using R. Such charts can be useful to determine support and resistance levels, as they illustrate the amount of volume for different price ranges. Below is my first attempt at this. Note

## Generating sets of permutations

October 21, 2011

In previous posts I discussed how to generate a single permutation from a fully-randomised or restricted permutation design using shuffle(). Here I want to briefly mention the shuffleSet() function and illustrate its usage. Every time you call shuffle() it has to interpret the … Continue reading →

## le Monde puzzle [#745]

October 20, 2011

The puzzle in Le Monde this weekend is not that clear (for a change!), so I may be confused in the following exposition: Three card players are betting with a certain (and different) number of chips each, between 4 and 9. After each game, the loser doubles the number of chips of the winner (while

## Since My Last Trip to Disney

October 20, 2011

My family is off to DisneyWorld for a week, so there will not be any posts while I am there. However, I thought it would be interesting to see how Disney stock has done since my last trip in September 2010. Maybe since Disney has done so poorly, the crowd...

## Slides for Revolution R Enterprise: 100% R and more

October 20, 2011

If you haven't yet taken a look at Revolution R Enterprise but wanted to know what it adds to open-source R, the slides below from yesterday's webinar will give you a quick overview. A recorded replay with audio of me giving the presentation is also available at the link below. Revolution Analytics Webinars: Revolution R Enterprise: 100% R...

## Spatial correlation in designed experiments

October 20, 2011

Last Wednesday I had a meeting with the folks of the New Zealand Drylands Forest Initiative in Blenheim. In addition to sitting in a conference room and having nice sandwiches we went to visit one of our progeny trials at … Continue reading →

## Shipping Mix

October 20, 2011

With a fresh pile of historical global shipping data, we came back to the flow visualizations that illustrated tangible supply lines that facilitate global trade.  This time we've isolated two types of shipping vessels, cargo and tanker, in order ...

## Queueing up in R, continued

October 20, 2011

Shown above is a queueing simulation. Each diamond represents a person. The vertical line up is the queue; at the bottom are 5 slots where the people are attended. The size of each diamond is proportional to the log of the time it will take them to be attended. Color is used to tell one
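The excerpt describes a single line feeding 5 service slots but does not show how the simulation is coded. As a rough Python sketch of that setup (not the post's R code), the following assumes first-come-first-served order with exponential inter-arrival and service times; `simulate_queue` and all of its parameters are hypothetical names chosen for illustration.

```python
import heapq
import random

def simulate_queue(n_people, n_servers=5, arrival_rate=1.0, mean_service=4.0, seed=0):
    """Return each person's waiting time in a single FIFO line with n_servers slots."""
    rng = random.Random(seed)
    free_at = [0.0] * n_servers  # time at which each slot next becomes free
    heapq.heapify(free_at)
    clock = 0.0
    waits = []
    for _ in range(n_people):
        clock += rng.expovariate(arrival_rate)         # arrival time of this person
        service = rng.expovariate(1.0 / mean_service)  # how long they will be attended
        start = max(clock, heapq.heappop(free_at))     # take the earliest free slot
        waits.append(start - clock)
        heapq.heappush(free_at, start + service)
    return waits

waits = simulate_queue(1000)
print(sum(waits) / len(waits))  # average time spent standing in line
```

Keeping the slots in a min-heap of "free at" times means each arrival only has to look at the slot that frees up first, which is enough to reproduce first-come-first-served behaviour without simulating every clock tick.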

## postdoctoral positions in Paris

October 20, 2011

There is a call for postdoctoral positions supported by the Paris Mathematical Sciences Foundation. The deadline is December 13 and the on-line application is available. If you are interested in working with me on Bayesian statistics  (model choice, time series model) or computational methods (SMC, MCMC, ABC, &c.) thru this call, please contact me at

## Does the S&P 500 exhibit seasonality through the year?

October 20, 2011

Are there times of the year when returns are better or worse? Abnormal Returns prompted this question with “SAD and the Halloween indicator” in which it is claimed that the US market tends to outperform from about Halloween until April. Data The data consisted of 15,548 daily returns of the S&P 500 starting in 1950.  … Continue reading...

## Confidence interval diagram in R

October 19, 2011

This code shows how to easily plot a beautiful confidence interval diagram in R. First, let’s input the raw data. We’ll be making two confidence intervals for two samples of 10. In case you’re curious, the data represents samples from … Continue reading →

## R. I. P. EMA

October 19, 2011

That’s right, I am moving away from exponential moving averages. Originally, I decided to use them somewhat arbitrarily, probably because they tend to swing faster. Last night, after spending two and a half hours debugging an issue which yet again turned out to be a particular property of these averages, I made up my mind. I am

## Minimum Investment and Number of Assets Portfolio Cardinality Constraints

October 19, 2011

The Minimum Investment and Number of Assets Portfolio Cardinality Constraints are practical constraints that are not easily incorporated in the standard mean-variance optimization framework. To help us impose these real life constraints, I will introduce extra binary variables and will use mixed binary linear and quadratic programming solvers. Let’s continue with our discussion from Introduction

## the Wang-Landau algorithm reaches the flat histogram in finite time

October 19, 2011

Pierre Jacob and Robin Ryder (from Paris-Dauphine, CREST, and Statisfaction) have just arXived (and submitted to the Annals of Applied Probability) a neat result on the Wang-Landau algorithm. (This algorithm, which modifies the target in a sort of reweighted partitioned sampling to achieve faster convergence, has always been perplexing to me.)  They show that some

## Support Vector Machines in R (a course by Lutz Hamel)

October 19, 2011

Support vector machines (SVMs) are the “big iron” of the data mining world, especially suited for extremely data-intensive tasks like image classification, biosequence processing, handwriting recognition, etc. Dr. Lutz Hamel, author of “Knowledge Discovery with Support Vector Machines”, presents his online course “Introduction to Support Vector Machines In R” November 18 – December 16. “Support Vector Machines in...

## Web-friendly visualizations in R

October 19, 2011

Aleks points me to this new tool from Wojciech Gryc. Right now I save my graphs as pdfs or pngs and then upload them to put them on the web. I expect I’ll still be doing this for a while (I like having full control of what my graphs look like), but Gryc’s default plots might be useful.

## On R, bloggers, politics, sex, alcohol and rock & roll

October 19, 2011

Yesterday morning at 7 am I was outside walking the dog before getting a taxi to go to the airport to catch a plane to travel from Christchurch to Blenheim (now I can breathe after reading without a pause). It … Continue reading →

## The R-Files: Paul Teetor

October 19, 2011

"The R-Files" is an occasional series from Revolution Analytics, where we profile prominent members of the R Community. Name: Paul Teetor Profession: Quantitative developer (freelance) Nationality: American Years Using R: 7 Known for: Author of R Cookbook (O’Reilly Media, 2011) An active member of the R community, Paul Teetor is a quantitative developer and statistical consultant based in the...

## Studying market reactions after consecutive gains (losses)

October 19, 2011

Arthur Charpentier used R to denote a broken record of the CAC 40 when it went 11 consecutive days with negative returns. Question: What happens to the market after runs of positive or negative returns? Will the market tank or soar after n days of gains/losses? First, a little dissection of historical data (S&P 500
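The question posed here can be made concrete with a small sketch. The Python below (not the post's R code) groups next-day returns by the length of the streak that preceded them; the function name `next_day_returns_by_streak` and the toy return series are hypothetical, and treating a flat day as a streak breaker is an assumption.

```python
from collections import defaultdict

def next_day_returns_by_streak(returns):
    """Map signed streak length (e.g. -3 = three losses in a row)
    to the list of returns observed on the following day."""
    outcomes = defaultdict(list)
    streak = 0
    for today, tomorrow in zip(returns, returns[1:]):
        if today > 0:
            streak = streak + 1 if streak > 0 else 1
        elif today < 0:
            streak = streak - 1 if streak < 0 else -1
        else:
            streak = 0  # assumption: a flat day breaks any streak
        if streak != 0:
            outcomes[streak].append(tomorrow)
    return dict(outcomes)

toy_returns = [0.01, 0.02, -0.01, -0.03, -0.02, 0.04]
for length, following in sorted(next_day_returns_by_streak(toy_returns).items()):
    print(length, sum(following) / len(following))
```

With real index data, averaging each list gives the mean next-day return after n consecutive gains or losses, which is the quantity the post sets out to examine.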

## How does Matt Kemp become Andre Dawson?

October 18, 2011

While reading this article over at Fangraphs I was inspired to ask myself “what would Matt Kemp have to do between now and the end of his career to be seriously considered for the Hall of Fame?”.  This question comes … Continue reading →

October 18, 2011

Google's Fusion Tables look impressive, for those who want to try geo-visualizations of their data. You don't need much programming experience to be able to use it. For those who want to try it out, here's a nice intro that Kathyrn Hurley presented at the recent SVCC (Silicon Valley Code Camp). When combined with ShpEscape (note spelling) it becomes...

## Generating restricted permutations with permute

October 18, 2011

In a previous post I introduced the permute package and the function shuffle(). In that post I got as far as replicating R’s base function sample(). Here I’ll briefly outline how shuffle() can be used to generate restricted permutations. shuffle() … Continue reading →

## 130/30 Portfolio Construction

October 18, 2011

The 130/30 funds were getting lots of attention a few years ago. A 130/30 fund is a long/short portfolio that, for each \$100 invested, allocates \$130 to longs and \$30 to shorts. From a portfolio construction perspective, this simple idea is not so simple to implement. Let’s continue with our discussion from Introduction
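The arithmetic of the 130/30 split can be illustrated with a short Python sketch. It only rescales a given set of raw long and short weights so that longs total 130% and shorts total 30% of capital; it is not the post's portfolio construction method, and `scale_130_30`, the asset names, and the raw weights are hypothetical.

```python
def scale_130_30(long_raw, short_raw, long_total=1.30, short_total=0.30):
    """Rescale raw positive weights so longs sum to +130% and shorts to -30%.

    Assumes the long and short asset names do not overlap.
    """
    long_sum = sum(long_raw.values())
    short_sum = sum(short_raw.values())
    weights = {name: long_total * w / long_sum for name, w in long_raw.items()}
    weights.update({name: -short_total * w / short_sum for name, w in short_raw.items()})
    return weights

weights = scale_130_30({"A": 2, "B": 1, "C": 1}, {"D": 1, "E": 2})
print(weights)
print(sum(weights.values()))  # net exposure: 130% long minus 30% short
```

The net exposure stays at 100% of capital while gross exposure is 160%, which is why the long and short legs have to be constrained separately rather than through a single sum-to-one condition.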