Last Wednesday I had a meeting with the folks of the New Zealand Drylands Forest Initiative in Blenheim. In addition to sitting in a conference room and having nice sandwiches, we went to visit one of our progeny trials at …

There is a call for postdoctoral positions supported by the Paris Mathematical Sciences Foundation. The deadline is December 13 and the on-line application is available. If you are interested in working with me on Bayesian statistics (model choice, time series models) or computational methods (SMC, MCMC, ABC, etc.) through this call, please contact me at

Are there times of the year when returns are better or worse? Abnormal Returns prompted this question with “SAD and the Halloween indicator”, in which it is claimed that the US market tends to outperform from about Halloween until April. Data: The data consisted of 15,548 daily returns of the S&P 500 starting in 1950. …

That’s right, I am moving away from exponential moving averages. Originally, I decided to use them somewhat arbitrarily, probably because they tend to swing faster. Last night, after spending two and a half hours debugging an issue which yet again turned out to be a particular property of these averages, I made up my mind. I am

The Minimum Investment and Number of Assets Portfolio Cardinality Constraints are practical constraints that are not easily incorporated in the standard mean-variance optimization framework. To help us impose these real-life constraints, I will introduce extra binary variables and will use mixed binary linear and quadratic programming solvers. Let’s continue with our discussion from Introduction

Pierre Jacob and Robin Ryder (from Paris-Dauphine, CREST, and Statisfaction) have just arXived (and submitted to the Annals of Applied Probability) a neat result on the Wang-Landau algorithm. (This algorithm, which modifies the target in a sort of reweighted partitioned sampling to achieve faster convergence, has always been perplexing to me.) They show that some

Support vector machines (SVMs) are the “big iron” of the data mining world, especially suited for extremely data-intensive tasks like image classification, biosequence processing, handwriting recognition, etc. Dr. Lutz Hamel, author of “Knowledge Discovery with Support Vector Machines”, presents his online course “Introduction to Support Vector Machines In R” November 18 – December 16. “Support Vector Machines in...

Aleks points me to this new tool from Wojciech Gryc. Right now I save my graphs as pdfs or pngs and then upload them to put them on the web. I expect I’ll still be doing this for a while (I like having full control of what my graphs look like), but Gryc’s default plots might be useful.

"The R-Files" is an occasional series from Revolution Analytics, where we profile prominent members of the R Community. Name: Paul Teetor Profession: Quantitative developer (freelance) Nationality: American Years Using R: 7 Known for: Author of R Cookbook (O’Reilly Media, 2011) An active member of the R community, Paul Teetor is a quantitative developer and statistical consultant based in the...

Arthur Charpentier used R to denote a broken record of the CAC 40 when it went 11 consecutive days with negative returns. Question: What happens to the market after runs of positive or negative returns? Will the market tank or soar after n days of gains/losses? First, a little dissection of historical data (S&P 500
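The question about what follows a run of gains or losses can be made concrete with a small sketch. This is only an illustration in Python, not the post's own analysis: the function names `sign_runs` and `mean_return_after_run` are hypothetical, and `toy_returns` is a made-up series, not S&P 500 data.

```python
from itertools import groupby


def sign(x):
    """Return +1 for a gain, -1 for a loss, 0 for a flat day."""
    return (x > 0) - (x < 0)


def sign_runs(returns):
    """Collapse a return series into (sign, run_length) pairs."""
    return [(s, sum(1 for _ in grp)) for s, grp in groupby(sign(r) for r in returns)]


def mean_return_after_run(returns, n, direction):
    """Average next-day return after at least n consecutive days
    whose sign equals `direction` (+1 for gains, -1 for losses).
    Returns None if no such run occurs."""
    following = []
    streak = 0
    for today, tomorrow in zip(returns, returns[1:]):
        streak = streak + 1 if sign(today) == direction else 0
        if streak >= n:
            following.append(tomorrow)
    return sum(following) / len(following) if following else None


# Made-up daily returns, for illustration only.
toy_returns = [0.01, -0.02, -0.01, -0.03, 0.02, 0.01, -0.01]
runs = sign_runs(toy_returns)
after_two_losses = mean_return_after_run(toy_returns, 2, -1)
```

Applied to a real return series, the same two functions would give the run-length distribution and the average return following runs of a chosen length.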

Google's Fusion Tables look impressive, for those who want to try geo-visualizations of their data. You don't need much programming experience to be able to use it. For those who want to try it out, here's a nice intro that Kathryn Hurley presented at the recent SVCC (Silicon Valley Code Camp). When combined with ShpEscape (note spelling) it becomes...

The 130/30 funds were getting lots of attention a few years ago. The 130/30 fund is a long/short portfolio that for each $100 invested allocates $130 to longs and $30 to shorts. From a portfolio construction perspective, this simple idea is not so simple to implement. Let’s continue with our discussion from Introduction
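The 130/30 arithmetic can be checked with a few lines of Python. This is a minimal sketch, assuming portfolio weights are expressed as fractions of invested capital; the helper names and `example_weights` are hypothetical and are not taken from the post.

```python
def long_short_exposure(weights):
    """Return (long exposure, short exposure) as positive fractions of capital."""
    long_exposure = sum(w for w in weights if w > 0)
    short_exposure = -sum(w for w in weights if w < 0)
    return long_exposure, short_exposure


def is_130_30(weights, tol=1e-9):
    """True if longs sum to 130% and shorts to 30% of capital,
    which implies a net exposure of 100%."""
    long_exposure, short_exposure = long_short_exposure(weights)
    return abs(long_exposure - 1.30) <= tol and abs(short_exposure - 0.30) <= tol


# Hypothetical weights: three longs totalling 1.30, two shorts totalling 0.30.
example_weights = [0.50, 0.45, 0.35, -0.10, -0.20]
ok = is_130_30(example_weights)
```

The difficulty in an optimizer is that the split into long and short parts depends on the sign of each weight, so these two sums are not linear in the weights themselves.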

(By Joseph Rickert.) In San Jose topics like big data, map reduce, predictive models, mobile analytics and crowdsourcing draw a crowd even on a Saturday. So it turned out that the ACM Data Mining Camp and "un-conference" was a very "happening" way to spend a Saturday. Over 500 people attended the event at the eBay "Town Hall" on North...

In a previous post I introduced the permute package and the function shuffle(). In that post I got as far as replicating R’s base function sample(). Here I’ll briefly outline how shuffle() can be used to generate restricted permutations.
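To illustrate what a restricted permutation is, here is a short Python sketch of one common restriction: shuffling observations only within blocks, never across them. It is not the permute package's API; `shuffle_within_blocks` and the `blocks` vector are hypothetical names used only for this example.

```python
import random


def shuffle_within_blocks(blocks, seed=None):
    """Return a permutation of the indices 0..n-1 in which every index
    is only exchanged with indices that carry the same block label."""
    rng = random.Random(seed)
    # Group the positions by block label.
    positions_by_block = {}
    for i, label in enumerate(blocks):
        positions_by_block.setdefault(label, []).append(i)
    perm = list(range(len(blocks)))
    # Shuffle each block's positions independently.
    for positions in positions_by_block.values():
        shuffled = positions[:]
        rng.shuffle(shuffled)
        for pos, src in zip(positions, shuffled):
            perm[pos] = src
    return perm


# Hypothetical design: six observations in two blocks of three.
blocks = ["a", "a", "a", "b", "b", "b"]
perm = shuffle_within_blocks(blocks, seed=1)
```

An unrestricted shuffle, like R's `sample()`, corresponds to putting every observation in a single block.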

I am always looking for suggestions on how to get better at R, esp. for beginners. So when I see someone who's gotten adept at it, I ask them how they got there. This weekend, at the Bay Area ACM Data Mining Camp, one person gave me what seemed like a g...

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full October edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Applications of R Contest: Deadline October 31. Revolution Analytics is offering $20,000 in prizes...