I am talking here about money managers. for those of us who have one. We assume they understand about markets in such a way that they can, and will generate at least the benchmark returns, what ever this benchmark may … Continue reading →

(This is a follow-up to my previous post on the topic.)I was encouraged by the appearance of two R-based Scholar-scrapers, within a week of each other. One, by Kay Cichini, converts the page URLs into text mode and scrapes from there (There's a slightl...

Always new software language in one technical activity is difficult, normally a good documentation can help, these are three book to use R software for beginner and for experts: · “Introduction to the R Project for Statistical Computing for Use at the ITC” by David Rossiter (PDF, 2010-11-21).

Today we published version 0.1.5-1 of the ChainLadder package for R. It provides methods which are typically used in insurance claims reserving to forecast future claims payments.Claims development and chain-ladder forecast of the RAA data set using the Mack methodThe package started out of presentations given...

I have become quite a big fan of graphics that combine the features of traditional figures (e.g. bar charts, histograms, etc.) with tables. That is, the combination of numerical results with a visual representation has been quite useful for exploring descriptive statistics. I have wrapped two of my favorites (build around ggplot2) and included them as part

In my last few posts, I have considered “long-tailed” distributions whose probability density decays much more slowly than standard distributions like the Gaussian. For these slowly-decaying distributions, the harmonic mean often turns out to be a much better (i.e., less variable) characterization than the arithmetic mean, which is generally not even well-defined theoretically for these distributions. Since the harmonic...

R version 2.14 introduced a new package, called parallel. This new package combines the functionality from two previous packages: snow and multicore. Since I was using multicore to parallelise my computations, I had to migrate to the new package and decided to publish some code. Often trading strategies are tested using the daily closing price

Here's a cool application of calendar heat maps: runner Andy used R to catalogue his daily running mileage over the last 2+ years: There are lots of ways to chart data like this (a simple time-series chart, for example), but sometimes looking at data in new ways offers fresh perspectives. For example, Andy notes: "Apparently I missed running on...

Revolution Analytics CTO David Champagne visited Hadoop World 2011 this week, and delivered a presentation on "The Powerful Marriage of R and Hadoop" to a standing-room-only crowd of R and Hadoop enthusiasts. I've included David's slides below: The talk also generated praise on Twitter, for example: David was also interviewed by The Cube during the conference. In the video...

Small changes in the input assumptions often lead to very different efficient portfolios constructed with mean-variance optimization. I will discuss Resampling and Covariance Shrinkage Estimator – two common techniques to make portfolios in the mean-variance efficient frontier more diversified and immune to small changes in the input assumptions. Resampling was introduced by Michaud in Efficient

I asked my research group recently what they wished they had learned before they started work on a PhD. Here are some of the responses. More mathematics. Particular topics they named included real analysis, functional analysis, measure theory, algebra, linear algebra. That would have been my response also. I still wish I knew more mathematics than

Casting doubt on the possibility of mean reversion in the S&P 500 lately. Previously A look at volatility estimates in “The mystery of volatility estimates from daily versus monthly returns” led to considering the possibility of autocorrelation in the returns. I estimated an AR(1) model through time and added a naive confidence interval to the … Continue reading...

So in prepping for my latest manuscript on population dynamics I have been creating all the necessary figures. One of them I considered was a 2-d surface plot of a modified Ricker equation showing the transitions from extinction stability, and stability to limit cycles. Inconveniently though the only way to do this is with an implicit function. Since becoming...

Time Series as calendar heat maps + All of my running data since April 1, 2009 = Generated by the following code: #Sample Code based on example program at: source(file = "calendarHeat.R") run<- read.csv("log.csv", header = TRUE, sep=",") sum(run$Distance) date <- c() for (i in 1: dim(run)){ if(run$DistanceUnit== 'Kilometer'){ miles <- c(miles,run$Distance * 0.62) }

I will be teaching a workshop on R and LaTeX at NEAIR in just under a month. One of the issues I will encounter is a lack of Internet access. I also work with restricted data from NCES which requires the computer to be secured including no network access. As such, I need to manage software from removable

Today I needed to cut out a rectangle of geologic data from a state-wide map in an AEA coordinate system, using a bounding box from a UTM zone 10 region, with the output saved in UTM zone 10 coordinates. PostGIS makes this type of operation very simple...