...There is not much to it: upload a txt file with your script, share it with anyone who has the link, then simply run something like the code below. PS: when using the code for your own purposes, remember to change "https" to "http" and to i...

New O'Reilly book on parallel R computation: Looks like it covers snow, multicore, parallel (package), and some others. Anyone know anything more about this book?

Hadley writes: I am going to be teaching an R development master class in New York City on Dec 12-13. The basic idea of the class is to help you write better code, focused on the mantra of “do not repeat yourself”. On day one you will learn powerful new tools of abstraction, allowing ...

For the past couple of days, I had been searching for a tutorial that would show how to create a custom Beamer template. I found some great resources and some really great customized templates (I have listed the ones that I referred to below) but none ...

I really love the plyr package. Apart from having a progress bar and plyr handling a lot of the overhead, a very interesting feature is being able to run plyr in parallel mode. Essentially, setting .parallel = TRUE runs any…

I just found these two gems about debugging in R on r-help today (here is the thread): 1) posted by Thomas Lumley: traceback() gets you a stack trace at the last error; options(warn=2) makes warnings into errors; options(error=recover) starts the post-mortem debugger at any error, allowing you to inspect the stack interactively. 2) added by
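A minimal sketch of the options(warn=2) tip, in R. The tryCatch wrapper, the as.numeric("abc") example, and the variable names are illustrative assumptions and do not come from the r-help thread. traceback() and options(error=recover) are mainly useful in an interactive session, so they appear only as comments.

```r
# Temporarily turn warnings into errors, as described for options(warn=2).
old_opts <- options(warn = 2)

# as.numeric("abc") normally returns NA with a warning; with warn = 2 the
# warning is converted to an error, which the handler below catches.
result <- tryCatch(
  as.numeric("abc"),
  error = function(e) paste("caught:", conditionMessage(e))
)

# Restore the previous warning level.
options(old_opts)

print(result)

# In an interactive session one could additionally run:
#   traceback()              # stack trace of the last uncaught error
#   options(error = recover) # post-mortem debugger on any error
```

Restoring the saved options afterwards keeps the stricter warning level from leaking into the rest of the session.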

Kay Cichini recently wrote a word-cloud R function called GScholarScraper on his blog which, when given a search string, will scrape the associated search results returned by Google Scholar, across pages, and then produce a word-cloud visualisation. This was of interest to me because, around the same time, I posted an independent Google Scholar scraper function, get_google_scholar_df()

(This is a follow-up to my previous post on the topic.) I was encouraged by the appearance of two R-based Scholar-scrapers, within a week of each other. One, by Kay Cichini, converts the page URLs into text mode and scrapes from there (There's a slightl...

Learning a new software language for a technical task is always difficult, but good documentation can help. Here are three books on using R, for beginners and for experts: · “Introduction to the R Project for Statistical Computing for Use at the ITC” by David Rossiter (PDF, 2010-11-21).

Today we published version 0.1.5-1 of the ChainLadder package for R. It provides methods which are typically used in insurance claims reserving to forecast future claims payments. [Figure: Claims development and chain-ladder forecast of the RAA data set using the Mack method] The package started out of presentations given...

I have become quite a big fan of graphics that combine the features of traditional figures (e.g. bar charts, histograms, etc.) with tables. That is, the combination of numerical results with a visual representation has been quite useful for exploring descriptive statistics. I have wrapped two of my favorites (built around ggplot2) and included them as part

In my last few posts, I have considered “long-tailed” distributions whose probability density decays much more slowly than standard distributions like the Gaussian. For these slowly-decaying distributions, the harmonic mean often turns out to be a much better (i.e., less variable) characterization than the arithmetic mean, which is generally not even well-defined theoretically for these distributions. Since the harmonic...
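As a small illustration of the two quantities being compared, the sketch below computes the arithmetic and harmonic means of a toy sample in R. The function name harmonic_mean and the sample values are assumptions made for this example, not taken from the post; the one large value merely stands in for a long tail.

```r
# Harmonic mean: reciprocal of the arithmetic mean of the reciprocals.
# Assumes all values are strictly positive.
harmonic_mean <- function(x) {
  stopifnot(all(x > 0))
  1 / mean(1 / x)
}

# Illustrative sample with one large value standing in for a long tail.
x <- c(1, 2, 4, 1000)

arithmetic <- mean(x)
harmonic <- harmonic_mean(x)

print(c(arithmetic = arithmetic, harmonic = harmonic))
```

In this toy sample the single large value pulls the arithmetic mean far above the bulk of the data, while the harmonic mean stays close to the small values.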

R version 2.14 introduced a new package, called parallel. This new package combines the functionality from two previous packages: snow and multicore. Since I was using multicore to parallelise my computations, I had to migrate to the new package and decided to publish some code. Often trading strategies are tested using the daily closing price
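A minimal sketch of the two interfaces that the parallel package brings together, assuming a trivial stand-in task. The function square, the worker count, and the input range are hypothetical choices for illustration; this is not the migration code from the post.

```r
library(parallel)

# Hypothetical per-task function standing in for a real computation.
square <- function(i) i^2

# Start a small socket cluster; 2 workers is an arbitrary choice.
cl <- makeCluster(2)

# parLapply is the snow-style interface; inputs are split across workers.
res_cluster <- parLapply(cl, 1:8, square)
stopCluster(cl)

# mclapply is the multicore-style (fork-based) interface. mc.cores = 1 is
# used here so that the example also runs on Windows, where forking is
# unavailable.
res_fork <- mclapply(1:8, square, mc.cores = 1)

print(unlist(res_cluster))
```

On Unix-like systems, raising mc.cores lets mclapply fork several workers without setting up a cluster object first.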

Here's a cool application of calendar heat maps: runner Andy used R to catalogue his daily running mileage over the last 2+ years: There are lots of ways to chart data like this (a simple time-series chart, for example), but sometimes looking at data in new ways offers fresh perspectives. For example, Andy notes: "Apparently I missed running on...

Revolution Analytics CTO David Champagne visited Hadoop World 2011 this week, and delivered a presentation on "The Powerful Marriage of R and Hadoop" to a standing-room-only crowd of R and Hadoop enthusiasts. I've included David's slides below: The talk also generated praise on Twitter, for example: David was also interviewed by The Cube during the conference. In the video...

Small changes in the input assumptions often lead to very different efficient portfolios constructed with mean-variance optimization. I will discuss Resampling and Covariance Shrinkage Estimator – two common techniques to make portfolios in the mean-variance efficient frontier more diversified and immune to small changes in the input assumptions. Resampling was introduced by Michaud in Efficient
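The general idea behind covariance shrinkage can be sketched in a few lines of R: blend the sample covariance matrix with a more structured target. Everything specific here is an assumption for illustration (the simulated returns, the diagonal target, and the fixed intensity lambda); the post's own estimator is not shown in this excerpt, and estimators used in practice typically derive the intensity from the data.

```r
set.seed(1)

# Simulated returns for 3 hypothetical assets (illustrative data only).
returns <- matrix(rnorm(60 * 3, sd = 0.05), ncol = 3)

sample_cov <- cov(returns)

# Shrinkage target: keep the sample variances, set covariances to zero.
target <- diag(diag(sample_cov))

# Fixed shrinkage intensity, chosen arbitrarily for illustration.
lambda <- 0.3

# Linear shrinkage: weighted average of the sample matrix and the target.
shrunk_cov <- (1 - lambda) * sample_cov + lambda * target

print(round(shrunk_cov, 5))
```

With this target the variances are unchanged and every off-diagonal covariance is scaled by (1 - lambda), which damps the noisiest inputs to the optimizer.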

I asked my research group recently what they wished they had learned before they started work on a PhD. Here are some of the responses. More mathematics. Particular topics they named included real analysis, functional analysis, measure theory, algebra, linear algebra. That would have been my response also. I still wish I knew more mathematics than

Casting doubt on the possibility of mean reversion in the S&P 500 lately. Previously: a look at volatility estimates in “The mystery of volatility estimates from daily versus monthly returns” led to considering the possibility of autocorrelation in the returns. I estimated an AR(1) model through time and added a naive confidence interval to the …
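A minimal sketch of fitting an AR(1) model and attaching a naive confidence interval, in R. The simulated series, its true coefficient of 0.5, and the 1.96 multiplier are assumptions for illustration; the post uses S&P 500 returns and estimates the model through time, which would mean repeating a fit like this over successive windows.

```r
set.seed(42)

# Simulated series with a known AR(1) coefficient of 0.5 (illustrative).
x <- arima.sim(model = list(ar = 0.5), n = 500)

# Fit an AR(1) model with an intercept.
fit <- arima(x, order = c(1, 0, 0))

phi <- unname(coef(fit)["ar1"])
se <- sqrt(fit$var.coef["ar1", "ar1"])

# Naive 95% interval: estimate plus or minus 1.96 standard errors.
ci <- phi + c(-1.96, 1.96) * se

print(c(estimate = phi, lower = ci[1], upper = ci[2]))
```

An interval that excludes zero would suggest autocorrelation in the series; a negative coefficient is the pattern usually read as mean reversion.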