Using R in Insurance at GIRO 2012

September 17, 2012
Every year the UK’s general insurance actuarial community organises a big conference, which they call GIRO, short for General Insurance Research Organising committee. This year's conference is in Brussels from 18 - 21 September 2012. Despite the fac...

Copulas and tail dependence, part 1

September 17, 2012
As mentioned in the course last week, Venter (2003) suggested nice functions to illustrate tail dependence (see also some slides used in Berlin a few years ago). Joe (1990)'s lambda: Joe (1990) suggested a (strong) tail dependence index. For lower t...

Why are some things easier to forecast than others?

September 17, 2012
Forecasters are often met with skepticism. Almost every time I tell someone that I work in forecasting, they say something about forecasting the stock market, or forecasting the weather, usually suggesting that such forecasts are hopelessly inaccurate. In fact, forecasts of the weather are amazingly accurate given the complexity of the system, while anyone claiming to forecast the stock...

Permanent Portfolio

September 17, 2012
First, just a quick update: I’m moving the release date of the SIT package a few months down the road, probably in November. Now back to the post. Recently I came across a series of interesting posts about the Permanent Portfolio at the GestaltU blog. Today I want to show you how to back-test the

In search of large ice floes

September 17, 2012
In search of large ice floes.

INLA functions (yet again)

September 17, 2012
This links back to previous posts here and here. Earlier today, I had a quick chat with Michela (by email, actually) on this topic. In particular, she was trying to use the function I've written to compute summaries from the posterior distrib...

Start your new relationship with data together with Roger Peng and 30000 other students

September 17, 2012
A week from today (on September 24) Coursera, an education technology company committed to making education freely available to any person who seeks it, is launching their online course “Computing

Podcast interview with Michael Kane

September 17, 2012
In this podcast interview with Michael Kane, Data Scientist and Associate Researcher at Yale University, Michael discusses the R statistical programming language, computational challenges associated with big data, and two projects involving data analysis he conducted on the stock market "flash crash" of May 6, 2010, and the tracking of transportation routes of bird flu H5N1. Michael also...

Simple Parallel randomForest with doMC package

I have been exploring how to speed up some of my R scripts and have started reading about some amazing corners of R. My first weapons were the Rcpp and RcppArmadillo packages. These are wonderful tools, and even for someone who has never written C++ before, there are enough examples and documentation to get started. I...

Example 10.2: Custom graphic layouts

September 17, 2012
In example 10.1 we introduced data from a CPAP machine. In brief, it's hard to tell exactly what's being recorded in the data set, but it seems to be related to the pattern of breathing. Measurements are taken five times a second, leading to on the o...

Tips for Making R User Group Videos

September 17, 2012
Today's guest post is from Ron Fredericks, videographer and co-founder of LectureMaker, LLC — ed. I was initially surprised to find R user groups (RUGs) so popular. I filmed my first R session during the 2009 Predictive Analytics World in San Francisco. I filmed several more R user sessions over the past three years along with business/science clients and...

September 17, 2012
I first experimented with word clouds several years ago and used them to visualise the speeches of Kevin Rudd and Malcolm Turnbull. I have now learned from the Fell Stats blog (via R-Bloggers) that there is an R package for generating word clouds.  The package makes use of tm, a text mining package for R, which I have been

Olympic predictions – from an R web service provider’s point of view

September 17, 2012
Hello, world! Back in July we read Markus Gesmann’s great blog post about a prediction for the 100m final in London. Soon we decided to create similar estimates about the forthcoming events and started to post our results on Facebook. We would like to emphasise again that these kinds of extrapolated estimates are just for fun and we also think...

Variability of garch estimates

September 17, 2012
Not exactly pin-point accuracy. Previously: two related posts are “A practical introduction to garch modeling” and “garch and long tails”. Experiment: 1000 simulated return series were generated. The garch(1,1) parameters were alpha=.07, beta=.925, omega=.01. The asymptotic variance for this model is 2. The half-life is about 138 days. The simulated series used a Student’s t distribution …
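
The two summary figures quoted in this teaser can be checked from the stated parameters. The sketch below is only an illustration in Python (the original post works in R) and assumes the standard garch(1,1) formulas: asymptotic variance omega / (1 - alpha - beta) and half-life log(0.5) / log(alpha + beta). The function name `garch11_summary` is hypothetical and does not come from the post.

```python
import math


def garch11_summary(alpha, beta, omega):
    """Return (asymptotic variance, half-life) of a garch(1,1) model.

    Assumes the standard formulas; they only apply when alpha + beta < 1.
    """
    persistence = alpha + beta
    if not 0 < persistence < 1:
        raise ValueError("alpha + beta must lie strictly between 0 and 1")
    # Long-run (unconditional) variance of the process.
    asymptotic_variance = omega / (1 - persistence)
    # Number of periods for a variance shock to decay by half.
    half_life = math.log(0.5) / math.log(persistence)
    return asymptotic_variance, half_life


# Parameters quoted in the teaser.
variance, half_life = garch11_summary(alpha=0.07, beta=0.925, omega=0.01)
print(round(variance, 6), round(half_life, 1))
```

With alpha + beta = 0.995 the persistence is very close to 1, which is why the half-life comes out at well over a hundred days.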

Create Beamer/knitr Lecture Slideshow with Bash, Explain the Script with knitr

September 17, 2012
Setting up a beamer slideshow is tedious. Creating new slideshows with the same header/footer/style files every week for your course lectures is very very tedious. To solve this problem I created a simple bash shell script. When you run the script in...

September 17, 2012
Metadata! Metadata is very cool. It's super hot right now - everybody is talking about it. Okay, maybe not everyone, but it's an important part of archiving scholarly work. We are working on a repo on GitHub rmetadata to be a one stop shop for quer...

Online Questionnaire & Report Generation with Google Drive & R

September 17, 2012
Here's how I did it in 3 easy steps: (1) Set up a form in Google Docs/Drive. (2) Choose "Actions" and "Embed in Website" to get the URL for the iframe and put it in a post, like below. Then, go to the spreadsheet view of the form on Google Docs/Drive a...

Etymology

September 16, 2012
Chris and I started this blog as an outlet for the work we were already doing every day: writing code and trying to avoid forgetting how we wrote it. To that end, gist.github.com is an extremely useful resource, and this blog allows us to add a little ...

Changes in optimization performance of gcc over time

September 16, 2012
The SPEC benchmarks came out a year after the first release of gcc (in fact gcc was and still is one of the programs included in the benchmark). Compiling the SPEC programs using the gcc option -O2 (sometimes -O3) has always been the way to measure gcc performance, but after 25 years does this way

The R-Podcast Episode 10: Adventures in Data Munging Part 2

September 16, 2012
I’m happy to present episode 10 of the R-Podcast! Season 1 of the R-Podcast concludes with part 2 of my series on data munging, in which I discuss issues surrounding importing data sets contained in HTML tables. I share how I used the XML and RCurl packages to validate and import data from hockey-reference.com for

What’s the smallest amount you can’t make with 5 coins?

September 16, 2012
My amazing, awesome wife often comes up with little puzzles for our amazing children, and this one seemed destined to be solved in R. So, using up to 5 coins (1p, 2p, 5p, 10p, 20p and 50p), first she asked our kids whether they could make every val...
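
The question in the title can be answered by brute force. The sketch below is an illustration in Python rather than the author's R solution, and it assumes that each denomination may be used more than once and that "up to 5 coins" means between 1 and 5 coins in total. The helper names are hypothetical.

```python
from itertools import combinations_with_replacement

# Denominations listed in the teaser, in pence.
COINS = (1, 2, 5, 10, 20, 50)


def reachable_amounts(coins, max_coins):
    """Return every total that can be made with 1..max_coins coins."""
    amounts = set()
    for n in range(1, max_coins + 1):
        # Coins may repeat, so enumerate multisets of size n.
        for combo in combinations_with_replacement(coins, n):
            amounts.add(sum(combo))
    return amounts


def smallest_unreachable(coins, max_coins):
    """Return the smallest positive amount that cannot be made."""
    amounts = reachable_amounts(coins, max_coins)
    target = 1
    while target in amounts:
        target += 1
    return target


print(smallest_unreachable(COINS, 5))
```

Six denominations and at most five coins give only a few hundred combinations, so full enumeration is fast and no dynamic programming is needed here.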

New version of devtools: 0.8

September 16, 2012
We’re pleased to announce a new version of devtools, the package that makes R package development easy. The main features in this version are: A complete rewrite of the code loading system which simulates namespace loading much more accurately – this means using load_all is much closer to installing and loading the package. It also

Confidence Regions for Regression Coefficients

September 16, 2012
Let’s consider the usual linear regression model, with the full set of assumptions: y = Xβ + ε ; ε ~ N[0, σ²Iₙ] , (1) where X is a non-random (n × k) matrix with full column rank. Recall that, under our usual set of assumptions...

Football model

September 16, 2012
After reading Dutch football data (Eredivisie 2011-2012) and making a predictions display, it is time to look at a few simple models to predict goals. To reiterate the data setup, each game played consists of two rows in the data frame. ...

World Cup 2006 First Goal R Analysis

September 16, 2012
Quite a while ago my amazing wife asked me if it was possible to find the time of the first goal for the 2006 FIFA World Cup matches.  I was using R at the time and thought it was possible.  Here are the scripts I wrote to scrape the info fro...

California High School Graduation and Dropout Rates

September 16, 2012
Abstract: The California Department of Education recently (June 2012) issued a news release on the increase in high school (grades 9-12) graduation rates and decrease in dropout rates. The data used by the Department was from two cohorts (4-year periods) o...

project-euler–problem 65

September 16, 2012
The square root of 2 can be written as an infinite continued fraction. $$\sqrt{2} = 1+\frac{1}{2+\frac{1}{2+\frac{1}{2+\frac{1}{2+\cdots}}}}$$ The infinite continued fraction can be written √2 = [1;(2)], where (2) indicates that 2 repeats ad infinitum. In a similar way, √23 = [4;(1,3,1,8)].
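
To see how the repeating 2s in the expansion of √2 turn into rational approximations, here is a minimal Python sketch (for illustration only; it is not the post's own code). It assumes the standard recurrence for convergents, h_n = a_n·h_{n-1} + h_{n-2} and k_n = a_n·k_{n-1} + k_{n-2}. The function name `convergents` is hypothetical.

```python
from fractions import Fraction


def convergents(terms):
    """Return the successive convergents of the continued fraction
    [terms[0]; terms[1], terms[2], ...] as exact fractions."""
    # Seed values for the numerator (h) and denominator (k) recurrences.
    h_prev, h = 1, terms[0]
    k_prev, k = 0, 1
    result = [Fraction(h, k)]
    for a in terms[1:]:
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        result.append(Fraction(h, k))
    return result


# First ten terms of the expansion of sqrt(2): a leading 1, then 2s.
sqrt2_terms = [1] + [2] * 9
for c in convergents(sqrt2_terms):
    print(c, float(c))
```

The printed fractions close in on √2 from alternating sides. The same function accepts the terms of any other expansion, for example a leading 4 followed by repetitions of 1, 3, 1, 8 for √23.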