AIB Stock Price, EGARCH-M, and rgarch

May 17, 2011

This post examines conditional heteroskedasticity models using daily stock price data for Allied Irish Banks (AIB): how to test a series for conditional heteroskedasticity, how to approach model specification and estimation when volatility is time-varying, and how to forecast with these models. All of this is done in R, ...

Read more »

In case you missed it: April Roundup

May 17, 2011

In case you missed them, here are some articles from April of particular interest to R users. The Heritage Health Prize, a competition to build predictive models for hospitalization with US$3.2M in prizes, is open. The Inside-R.org community site now provides the ability to search and view the help files for CRAN packages. Revolution R Enterprise 4.3 released: R...

Read more »

TreeBASE in R: a first tutorial

May 16, 2011

My TreeBASE R package is essentially functional now. Here’s a quick tutorial on the kinds of things it can do. Grab the treebase package here, then install and load the library into R. TreeBASE provides two APIs to query the database, one which searches by the metadata associated with different publications (called OAI-PMH), and another which

Read more »

A survey of [the 60's] Monte Carlo methods

May 16, 2011

“The only good Monte Carlos are the dead Monte Carlos” (Trotter and Tukey, quoted by Halton). When I presented my history of MCM methods in Bristol two months ago, at the Julian Besag memorial, Christophe Andrieu mentioned a 1970 SIAM survey by John Halton on A retrospective and prospective survey of the Monte Carlo

Read more »

My first ‘R’ plot

May 16, 2011

Started learning 'R'. My first attempt was to plot data from the Forbes 1000 list (refer to the exercise posted by Prasoon Sharma). Here is a bubble chart showing the Forbes top 25 companies by market capitalization. Source code: ## read the csv file FORBES...

Read more »

Day #38-39 Data-manipulation Part 1

Last week I created some plots, always for 1 feature. Today I started working on the full script that creates all these plots, 1 per feature. This means using for loops in R. Let’s see how this is going to work out. Today I mostly worked on data...

Read more »

Omega as Optimizer

May 16, 2011

During Jan Straatman’s presentation, I tweeted: “Jan Straatman #cfa2011 In real life no normal distributions so use omega function to optimize actual returns”. After the presentation, I asked Jan his second choice for optimization after Omega, and he re...

Read more »

Get Daily R tips on Twitter

May 16, 2011

John D Cook, editor of the always-interesting and eclectic blog The Endeavour, has been posting facts about Statistics and distribution theory to the StatFact Twitter account on a daily basis for over a year now. He also curates a number of other daily tip services and the newest one — RLangTip — promises daily tips about using the R...

Read more »

Technical analysis of Kansas out-of-state Electronic Benefit Transfer payments

May 16, 2011

Computer Assisted Reporting: Ryan Kath from NBC Action News in Kansas City obtained three months of summary data for out-of-state public assistance payments from the Kansas Department of Social and Rehabilitation Services. These electronic reports wer...

Read more »

Mapping points

May 16, 2011

Since I look at mercury concentrations at different measurement stations in North America, visualization using a map with values (of your favourite parameter) plotted as colour-coded circles is quite useful. After some trial & error, here is some very basic code …

Read more »

Example 8.38: WriteXLS to create spreadsheets

May 16, 2011

In our last entry, we described reading Excel files. In this entry, we do the opposite: write native Excel files. In R, the WriteXLS package provides this functionality. It uses Perl to do the heavy lifting, and the main complication is to install th...

Read more »

Reshape Package in R: Long Data format, to Wide, back to Long again

May 16, 2011

In this post, I describe how to use the reshape package to modify a data frame from a long data format to a wide format, and then back to a long format again. It’ll be an epic journey; some of us may not survive (especially me!). Wide versus Long Data Formats: I’ll begin by describing what

Read more »

R Tutorial: Add confidence intervals to dotchart

May 15, 2011

Recently I was working on a data visualization project. I wanted to visualize summary statistics by category of the data. Specifically, I wanted to see a simple dispersion of data with confidence intervals for each category of data. R i...

Read more »

Why method of moments doesn’t always work

May 15, 2011

A number of years ago, someone asked me "why does my company need actuaries to fit curves, once I have the mean and standard deviation of my losses, isn't that enough?" I explained to him that not every distribution is completely determined by its mean...

Read more »

R-Bloggers

May 15, 2011

This is my first post on the R-Bloggers feed. R-Bloggers is an excellent collection of R-related blogs and sites for R enthusiasts. Add it to your bookmark list, for those who haven’t already done so, and my thanks to those who maintain the site ...

Read more »

Cointegration, R, Irish Mortgage Debt and Property Prices

May 15, 2011

As a follow-up to my post examining the stationarity of the new property price index, this post will briefly look at some of the dynamics of mortgage debt and property prices; all data is monthly, from the beginning of 2005 to March 2011. This will also serve as an illustration of the ‘vars’ and ‘urca’

Read more »

Le Monde puzzle [#14.2]

May 14, 2011

I have at last received my weekend edition of Le Monde and hence the solution proposed by the authors (Cohen and Busser) to puzzle #14. They obtain a strategy that requires at most 19 steps. The idea is to start with a first test, which gives a reference score S0, and then work on

Read more »

Read zipped file into R

May 14, 2011

Sometimes I do not want to unzip files before reading them into R. There is a nice way of reading a zipped file (via a tmp dir) into R, where the file test.csv is actually located at ~/files/myzip.zip/test.csv.
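
The teaser does not show the post's own code, so what follows is only a minimal sketch of one way to read a zipped file via a temporary directory. The helper name read_zipped_csv is hypothetical; it relies on utils::unzip and utils::read.csv from base R, and the archive path in the usage comment is the one mentioned in the text.

```r
# Hypothetical helper (not from the original post): extract a single file
# from a zip archive into a scratch directory, then read it as CSV.
read_zipped_csv <- function(zip_path, inner_file, ...) {
  tmp_dir <- tempfile("unzipped_")   # unique path under the session temp dir
  dir.create(tmp_dir)
  # Remove the scratch directory once the data are in memory.
  on.exit(unlink(tmp_dir, recursive = TRUE), add = TRUE)

  # unzip() returns the paths of the files it extracted.
  extracted <- utils::unzip(path.expand(zip_path),
                            files = inner_file,
                            exdir = tmp_dir)
  if (length(extracted) == 0) {
    stop("File not found in archive: ", inner_file)
  }
  utils::read.csv(extracted[1], ...)
}

# Usage, assuming ~/files/myzip.zip contains test.csv:
# dat <- read_zipped_csv("~/files/myzip.zip", "test.csv")
```

Extracting into a fresh temporary directory keeps the working directory clean, and on.exit() removes the extracted copy even if read.csv fails.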

Read more »

The New Irish House Price Index

May 14, 2011

On Friday, the CSO released a new house (and apartment) price index, for the national, Dublin, and national excluding Dublin regions. The release has been noted and covered by the great Irish Economy and Namawinelake blogs. I want to briefly look at some of the statistical properties of this series in more detail. Below is

Read more »

Potential Output and the Irish Output Gap

May 14, 2011

One prominent feature of early degree-level macroeconomics courses is the concept of ‘potential output’, which one could roughly define as the level of output (GDP) at which inflation is not ‘accelerating’. Potential output is of interest to macroeconomists when analysing the question of output gaps and macroeconomic stabilisation policies by governments, whether that be in

Read more »

timezone issue in R

May 14, 2011

While investigating the paper “Intraday patterns in FX returns and order flow”, I faced a problem with time zones. I had 3 data sources with different time zones (GMT, CET, CEST). The most confusing thing was that I didn’t know how to deal with summer time. But why did I have the data with summer time in the first place?
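
As a rough illustration of the summer-time issue (not the post's own solution), the sketch below converts local clock readings to GMT. It assumes the CET/CEST sources follow a rule-based zone such as "Europe/Berlin", which is an assumption and not something stated in the post; the helper name to_gmt and the timestamps are made up for the example.

```r
# Hypothetical helper: interpret a clock reading in its source zone and
# print it on a common GMT clock.
to_gmt <- function(stamp, source_tz) {
  t <- as.POSIXct(stamp, tz = source_tz)   # attach the source time zone
  format(t, "%Y-%m-%d %H:%M", tz = "GMT")  # re-express the instant in GMT
}

# "Europe/Berlin" is CET (UTC+1) in winter and CEST (UTC+2) in summer,
# so the same local noon maps to different GMT hours.
summer <- to_gmt("2011-05-13 12:00:00", "Europe/Berlin")
winter <- to_gmt("2011-01-14 12:00:00", "Europe/Berlin")

print(summer)
print(winter)
```

Using a named zone lets R apply the daylight-saving rules itself, which avoids hand-coding a one-hour shift for the summer months.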

Read more »

Friday fun projects

May 14, 2011

What’s a “Friday fun project”? It’s a small computing project, perfect for a Friday afternoon, which serves the dual purpose of (1) keeping your programming/data analysis skills sharp and (2) providing a mental break from the grind of your day job. Ideally, the skills learned on the project are useful and transferable to your work

Read more »

Describing Data: Frequently Used Commands

May 13, 2011

Obtaining a coherent numerical summary of data is a common task, and it is common to want to port these summary statistics into a table of results. When I am in interactive mode with my data, I use the summary() command applied to my data frame. For ...
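
For illustration only, here is a small sketch of the two steps described: an interactive summary() call on a data frame, followed by one possible way to collect statistics into a table. The data frame and its column names are invented, and the table-building step is an assumption, since the teaser is cut off before the author's own approach.

```r
# Made-up data for the example.
df <- data.frame(
  height = c(150, 160, 170, 180),
  weight = c(50, 60, 70, 80),
  group  = factor(c("a", "a", "b", "b"))
)

# Interactive overview of every column.
print(summary(df))

# Collect selected statistics for the numeric columns into a matrix
# with one row per variable, ready to export as a table.
num <- df[sapply(df, is.numeric)]
tab <- t(sapply(num, function(v) {
  c(mean = mean(v), sd = sd(v), min = min(v), max = max(v))
}))
print(tab)
```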

Read more »

Because it’s Friday: French Press Heat Retention

May 13, 2011

While responding to this thread on Reddit I made a rough guess as to the heat retention of my French press when completely full of coffee. When I went to bed I realized there was no good reason why I …

Read more »

Review of 2011 Data Scientist Summit

May 13, 2011

Some time over the past 6 weeks I randomly saw a tweet announcing the “Data Scientist Summit” and shortly below it I saw that it would be held in Las Vegas at the Venetian. Being a Data Scientist myself is reason enough to not pass up this opportunity, but Vegas definitely sweetens the deal! On Wednesday I woke up...

Read more »