Coming out of the (Bayesian) closet: multivariate version

November 7, 2011

This week I’m facing my (and many other lecturers’) least favorite part of teaching: grading exams. In a supreme act of procrastination I will continue the previous post, and the antepenultimate one, showing the code for a bivariate analysis of a randomized …

Web Scraping Google URLs

November 7, 2011

Google slightly changed the html code it uses for hyperlinks on search pages last Thursday, thus causing one of my scripts to stop working. Thankfully, this is easily solved in R thanks to the XML package and the power and simplicity of XPath expressions: Lovely jubbly! P.S. I know that there is an API of
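
The excerpt does not show the script itself, so here is only a minimal sketch of the general idea: parse a page with the XML package and pull out link targets with an XPath expression. The HTML snippet, the `h3` class name and the example.com URLs are all made up for illustration; the markup Google actually serves is not reproduced here.

```r
# Requires the XML package: install.packages("XML")
library(XML)

# Hypothetical stand-in for a search results page (not real Google markup).
html <- '<html><body>
  <h3 class="r"><a href="http://example.com/a">First result</a></h3>
  <h3 class="r"><a href="http://example.com/b">Second result</a></h3>
  <p><a href="http://example.com/ignore">Footer link</a></p>
</body></html>'

# Parse the string as HTML rather than treating it as a file name.
doc <- htmlParse(html, asText = TRUE)

# Select only anchors that sit directly inside <h3 class="r">,
# then read the href attribute of each matching node.
links <- xpathSApply(doc, "//h3[@class='r']/a", xmlGetAttr, "href")
print(links)
```

When a site changes its markup, usually only the XPath string needs updating, which is what makes this approach easy to repair.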

Code Optimization: One R Problem, Eleven Solutions – Now Thirteen!

November 7, 2011

Following up from my previous post “Code Optimisation: One R Problem, Ten Solutions – Now Eleven!” I figured out a twelfth solution after writing that blog post. Furthermore, half way through writing this blog post I figured out a thirteenth solution too. As a recap, the problem is taken from rwiki where the goal is to find

project euler-Problem 41

November 7, 2011

We shall say that an n-digit number is pandigital if it makes use of all the digits 1 to n exactly once. For example, 2143 is a 4-digit pandigital and is also prime. What is the largest n-digit pandigital prime that exists?
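
One possible brute-force approach, sketched in base R (this is not necessarily how the linked post solves it): skip any n whose digit sum 1 + ... + n is divisible by 3, because every permutation of those digits is then divisible by 3, and search the remaining permutations from the largest value down. The helper names `is_prime`, `permutations` and `largest_pandigital_prime` are my own.

```r
# Trial-division primality test; adequate for numbers of up to 9 digits.
is_prime <- function(n) {
  if (n < 2) return(FALSE)
  if (n < 4) return(TRUE)
  if (n %% 2 == 0) return(FALSE)
  d <- 3
  while (d * d <= n) {
    if (n %% d == 0) return(FALSE)
    d <- d + 2
  }
  TRUE
}

# All permutations of a vector, one per row (simple recursion).
permutations <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) {
    cbind(v[i], permutations(v[-i]))
  }))
}

largest_pandigital_prime <- function() {
  for (n in 9:1) {
    # A digit sum divisible by 3 rules out every permutation at once.
    if (n > 1 && sum(1:n) %% 3 == 0) next
    perms <- permutations(n:1)
    # Turn each row of digits into a number.
    values <- as.vector(perms %*% 10^((n - 1):0))
    for (x in sort(values, decreasing = TRUE)) {
      if (is_prime(x)) return(x)
    }
  }
  NA
}

result <- largest_pandigital_prime()
print(result)
```

The divisibility shortcut eliminates n = 9 and n = 8 immediately, so the search starts with the 5040 permutations of seven digits.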

Bayesian modeling using WinBUGS

November 6, 2011

Yes, yet another Bayesian textbook: Ioannis Ntzoufras’ Bayesian modeling using WinBUGS was published in 2009 and it got an honourable mention at the 2009 PROSE Award. (Nice acronym for a book award! All the mathematics books awarded that year were actually statistics books.) Bayesian modeling using WinBUGS is rather similar to the more recent Bayesian

Rcpp talk at Seattle RUG next month

November 6, 2011

The Seattle R User Group was kind enough to invite me to give a talk about R, C++ and Rcpp. So if you can make it to the Thomas building of the Fred Hutchinson Cancer Research Center in Seattle, WA, on December 7, I would love to see you there. I ha...

More colour wheels

November 5, 2011

In response to my post about colour wheels, I received a suggested enhancement from Drew. The idea is to first match colours based on the text provided and then add nearby colours. This can be done by ordering colours in terms of hue, saturation, and value. The result is a significant improvement and it will capture all of
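
A rough base-R sketch of that idea, under my own assumptions about the details: sort R's named colours by hue, saturation and value, find the names that match the text, and also return their neighbours in the sorted order. The function name `nearby_colours` and the `width` parameter are hypothetical and not taken from the post.

```r
# Named colours, dropping names that share identical RGB values.
all_cols <- colours(distinct = TRUE)

# rgb2hsv() returns a matrix with rows "h", "s" and "v", one column per colour.
hsv_mat <- rgb2hsv(col2rgb(all_cols))

# Order by hue first, then saturation, then value.
sorted <- all_cols[order(hsv_mat["h", ], hsv_mat["s", ], hsv_mat["v", ])]

# Return colours whose name matches `pattern`, plus `width` neighbours
# on either side of each match in the HSV ordering.
nearby_colours <- function(pattern, width = 2) {
  hits <- grep(pattern, sorted)
  idx <- unique(unlist(lapply(hits, function(i) (i - width):(i + width))))
  idx <- sort(idx[idx >= 1 & idx <= length(sorted)])
  sorted[idx]
}

res <- nearby_colours("orchid")
print(res)
```

Sorting primarily on hue means the added neighbours tend to be similar shades even when their names share no text with the search term.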

Kaplan-Meier Survival Plot – with at risk table

November 5, 2011

Credit for the bulk of this code goes to Abhijit Dasgupta and the commenters on the original post here from earlier this year. I have made a few changes to the functionality which I think warrant sharing. A brief …

Next Level Web Scraping

November 5, 2011

The outcome presented above will not be very useful to most of you. Still, it could be a good example of what can be done via web scraping in R. Background: TIRIS is the federal geo-statistical service of North-Tyrol, Austria. One of many g...

Vectors (CloudStat)

November 5, 2011

The simplest type of data object in R is a vector, which is simply an ordered set of values. Some further examples of creating vectors are shown below: Input: 1:20 Output: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 This creates...
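
For readers who want a few more ways to build vectors, here is a small base-R sketch alongside the `1:20` example above. The variable names are arbitrary.

```r
# An integer sequence, as in the excerpt.
a <- 1:20

# Combine individual values with c().
b <- c(2.5, 4, 10)

# A sequence with a fixed step.
s <- seq(0, 1, by = 0.25)

# Repeat a pattern.
r <- rep(c("x", "y"), times = 2)

# Vectors are ordered, so elements can be picked by position.
third <- a[3]

print(length(a))
print(s)
print(r)
```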

#2 Data Classes (CloudStat)

November 5, 2011

As stated in the CloudStat Intro, CloudStat is based on the R language, an object-oriented language in which everything is an object. Each object has a class. The simplest data objects are one-dimensional arrays called vectors, consisting of any nu...
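
To make "each object has a class" concrete, here is a short sketch in plain R (CloudStat itself is not needed to run it). The particular objects are just examples I picked.

```r
# A handful of objects of different kinds.
objects <- list(
  1:3,                   # integer vector
  c(1.5, 2),             # double vector
  "a",                   # character vector
  TRUE,                  # logical vector
  factor(c("a", "b")),   # factor
  data.frame(x = 1)      # data frame
)

# Ask each object for its class.
classes <- sapply(objects, class)
print(classes)
```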

The Joy of R: A Feline Guide

November 5, 2011

Just because it’s caturday. Images by Mario Pineda-Krch (CC BY-NC-SA 3.0). This is from the “Mario’s Entangled Bank” blog (http://pineda-krch.com) of Mario Pineda-Krch, a theoretical biologist at the University of Alberta. Filed under: cats, computing, humour, R, Sweave

Colour wheels in R

November 5, 2011

Regular readers will know I use the R package to produce most of the charts that appear here on the blog. Being more quantitative than artistic, I find choosing colours for the charts to be one of the trickiest tasks when designing a chart, particularly as R has so many colours to choose from. In

Data Referenced Journalism and the Media – Still a Long Way to Go Yet?

November 4, 2011

Reading our local weekly press this evening (the Isle of Wight County Press), I noticed a page 5 headline declaring “Alarm over death rates at St Mary’s”, St Mary’s being the local general hospital. It seems a Department of Health report on hospital mortality rates came out earlier this week, and the Island’s hospital, it

Unit root versus breaking trend: Perron’s criticism

November 4, 2011

I came across an ingenious simulation by Perron during my time-series lecture which I thought was worth sharing. The idea is to test your model further for a breaking trend before accepting the null hypothesis of a unit root. Let me try to illustrate this in simple language. A non-stationary time series is one that has its mean changing...

Generating PPC Keywords in R – Part 2

November 4, 2011

In a previous post, I discussed how to generate PPC keywords in R. In this post I will provide another example of how to perform this task. Let’s say that I am an auto insurance company that only operates in the state of Illinois. I’m planning on bidding on keywords in Bing and Google which
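
The post's own code is not shown in this excerpt, so the following is only one way the task might be approached in base R: cross a few seed lists with `expand.grid()` and paste the pieces together. The seed terms (locations, products, modifiers) are invented for illustration.

```r
# Hypothetical seed lists; a real campaign would supply its own terms.
locations <- c("illinois", "chicago", "springfield")
products  <- c("auto insurance", "car insurance")
modifiers <- c("", "cheap", "best")

# Every combination of modifier x location x product.
combos <- expand.grid(modifier = modifiers,
                      location = locations,
                      product  = products,
                      stringsAsFactors = FALSE)

# Paste the parts together; trimws() drops the leading space
# left behind by the empty modifier.
keywords <- trimws(paste(combos$modifier, combos$location, combos$product))

print(head(keywords))
print(length(keywords))
```

Adding a new location or modifier is then a one-word change to a seed list rather than a manual edit of the keyword list.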

Rdatamarket Tutorial

November 4, 2011

The good folks at DataMarket have posted a new tutorial on using the rdatamarket package (covered here in August) to easily download public data sets into R for analysis. The tutorial describes how to install the rdatamarket package, how to extract metadata for data sets, and how to download the data themselves into R. The tutorial also illustrates a...

match vs. %in%

November 4, 2011

match and %in% are two very commonly used functions in R. So, what's the difference between them? First, how to use them (copied from the R manual): match returns a vector of the positions of (first) matches of its first argument in its second. %in% is a ...
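
A tiny example may make the difference clearer: `match()` answers "where is it?", while `%in%` answers "is it there at all?". The vectors `x` and `lookup` are made up.

```r
x      <- c("b", "z", "a")
lookup <- c("a", "b", "c", "b")

# Positions of the first match of each element of x in lookup (NA if absent).
pos <- match(x, lookup)

# Logical membership test for each element of x.
found <- x %in% lookup

print(pos)
print(found)

# The two are directly related: %in% is TRUE exactly where match() is not NA.
same <- identical(found, !is.na(pos))
print(same)
```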

Confidence interval for predictions with GLMs

November 4, 2011

Consider a (simple) Poisson regression. Given a sample, the goal is to derive a 95% confidence interval for the predicted mean at a given covariate value. That is, we want a confidence interval for the prediction, not for the potential observation, i.e. the dot on the graph below:

> r=glm(dist~speed,data=cars,family=poisson)
> P=predict(r,type="response",
+ newdata=data.frame(speed=seq(-1,35,by=.2)))
> plot(cars,xlim=c(0,31),ylim=c(0,170))
> abline(v=30,lty=2)
...
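
The excerpt stops before any interval is computed. One common way to get such an interval (not necessarily the route the post takes) is a Wald interval on the link scale, back-transformed with `exp()`. The sketch below assumes the same `cars` model and a prediction at speed 30, the value marked by the dashed line in the plot call.

```r
# Same model as in the excerpt: Poisson regression of dist on speed.
fit <- glm(dist ~ speed, data = cars, family = poisson)

# Predict on the link (log) scale and ask for the standard error.
new <- data.frame(speed = 30)
pr  <- predict(fit, newdata = new, type = "link", se.fit = TRUE)

# Normal-approximation interval on the log scale, then back-transform.
z  <- qnorm(0.975)
ci <- exp(as.numeric(pr$fit) +
            c(lower = -1, estimate = 0, upper = 1) * z * as.numeric(pr$se.fit))
print(ci)
```

Building the interval on the log scale keeps both bounds positive, which a symmetric interval on the response scale would not guarantee.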

Factor to class-membership matrix

November 4, 2011

Recently on R-bloggers I found a post from the chem-bla-ics blog concerning the conversion of factors to integer vectors. At the end it posed the problem of converting a factor variable to a class-membership matrix. In the comments several nice solutions were p...
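
The solutions from the comments are not reproduced in this excerpt. As an illustration of what a class-membership (indicator) matrix looks like, here are two base-R ways to build one from a toy factor; these are standard idioms, not necessarily the ones the post discusses.

```r
f <- factor(c("a", "b", "a", "c"))

# One column per level, 1 where the observation belongs to that level.
m <- sapply(levels(f), function(l) as.integer(f == l))
print(m)

# model.matrix() without an intercept gives the same 0/1 pattern
# (its columns are named fa, fb, fc).
mm <- model.matrix(~ f - 1)
print(mm)
```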

Help: stemming and stem completion with package tm in R

November 3, 2011

I came across the problem below when doing stemming and stem completion with package tm in R. The word “mining” was stemmed to “mine” with stemDocument(), and then completed to “miners” with stemCompletion(). However, I prefer to keep “mining” intact. For stemCompletion(), …

Webinar on Portfolio Rebalancing with R and Sybase

November 3, 2011

R users in the financial industry may be interested in the following webinar hosted by Revolution Analytics' partner Sybase on November 10: “Portfolio Rebalancing Using R and Sybase RAP for Intraday Risk Management”. With volatility and violent intraday swings becoming the new normal, intraday risk controls are now needed to not only reduce your exposures across multiple asset classes,...

By: Super Nerdy Cool » Build multiarch R (32 bit and 64 bit) on Debian/Ubuntu

have the 64 bit version of R compiled from source on my Ubuntu laptop. I recently had a need for R based on 32 bit since a package I

Modern Portfolio Optimization Theory: The idea

November 3, 2011

We were recently given a lecture (by Dr. Susan Thomas) on Harry Markowitz's portfolio optimization theory, and I was really fascinated by the Nobel laureate's story of how he found it difficult to convince his guide about the importance of h...

Variability of volatility estimates from daily returns

November 3, 2011

Investment Performance Guy has a post, “Periodicity of risk statistics (and other measures)”, which asks how valid volatility estimates from a month of daily returns are. Here is a quick look. Figure 1 shows the variability (and a 95% confidence interval) of volatility estimates for the S&P 500 index in January 2011. …

Maximizing Omega Ratio

November 3, 2011

The Omega Ratio was introduced by Keating and Shadwick in 2002. It measures the ratio of average portfolio wins over average portfolio losses for a given target return L. Let x.i, i= 1,…,n be weights of instruments in the portfolio. We suppose that j= 1,…,T scenarios of returns with equal probabilities are available. I will
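
As a sketch of the definition above (evaluation only, not the maximization the post goes on to describe), the Omega ratio for a fixed weight vector can be computed from a scenario matrix as mean gains above L divided by mean shortfalls below L. The function name `omega_ratio` and the tiny scenario matrix are made up for illustration.

```r
# Omega ratio of a portfolio for target return L.
# `returns` is a T x n matrix (scenarios in rows, instruments in columns),
# `weights` is a length-n vector; scenarios are equally likely.
omega_ratio <- function(weights, returns, L = 0) {
  port   <- as.vector(returns %*% weights)  # portfolio return per scenario
  wins   <- mean(pmax(port - L, 0))         # average gain above the target
  losses <- mean(pmax(L - port, 0))         # average shortfall below the target
  wins / losses
}

# Hypothetical data: T = 4 scenarios, n = 2 instruments.
returns <- matrix(c(0.02, -0.01,  0.03, -0.02,
                    0.01,  0.00, -0.01,  0.02), ncol = 2)
weights <- c(0.5, 0.5)

omega <- omega_ratio(weights, returns, L = 0)
print(omega)
```

Note that the ratio is undefined (division by zero) when no scenario falls below the target, so a real optimizer would need to guard against that case.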

Some Simple but Propably Useful Regex Examples with R-Package stringr…

November 3, 2011

I found that examples of regex use in R are rather rare. Thus, I will provide some examples from my own learning materials, mostly stolen from the help pages, with small but maybe illustrative adaptations. PS: I will extend this list of examples...
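
The post's stringr examples are not included in this excerpt. To give a flavour of the kind of task involved without requiring any package, here are a few base-R regex calls; stringr offers wrappers for the same jobs (`str_detect`, `str_extract`, `str_replace_all`). The sample strings are invented.

```r
x <- c("apple pie", "banana split", "cherry tart 2011")

# Detect: which strings contain "an"?
has_an <- grepl("an", x)

# Replace with a capture group: keep only the first word.
first_word <- sub("^(\\w+) .*$", "\\1", x)

# Extract: pull out the first run of digits (non-matching strings are dropped).
digits <- regmatches(x, regexpr("[0-9]+", x))

# Replace all: strip vowels.
no_vowels <- gsub("[aeiou]", "", x)

print(has_an)
print(first_word)
print(digits)
print(no_vowels)
```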

First thoughts on R

November 2, 2011

Having worked just a little with R, I have some first impressions to share. I'll give you some links to resources I found helpful with writing the previous project. First, the documentation is not very good. I struggled on previous attempts to figure things out. I still find it a crap shoot when I Google, looking for an answer....
