Project Euler – Problem 47

November 8, 2011

The first two consecutive numbers to have two distinct prime factors are: 14 = 2 × 7 ...

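To make the problem statement concrete, here is a minimal R sketch of one way to search for such runs. It is not the linked post's solution; the function names `count_distinct_prime_factors` and `first_run` are made up for this illustration, and plain trial division is assumed to be fast enough for small inputs.

```r
# Count the distinct prime factors of n by trial division.
count_distinct_prime_factors <- function(n) {
  count <- 0
  p <- 2
  while (p * p <= n) {
    if (n %% p == 0) {
      count <- count + 1
      # Strip every power of p so that it is counted only once.
      while (n %% p == 0) n <- n %/% p
    }
    p <- p + 1
  }
  # Whatever remains above 1 is a single prime factor larger than sqrt(n).
  if (n > 1) count <- count + 1
  count
}

# First number that starts a run of `run_length` consecutive integers,
# each with exactly `k` distinct prime factors (hypothetical helper).
first_run <- function(k, run_length) {
  n <- 2
  streak <- 0
  repeat {
    if (count_distinct_prime_factors(n) == k) {
      streak <- streak + 1
      if (streak == run_length) return(n - run_length + 1)
    } else {
      streak <- 0
    }
    n <- n + 1
  }
}

first_run(2, 2)  # 14, the pair quoted in the excerpt (14 = 2 × 7, 15 = 3 × 5)
```

Longer runs can be searched by changing the two arguments, although trial division gets slow as the numbers grow.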

Project Euler – Problem 44

November 8, 2011

Pentagonal numbers are generated by the formula Pn = n(3n − 1)/2. The first ten pentagonal numbers are: 1, 5, 12, 22, 35, 51, 70, 92, 117, 145, ...

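The formula translates directly into R. The sketch below is only an illustration of the definition, not the linked post's code; `pentagonal` and `is_pentagonal` are names chosen here. The membership test assumes the input is a positive number small enough for `sqrt()` to be exact in floating point.

```r
# n-th pentagonal number, P_n = n(3n - 1) / 2 (vectorised over n).
pentagonal <- function(n) n * (3 * n - 1) / 2

# x is pentagonal exactly when n = (1 + sqrt(24x + 1)) / 6 is a positive
# integer; this comes from solving 3n^2 - n - 2x = 0 for n.
is_pentagonal <- function(x) {
  n <- (1 + sqrt(24 * x + 1)) / 6
  n == floor(n)
}

pentagonal(1:10)          # 1 5 12 22 35 51 70 92 117 145
is_pentagonal(c(22, 23))  # TRUE FALSE
```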

Drawing polar centered spatial maps using ggplot2

November 8, 2011

Drawing maps of the polar regions can be done using square spatial maps. A small example says more than a thousand words:

xlim = c(-180,180)
ylim = c(60,90)
# Some fake grid data
dat_grid = expand.grid(x = xlim[1]:xlim[2], y…

The mystery of volatility estimates from daily versus monthly returns

November 8, 2011

What drives the estimates apart? Previously: a post by Investment Performance Guy prompted “Variability of volatility estimates from daily data”. In my comments on the original post I suggested that using daily data to estimate volatility would be equivalent to using monthly data, except with less variability. Dave, the Investment Performance Guy, proposed the exquisitely …

Doing away with “unknown timezone” warnings

November 8, 2011

Timezone stuff can really drive you NUTS, at least if you’re sitting in front of a German Windows box. This is what I used to do to set my tz: And I always wondered why R would throw “unknown timezone” warnings: One day I found out that setting tz via `options()` was not enough as the …

Project Euler – Problem 32

November 8, 2011

We shall say that an n-digit number is pandigital if it makes use of all the digits 1 to n exactly once; for example, the 5-digit number, 15234, is 1 through 5 pandigital. The product 7254 is unusual, as the identity, 39 × 186 = 7254, containing multiplicand, multiplier, and product is 1 through...

Project Euler – Problem 31

November 8, 2011

In England the currency is made up of pound, £, and pence, p, and there are eight coins in general circulation: 1p, 2p, 5p, 10p, 20p, 50p, £1 (100p) and £2 (200p).

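Counting the ways to make an amount from a fixed set of coins is a classic dynamic-programming exercise. The R sketch below shows that general technique with the eight coins listed above; it is an assumption that the linked post does something similar, and `count_ways` is a name invented for this example.

```r
# Number of ways to make `target` pence from the given coin values,
# where the order of the coins does not matter.
count_ways <- function(target, coins) {
  ways <- numeric(target + 1)  # ways[i + 1] = number of ways to make i pence
  ways[1] <- 1                 # one way to make 0p: use no coins
  for (coin in coins) {
    if (coin > target) next    # guard: coin:target would count downwards
    for (amount in coin:target) {
      ways[amount + 1] <- ways[amount + 1] + ways[amount - coin + 1]
    }
  }
  ways[target + 1]
}

coins <- c(1, 2, 5, 10, 20, 50, 100, 200)
count_ways(5, coins)   # 4: 5 | 2+2+1 | 2+1+1+1 | 1+1+1+1+1
count_ways(10, coins)  # 11
```

Looping over the coins in the outer loop is what makes this count combinations rather than ordered sequences of coins.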

Project Euler – Problem 15

November 8, 2011

Starting in the top left corner of a 2×2 grid, there are 6 routes (without backtracking) to the bottom right corner. How many routes are there through a 20×20 grid?

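One way to see where the 6 comes from: every route through an n×n grid is a sequence of 2n moves, n of them "down" and n of them "right", so counting routes means choosing which moves go down. A small R sketch of that counting argument follows; `lattice_routes` is a hypothetical name, and the linked post may well take a different approach.

```r
# Routes through an n x m grid moving only right or down: each route is a
# sequence of n + m moves of which n are "down", giving choose(n + m, n).
lattice_routes <- function(n, m = n) choose(n + m, n)

lattice_routes(2)   # 6, the 2×2 case quoted above
lattice_routes(20)  # the 20×20 grid from the question
```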

Project Euler – Problem 43

November 7, 2011

The number, 1406357289, is a 0 to 9 pandigital number because it is made up of each of the digits 0 to 9 in some order, but it also has a rather interesting sub-string divisibility property. Let d1 be the 1st digit, d2 be the 2nd digit, and so on. In this way, we note the...

ABC on wordpress

November 7, 2011

Erkan Buzbas sent me an email about his webpage (operated as a WordPress blog) on ABC. It contains different items of information on ABC research and a hopefully growing list of references. After Scott Sisson’s tweet on ABC_research (latest news: two ABC sessions at ISBA 2012, Kyoto), here comes another way to keep posted about...

Webinar Nov 17: What’s new in Revolution R Enterprise 5.0

November 7, 2011

Revolution R Enterprise 5.0 will be released soon, and Sue Ranney, VP of Development at Revolution Analytics, will host a webinar on Thursday, November 17 to get you up to speed on the latest features: Revolution R Enterprise 5.0 is Revolution Analytics’ scalable analytics platform. At its core is Revolution Analytics’ enhanced Distribution of R, the world’s most widely-used...

Coming out of the (Bayesian) closet: multivariate version

November 7, 2011

This week I’m facing my (and many other lecturers’) least favorite part of teaching: grading exams. In a supreme act of procrastination I will continue the previous post, and the antepenultimate one, showing the code for a bivariate analysis of a randomized …

Web Scraping Google URLs

November 7, 2011

Google slightly changed the HTML code it uses for hyperlinks on search pages last Thursday, causing one of my scripts to stop working. Thankfully, this is easily solved in R thanks to the XML package and the power and simplicity of XPath expressions: Lovely jubbly! P.S. I know that there is an API of...

Code Optimization: One R Problem, Eleven Solutions – Now Thirteen!

November 7, 2011

Following up on my previous post “Code Optimisation: One R Problem, Ten Solutions – Now Eleven!”, I figured out a twelfth solution after writing that blog post. Furthermore, halfway through writing this blog post I figured out a thirteenth solution too. As a recap, the problem is taken from rwiki, where the goal is to find...

Project Euler – Problem 41

November 7, 2011

We shall say that an n-digit number is pandigital if it makes use of all the digits 1 to n exactly once. For example, 2143 is a 4-digit pandigital and is also prime. What is the largest n-digit pandigital prime that exists?

Bayesian modeling using WinBUGS

November 6, 2011

Yes, yet another Bayesian textbook: Ioannis Ntzoufras’ Bayesian modeling using WinBUGS was published in 2009 and it got an honourable mention at the 2009 PROSE Awards. (Nice acronym for a book award! All the mathematics books awarded that year were actually statistics books.) Bayesian modeling using WinBUGS is rather similar to the more recent Bayesian...

Rcpp talk at Seattle RUG next month

November 6, 2011

The Seattle R User Group was kind enough to invite me to give a talk about R, C++ and Rcpp. So if you can make it to the Thomas building of the Fred Hutchinson Cancer Research Center in Seattle, WA, on December 7, I would love to see you there. I ha...

More colour wheels

November 5, 2011

In response to my post about colour wheels, I received a suggested enhancement from Drew. The idea is to first match colours based on the text provided and then add nearby colours. This can be done by ordering colours in terms of hue, saturation, and value. The result is a significant improvement and it will capture all of...

Kaplan-Meier Survival Plot – with at risk table

November 5, 2011

Credit for the bulk of this code goes to Abhijit Dasgupta and the commenters on the original post here from earlier this year. I have made a few changes to the functionality which I think warrant sharing. A brief …

Next Level Web Scraping

November 5, 2011

The outcome presented above will not be very useful to most of you. Still, it could be a good example of what can be done via web scraping in R. Background: TIRIS is the federal geo-statistical service of North-Tyrol, Austria. One of many g...

Vectors (CloudStat)

November 5, 2011

The simplest type of data object in R is a vector, which is simply an ordered set of values. Some further examples of creating vectors are shown below:

Input: 1:20
Output: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

This creates...

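For readers who want to try this at an R prompt, here are a few other standard ways of building vectors alongside `1:20`. These are base R functions, but the particular values are only illustrative and are not taken from the CloudStat post.

```r
x <- 1:20                         # integer sequence 1, 2, ..., 20
y <- c(2.5, 4, 10)                # combine individual values
z <- seq(0, 1, by = 0.25)         # 0.00 0.25 0.50 0.75 1.00
w <- rep(c("a", "b"), times = 2)  # "a" "b" "a" "b"

length(x)  # 20
class(x)   # "integer"
class(y)   # "numeric"
class(w)   # "character"
```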

#2 Data Classes (CloudStat)

November 5, 2011

As stated in the CloudStat Intro, CloudStat is based on the R language, an object-oriented language in which everything is an object. Each object has a class. The simplest data objects are one-dimensional arrays called vectors, consisting of any nu...

The Joy of R: A Feline Guide

November 5, 2011

Just because it’s caturday. Images by Mario Pineda-Krch (CC BY-NC-SA 3.0). This is from the “Mario’s Entangled Bank” blog (http://pineda-krch.com) of Mario Pineda-Krch, a theoretical biologist at the University of Alberta. Filed under: cats, computing, humour, R, Sweave

Colour wheels in R

November 5, 2011

Regular readers will know I use R to produce most of the charts that appear here on the blog. Being more quantitative than artistic, I find choosing colours to be one of the trickiest tasks when designing a chart, particularly as R has so many colours to choose from. In...

Data Referenced Journalism and the Media – Still a Long Way to Go Yet?

November 4, 2011

Reading our local weekly press this evening (the Isle of Wight County Press), I noticed a page 5 headline declaring “Alarm over death rates at St Mary’s”, St Mary’s being the local general hospital. It seems a Department of Health report on hospital mortality rates came out earlier this week, and the Island’s hospital, it...

Unit root versus breaking trend: Perron’s criticism

November 4, 2011

I came across an ingenious simulation by Perron during my time-series lecture which I thought was worth sharing. The idea is to put your model through a further test for a breaking trend before accepting the null hypothesis of a unit root. Let me try to illustrate this in simple language. A non-stationary time series is one that has its mean changing...

Generating PPC Keywords in R – Part 2

November 4, 2011

In a previous post, I discussed how to generate PPC keywords in R. In this post I will provide another example of how to perform this task. Let’s say that I am an auto insurance company that only operates in the state of Illinois. I’m planning on bidding on keywords in Bing and Google which...

Rdatamarket Tutorial

November 4, 2011

The good folks at DataMarket have posted a new tutorial on using the rdatamarket package (covered here in August) to easily download public data sets into R for analysis. The tutorial describes how to install the rdatamarket package, how to extract metadata for data sets, and how to download the data themselves into R. The tutorial also illustrates a...

match vs. %in%

November 4, 2011

match and %in% are two very commonly used functions in R. So, what's the difference between them? First, how to use them (copied from the R manual): match returns a vector of the positions of (first) matches of its first argument in its second. %in% is a ...

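A short example may make the difference easier to see. The vectors `x` and `lookup` below are made up for illustration; the last line relies on the fact that base R defines `%in%` in terms of `match`.

```r
x <- c("b", "z", "a")
lookup <- c("a", "b", "c", "b")

# match() gives the position of the first match in `lookup`, or NA.
match(x, lookup)   # 2 NA 1

# %in% only says whether each element of x occurs in `lookup` at all.
x %in% lookup      # TRUE FALSE TRUE

# %in% is equivalent to a match() call with nomatch = 0.
identical(x %in% lookup, match(x, lookup, nomatch = 0L) > 0L)  # TRUE
```

So `match` is the one to reach for when the positions are needed, for example to look up values in a second vector, while `%in%` is convenient for filtering.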