Monthly Archives: November 2011

Web Scraping Google Scholar (Partial Success)

November 8, 2011
By

I wanted to scrape the information returned by a Google Scholar web search into an R data frame as a quick XPath exercise. The following will successfully extract  the ‘title’, ‘url’ , ‘publication’ and ‘description’.  If any of these fields are not available, as in the case of a citation, the corresponding cell in the data

Read more »

Bridge and Torch problem in R

November 8, 2011
By
Bridge and Torch problem in R

A couple months ago I came across the bridge and torch problem at a careers fair in Oxford. A young tech company called QuBit used it as a brain teaser challenge for would be software engineers to solve before submitting … Continue reading →

Read more »

Example 9.13: Negative binomial regression with proc mcmc

November 8, 2011
By
Example 9.13: Negative binomial regression with proc mcmc

In practice, data that derive from counts rarely seem to be fit well by a Poisson model; one more flexible alternative is a negative binomial model. In this SAS-only entry, we discuss how proc mcmc can be used for estimation. An overview of support f...

Read more »

Blankety Blank

November 8, 2011
By
Blankety Blank

The erstwhile big 4 all blanked their opponents last Saturday and a poster on the Guardian wondered when was the previous occasion of such an occurrence. It’s a pretty simple procedure in SQL using a subquery, but in the spirit of learning R, I thought I would tackle the problem in that language, with the

Read more »

project euler – problem 47

November 8, 2011
By

The first two consecutive numbers to have two distinct prime factors are: 14 = 2 × 7 Read More: 278 Words Totally

Read more »

project euler – Problem 44

November 8, 2011
By

Pentagonal numbers are generated by the formula, Pn=n(3n−1)/2. The first ten pentagonal numbers are: 1, 5, 12, 22, 35, 51, 70, 92, 117, 145, ... Read More: 472 Words Totally

Read more »

Drawing polar centered spatial maps using ggplot2

November 8, 2011
By
Drawing polar centered spatial maps using ggplot2

Drawing maps of the polar regions can be done using square spatial maps. A small example says more than a thousand words: xlim = c(-180,180) ylim = c(60,90)   # Some fake grid data dat_grid = expand.grid(x = xlim[1]:xlim[2], y… See more ›

Read more »

The mystery of volatility estimates from daily versus monthly returns

November 8, 2011
By
The mystery of volatility estimates from daily versus monthly returns

What drives the estimates apart? Previously A post by Investment Performance Guy prompted “Variability of volatility estimates from daily data”. In my comments to the original post I suggested that using daily data to estimate volatility would be equivalent to using monthly data except with less variability.  Dave, the Investment Performance Guy, proposed the exquisitely … Continue reading...

Read more »

Doing away with “unknown timezone” warnings

November 8, 2011
By
Doing away with “unknown timezone” warnings

Timezone stuff can really drive you NUTS - at least if you’re sitting in front of a German Windows-Box This is what I used to do to set my tz: And I always wondered why R would throw “unknown timezone” warnings: Someday I found out that setting tz via `options()` was not enough as the … Continue reading...

Read more »

project euler – Problem 32

November 8, 2011
By

We shall say that an n-digit number is pandigital if it makes use of all the digits 1 to n exactly once; for example, the 5-digit number, 15234, is 1 through 5 pandigital. The product 7254 is unusual, as the identity, 39 × 186 = 7254, containing multiplicand, multiplier, and product is 1 through...

Read more »