Last week I went to the “Government Open Data Hack Day” (godhd on Twitter) in Birmingham (UK), organised by Gavin Broughton and James Catell. The idea was to get hold of local open data and try to make use of … Continue reading →

It is quite common in political science for researchers to run statistical models, find that a coefficient for a variable is not statistically significant, and then claim that the variable "has no effect." This is equivalent to proposing a research ...

I just returned from the useR! 2012 conference for developers and users of R. A theme common to many of the presentations was the integration of R-based statistical systems with other systems, be they other programming languages, web systems, or enterprise data systems. Some highlights for me were an update to Rserve that includes

Yet another full day working on Bayesian Core with Jean-Michel in Carnon… This morning, I ran along the canal for about an hour and at last saw some pink flamingos close enough to take pictures (if only to convince my daughter that there were flamingos in the area!). Then I worked full-time on the spatial

From Wiki: "... the bottom and top of the box are always the 25th and 75th percentile (the lower and upper quartiles, respectively), and the band near the middle of the box is always the 50th percentile (the median). But the ends of the whiskers can represent several possible alternative values..." In R's default boxplot{graphics} code, upper whisker =...
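To make the whisker rule concrete, here is a minimal sketch in R. It is not the post's own code: the sample vector `x` is made up, and the sketch assumes the default `coef = 1.5` used by `boxplot.stats()`. Note that R builds the box from Tukey's hinges (`fivenum()`), which can differ slightly from the quantile-based quartiles.

```r
# Hypothetical sample with one obvious outlier (not from the original post)
x <- c(2, 4, 4, 5, 6, 7, 8, 9, 10, 30)

# Lower and upper hinges, as used by boxplot.stats()
hinges <- fivenum(x)[c(2, 4)]
box_range <- diff(hinges)

# Fences sit 1.5 box lengths beyond the hinges (default coef = 1.5)
lower_fence <- hinges[1] - 1.5 * box_range
upper_fence <- hinges[2] + 1.5 * box_range

# Whiskers end at the most extreme data points still inside the fences
lower_whisker <- min(x[x >= lower_fence])
upper_whisker <- max(x[x <= upper_fence])

# Points beyond the fences are drawn individually as outliers
outliers <- x[x < lower_fence | x > upper_fence]

# Reference values computed by R itself, for comparison
ref <- boxplot.stats(x)
print(c(lower = lower_whisker, upper = upper_whisker))
print(ref$stats)
```

The first and last elements of `ref$stats` are the whisker ends that `boxplot()` draws, so they should agree with the hand-computed values.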

Just finished my last assignment for this week. IT’S THE WEEKEND, officially. Let me take a break and have a look at the tenth problem, another prime problem. There is no doubt that primes are at the center of number theory and fundamental … Continue reading →

The British government, through the Office for National Statistics (ONS), recently published its own R manual for analysing government survey data. Links to the PDF and MS Word versions of the manual, including the R syntax, are below. Note: The R syntax link is not working at the moment. I am contacting the ONS, hope

My girlfriend’s biological clock is ticking, and so we’ve started trying to spawn. Since I’m impatient, that has naturally led to questions like “how long will it take?”. If I were to believe everything on TV, the answer would be easy: have unprotected sex once and pregnancy is guaranteed. A more cynical me suggests that

Forgive me if you are already aware of this, but I found it quite alarming. I know that most code is interpreted by the computer in binary and we input in decimal, so problems can arise in conversion and with floating point. But the example I have below is so simple that it really surprised me. I was converting...
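The post's own example is cut off above, so the following is only a generic illustration of the same kind of surprise, not the author's case. It is a minimal sketch in R using two textbook values; the variable names are made up for the example.

```r
# 0.1, 0.2 and 0.3 have no exact binary representation,
# so the sum differs from 0.3 in the last bits
total <- 0.1 + 0.2
exact_match <- total == 0.3                  # exact comparison fails
close_match <- isTRUE(all.equal(total, 0.3)) # comparison with a tolerance

# Truncating conversions expose the same issue:
# (0.1 + 0.7) * 10 is stored as a value just below 8
scaled <- (0.1 + 0.7) * 10
truncated <- as.integer(scaled)  # as.integer() truncates toward zero
rounded <- as.integer(round(scaled))

# Show the stored values with more digits than the default print
print(format(total, digits = 17))
print(format(scaled, digits = 17))
print(c(truncated = truncated, rounded = rounded))
```

Comparing with `all.equal()` and rounding explicitly before converting to integer are the usual ways to sidestep this.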

Last week, Joe Rickert used R and four years of US Census data to create an image plot of the relative probabilities of being born on a given day of the year. Chris Mulligan also tackled this problem with R, but this time using 20 years of Census data from 1969 to 1988. Chris extracted the birthday frequencies using...

Where do these come from? Since most statistical packages calculate these estimates automatically, it is not unreasonable to think that many researchers using applied econometrics are unfamiliar with the exact details of their computation. For the purposes of illustration, I am going to estimate different standard errors from a basic linear regression model: , using the
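For readers who want to see where such numbers come from, here is a minimal sketch in R. The excerpt's own model and data are cut off, so this uses an assumed stand-in: the built-in `mtcars` data and the regression `mpg ~ wt`. It computes the classical standard errors by hand and, as one example of an alternative, White's heteroskedasticity-consistent (HC0) version.

```r
# Assumed example model (not the one from the original post)
fit <- lm(mpg ~ wt, data = mtcars)

X <- model.matrix(fit)   # n x k design matrix, including the intercept
e <- resid(fit)          # OLS residuals
n <- nrow(X)
k <- ncol(X)

# Classical: Var(b) = s^2 (X'X)^{-1}, with s^2 = e'e / (n - k)
XtX_inv <- solve(crossprod(X))
s2 <- sum(e^2) / (n - k)
se_classical <- sqrt(diag(s2 * XtX_inv))

# White HC0: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
# X * e scales row i of X by e[i], so crossprod() gives X' diag(e^2) X
meat <- crossprod(X * e)
se_hc0 <- sqrt(diag(XtX_inv %*% meat %*% XtX_inv))

# Compare with what summary() reports
se_reported <- summary(fit)$coefficients[, "Std. Error"]
print(cbind(classical = se_classical, hc0 = se_hc0, reported = se_reported))
```

The `classical` and `reported` columns should coincide, which shows that `summary()` is doing nothing more than the first formula.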

Just a quick note that I’ve posted the slides, code, and dataset from my useR 2012 talk. I’m having a great time here in Nashville and will write up a conference review soon, with links to the many excellent packages … Continue reading →

R is an incredibly powerful programming tool used by a large community of people who require an easy-to-use, lightweight, FREE, and fundamentally awesome statistics package! I have recently discovered that R can be used by geologists, whose IT skills are often pitied by their more computer-literate geophysicist colleagues. Long story short

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full June edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Revolution R Enterprise 6 Now Available! The latest release of Revolution R Enterprise brings...

Some interesting ggplot2 tutorials for the social sciences @ Code à la Mode

Many R packages/tools have come out recently for doing ecology and evolution. All of the below were described in Methods in Ecology and Evolution, except for spider, which came out in Molecular Ecology Resources. Here are some highlights. mvabund pap...

Is it just me, or does the performance of the foreach package with a doSNOW backend operating on a socket grid suck? Here at work, I am helping to set up a cluster of Windows machines for distributed R processing. We have lots of researchers runnin...

Here at work I've been in the business of developing webapps using R as the backend computational framework. The list of parts to get this running is pretty lightweight, just: R, Apache 2, and rApache. I'm not going to cover how to set these things up here...

I had such a blast presenting my tutorial on Rook yesterday. Thanks go out to all who attended! All the slides are online here and I’ll be updating my RookTutorial github project with all the great suggestions I got from the attendees. Also, check back soon as I’m planning more postings on Rook. Cheers!

Abstract: Various approaches exist to relate saturated hydraulic conductivity (Ks) to grain-size data. Most methods use a single grain-size parameter and hence omit the information encompassed by the entire grain-size distribution. This study compares two data-driven modelling methods (multiple linear regression and artificial neural networks) that use the entire grain-size distribution data as input for Ks prediction. Besides the predictive capacity of the methods,...