At the Joint Statistical Meetings (Aug 2011), accepting the Roger Herriot Award for Innovation in Federal Statistics, I tipped my hat to open-source software and three mentors. I use the software (R, OpenBUGS, and MediaWiki) every d...

I doubt that anyone would deny the importance of being able to reproduce one's econometric results. More importantly, other researchers should be able to reproduce our results, both (a) to verify that we've done what we said we did, and (b) to investigate the sensitivity of our results to the various choices we made (e.g., functional form of our model, choice...

A guest post by Paul Hiemstra. Fortran and C programmers often say that interpreted languages like R are nice and all, but lack speed. How fast something runs in R depends greatly on how it is implemented, i.e. which packages and functions one uses. A prime example, which shows up regularly on

Here is an email I received from Umberto: I have a doubt regarding the tempered transitions method you considered in your JASA article with Celeux and Hurn. On page 961 you detail the several steps for building a proposal for a given distribution by simulating through l tempered power densities. I am slightly confused regarding

The title says “things” but conferences are mainly about people. Some of it can be serendipitous. For example, one day I sat next to Jonathan Rougier at lunch because I had a question for him about climate models. When Jonathan left, I started a conversation with the person on my other side. That was most … Continue reading...

I’ll be giving a talk on Forecasting time series using R for the Melbourne Users of R Network (MelbURN) on Thursday 27 October 2011 at 6pm. I will look at the various facilities for time series forecasting available in R, concentrating on the forecast package. This package implements several automatic methods for forecasting time series

There's a new local R user group in Salt Lake City, based at the University of Utah. (There used to be another group in Salt Lake devoted to R/Weka/Processing, but it appears to now be defunct.) This new group has been meeting regularly for some time, and their next meeting, on September 9, will be devoted to short talks...

For once, here is a book review I wrote in French about the book Le logiciel R, written by Pierre Lafaye de Micheaux (Université de Montréal), Rémy Drouilhet (Université de Grenoble 2) and Benoît Liquet (Université de Bordeaux 2): This book, published by Springer (in the same series as Le Choix Bayesien), offers coverage

Just over two weeks ago, I invited readers to complete the Open Governance Index (OGI) Questionnaire regarding The R Project. The OGI evaluates several facets of governance in open source projects (OGI publication). The OGI questionnaire is reproduced below, and each question is linked from the table of useR responses. The table below presents the

In a tongue-in-cheek post at the Information Management blog, Steve Miller shares his "frustration" with R: package developers keep on releasing new functionality for R that makes his own work obsolete. For example, there's now pre-packaged functionality in R for enhanced dotplots, Economist-style graphics, additive regression models and more, which all obviate the need for Steve to implement such...

“It seems quite absurd to reject an EP-based approach, if the only alternative is an ABC approach based on summary statistics, which introduces a bias which seems both larger (according to our numerical examples) and more arbitrary, in the sense that in real-world applications one has little intuition and even less mathematical guidance on to

At the last Utah R Users group meeting I gave a presentation on data manipulation in R, and today I found, through the plyr mailing list, two commands that I was previously unaware of and that definitely deserve a mention: arrange and mutate.
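
As a rough illustration of what these two plyr functions do, here is a minimal sketch. The data frame and its column names are made up for the example, and the code assumes the plyr package is installed.

```r
library(plyr)  # assumes plyr is installed; data below is hypothetical

df <- data.frame(id = c("a", "b", "c"), x = c(3, 1, 2))

# arrange() reorders the rows of a data frame by one or more columns.
sorted_up   <- arrange(df, x)        # ascending by x
sorted_down <- arrange(df, desc(x))  # descending by x

# mutate() adds (or replaces) columns. Unlike base transform(), later
# expressions can refer to columns created earlier in the same call.
extended <- mutate(df, y = x * 2, z = y + 1)

print(sorted_up)
print(extended)
```

The main practical difference from `transform()` is visible in the last call: `z` is computed from `y`, which did not exist before the same `mutate()` call.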

User BobH asked on StackOverflow about accelerating path-dependent loops. He provided a simple example in which a vector gets filled conditional on the value of the preceding element. Simple to code, but hard to vectorise. By the time I saw that q...
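
To make the idea of a path-dependent loop concrete, here is a small sketch in base R. The update rule is hypothetical (it is not BobH's actual example): a running sum that resets whenever the previous element has exceeded a cap. Because each element depends on the one before it, the computation cannot be written as a simple element-wise vector operation.

```r
# Hypothetical update rule, chosen only for illustration:
# restart from the increment if the previous value exceeded the cap,
# otherwise keep accumulating.
step <- function(prev, incr, cap = 10) {
  if (prev > cap) incr else prev + incr
}

# Explicit loop over a preallocated vector (avoids growing it each step).
fill_loop <- function(incr, cap = 10) {
  out <- numeric(length(incr))
  out[1] <- incr[1]
  for (i in seq_along(incr)[-1]) {
    out[i] <- step(out[i - 1], incr[i], cap)
  }
  out
}

# The same recursion expressed as a left fold that keeps every
# intermediate value. This is more compact, not necessarily faster.
fill_reduce <- function(incr, cap = 10) {
  Reduce(function(prev, x) step(prev, x, cap), incr, accumulate = TRUE)
}

incr <- c(4, 5, 3, 2, 6, 1)
print(fill_loop(incr))
print(fill_reduce(incr))
```

Both versions still visit the elements one at a time; real speed-ups for this kind of recursion usually come from byte-compiling the loop or moving it into compiled code.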

There are only three known jokes about statistics in the whole universe, so to complete the trilogy (see here and here for the other two), listen up: Three statisticians are on a train journey to a conference, and they get chatting to three epidemiologists who are also going to the same place. The epidemiologists are

Time series data are widely seen in analytics. Some examples are stock indexes/prices, currency exchange rates and electrocardiograms (ECG). Traditional time series analysis focuses on smoothing, decomposition and forecasting, and there are many R functions and packages available for those … Continue reading →

The usual approach to testing software is to create a specific problem and see if the software gets the correct answer. Although this is very useful, there are problems with it: it is labor-intensive; it almost totally neglects to test the code that throws errors; there can be unconscious bias in the test cases created … Continue reading...

Hong Ooi talks about some of the more interesting projects that he has used R for in the last year. These include fitting models for mortgage loss given default, a Monte Carlo application for stress-testing loan portfolios (in combination with Excel an...

No doubt you've heard about the tyranny of the 9s in reference to computer system availability. You're probably also familiar with the phrase six sigma, either in the context of manufacturing process quality control or the improvement of business processes. As we discovered in the recent Guerrilla Data Analysis Techniques class, the two concepts are related.
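
One way to see a connection between the two ideas is to put both on the same scale, namely the fraction of time (or of units) that fails. The sketch below is an illustration under stated assumptions, not the derivation used in the class: it treats "n nines" as an unavailability of 10^-n, and converts a sigma level to a defect fraction using the conventional one-sided normal tail with a 1.5-sigma shift. The function names are made up for this example.

```r
# Downtime per (365-day) year implied by a number of nines,
# e.g. 3 nines = 99.9% availability = unavailability of 10^-3.
downtime_minutes_per_year <- function(nines) {
  10^(-nines) * 365 * 24 * 60
}

# Defect fraction for a sigma level. The 1.5-sigma shift is the usual
# six sigma convention and is an assumption here, not taken from the post.
defect_fraction <- function(sigma, shift = 1.5) {
  pnorm(-(sigma - shift))
}

# Express a sigma level as an equivalent (fractional) number of nines.
nines_equivalent <- function(sigma, shift = 1.5) {
  -log10(defect_fraction(sigma, shift))
}

print(downtime_minutes_per_year(3:5))  # minutes of downtime per year
print(defect_fraction(6) * 1e6)        # defects per million at six sigma
print(nines_equivalent(6))             # falls between five and six nines
```

Under these assumptions, six sigma corresponds to roughly 3.4 defects per million, which sits between five and six nines of availability.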