### Which functions in plyr do people use?

This is the question that Hadley Wickham recently set out to answer by asking frequent R and plyr users, in an online survey, how they use it. Once a decent number of ...

As I’ve discussed here before, there is a debate raging (ok, maybe not raging) about terms such as “data science”, “analytics”, “data mining”, and “big data”. What do they mean, how do they overlap, and perhaps most importantly, who are the people who work in these fields? Along with ...

Many thanks to all who participated in the survey about writing R package vignettes. Following my post last Thursday, the responses came in quickly in the evening and all day on Friday. Since Saturday the response rate has been decreasing constantly and ...

I am currently co-writing the vignette for the ChainLadder package and wonder what I should be focusing on. I have co-written the vignette of the googleVis package in the past and based it purely on what I thought would work. So, this is an experiment...

Just over two weeks ago, I invited readers to complete the Open Governance Index (OGI) Questionnaire regarding The R Project. The OGI evaluates several facets of governance in open source projects (OGI publication). The OGI questionnaire is reproduced below, and each question is linked from the table of useR responses. ...

With Jean-Michel Marin, Pierre Pudlo and Robin Ryder, we just completed a survey on the ABC methodology. It is now both arXived and submitted to Statistics and Computing. Rather interestingly, our first draft was written in Jean-Michel’s office in Montpellier by collating the ‘Og posts surveying new ABC papers! (...

The correlation coefficient is a measurement of correlation between two random variables. While its computation is straightforward, it is not readily applicable to non-parametric statistics.

Andrew Gelman wrote today about some erroneous U.S. Governor approval ratings, noting that the ratings for Janet Napolitano sum to 108%. In fact most of these ratings do not sum to 100%. I prepared a clean CSV file of the ratings, making use of R’s XML library and the readHTMLTable ...

When analyzing a questionnaire, one often wants to view the correlation between two or more Likert questionnaire items (for example: two ordered categorical vectors ranging from 1 to 5).
When dealing with several such Likert variables, a clear presentation of all the pairwise relations between our variables can be ...

Over a month ago, David Smith published a call for people to participate in the “Future of Open Source” Survey. 550 people (and me) took the survey, and today I got an e-mail with the news that the 2010 survey results have been analysed and were published in the “Future.Of.Open.Source ...

I guess this is not the number one post I would like to start with on this blog, but I feel the time is right for it (community-wise).
I’ll move on to the subject matter in a moment, but first a short intro: This blog is written by Tal ...

