The UK Guardian Data Blog has great visualizations on the topics of the day, along with specific references to the data sets and online resources in use. You can find out more about the origins and plans of this and related data sites in t...

I've uploaded version 0.2-1 of my bibtex package to CRAN. This release anticipates changes in R 2.12.0 and structures bibtex entries in objects of the new class bibentry. The release also fixes various parser and lexer bugs.

When writing a presentation we might want to use a bullet list to highlight some key points that might be lost if they are part of a large body of text. We can use the standard LaTeX environments for creating lists within a beamer presentation in a straightforward way. The bullet lists can

Now that the 2010 survey is over, you might be wondering what we can learn from the data when the aggregated results are published. For a good guide to the kinds of questions you'll be able to answer, take a look at StatJump, where you can see tables and charts of the results of the 2000 census: population data...

The 2010 Estivales have begun in Montpellier.

Andrew Gelman wrote today about some erroneous U.S. Governor approval ratings, noting that the ratings for Janet Napolitano sum to 108%. In fact most of these ratings do not sum to 100%. I prepared a clean CSV file of the ratings, making use of R’s XML package and the readHTMLTable function. The ratings data file

In math and economics, there is a long, proud history of placing imaginary prisoners into nasty, complicated scenarios. We have, of course, the classic Prisoner’s Dilemma, as well as 100 prisoners and a light bulb. Add to that list the focus of this post, 100 prisoners and 100 boxes. In this game, the warden places

At the moment I’m working on the implementation of full block designs (e.g., every member of group A rates each member of group B and vice versa; a typical example is speed dating, where every man rates each woman and vice versa). These designs can be analyzed with mixed-effects models, and now I’m a bit confused

We are making slow progress on the normal and regression chapters as we decided to write the package at the same time we revise the chapters… Jean-Michel transformed the variable selection and model choice R code of the regression chapter into generic functions that will fit within the package. I rewrote the section on testing,

Here's an interesting competition that may well lend itself to R: the IEEE International Conference on Data Mining is running a contest to find the best way of predicting traffic problems. There are three separate contests: Predicting congestion: a series of measurements from 10 selected road segments is given and the goal is to make short-term predictions of future...

A little piece of code dealing with the subsampling of matrices, in R. Useful if you want to use something akin to bootstrap, or just check the size of your sample with regard to various statistics.
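The post's own code is not reproduced in this excerpt, so here is only a minimal sketch of what subsampling a matrix could look like in base R. The helper name `subsample_matrix`, its arguments, and the random toy matrix are all assumptions made for illustration.

```r
# Hypothetical helper (not the original post's code): draw a random
# sub-matrix by sampling row and column indices.
subsample_matrix <- function(m, n_rows, n_cols = ncol(m), replace = FALSE) {
  stopifnot(is.matrix(m))
  rows <- sample.int(nrow(m), n_rows, replace = replace)
  cols <- sample.int(ncol(m), n_cols, replace = replace)
  # drop = FALSE keeps the result a matrix even for a single row or column
  m[rows, cols, drop = FALSE]
}

set.seed(1)
# Toy data: a 20 x 10 matrix of random normal values
m <- matrix(rnorm(200), nrow = 20, ncol = 10)

# One subsample of 5 rows and 4 columns
sub <- subsample_matrix(m, n_rows = 5, n_cols = 4)

# Repeat the subsampling to see how a statistic (here the mean)
# varies across subsamples of 10 rows
means <- replicate(100, mean(subsample_matrix(m, n_rows = 10)))
```

Setting `replace = TRUE` samples indices with replacement, which is closer to a bootstrap; the default draws each row and column at most once.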

Progress on the ggplot2 user interface is coming along. Please check out this VLOG which will give a good idea of where I currently am in the process. http://neolab.stat.ucla.edu/cranstats/vlog2.mov As always, comments and suggestions are welcome. If you would like to try it out yourself, you can install the development version of Deducer using install.packages("Deducer",,"http://www.rforge.net",type="source")

The presentations from April's successful R/Finance 2010 conference in Chicago are now available online. (Revolution Analytics is a proud sponsor of the conference.) There's some amazing content here for anyone looking for the cutting edge of financial engineering, with presentations from practitioners and researchers at institutions like Invesco Asset Management, Black Mesa Capital, and some of the leading academic institutions...

At the request of a commenter I just wanted to clarify that any code released here for R or anything else is free and open source unless specifically stated otherwise. The open source BSD license for any code on GGD can be found on this copyright page.

Following the discussion in the comments on the previous post, I thought about how to draw links going in several directions (i.e. there are no ‘clear’ differences between the levels, and species from level n can interact with species of level n, n+1, n-1, n±k, etc). This is now done, with a code

Jean-Michel Marin and I have thus started our “research in pair” at CIRM, Luminy, for a fortnight. We are working on the second edition of Bayesian Core and, despite working round the clock on the project (except for a one-hour run around Mont Puget this morning), we are not going as fast as planned…

In the last few weeks, I started focusing more and more on trophic systems with three levels or more. I wanted something to visualize the resulting trophic network, so I came up with a little R function called draw.mnet (which stands for draw a multiple-levels network).

New Zealand's Sunday Star Times last weekend featured a profile on Ross Ihaka, co-creator of R: The down-to-earth associate statistics professor and his fellow researcher Robert Gentleman are famous around the world for developing R programming – a "glorified calculator" that crunches data. R programming allows for statistical computing and graphics and is used by thousands of companies worldwide...

Hadley Wickham has announced that new versions of his popular grammar-of-graphics charting package ggplot2 and his general-purpose data reshaping tool plyr for R are now available. plyr boasts several new features, most notably a new join function which should simplify what can sometimes be a difficult process in R: merging two data sets. A simplified SQL-like terminology should make...
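For context, here is a rough sketch of the kind of merge being simplified, written with base R's `merge` rather than plyr so that it runs without extra packages. The two data frames and their values are made up for illustration, and the commented `plyr::join` line assumes plyr is installed.

```r
# Toy data frames (made-up values, for illustration only)
species <- data.frame(id = c(1, 2, 3),
                      name = c("a", "b", "c"),
                      stringsAsFactors = FALSE)
counts <- data.frame(id = c(2, 3, 4),
                     n = c(10, 20, 30))

# Inner join: keep only ids present in both data frames
inner <- merge(species, counts, by = "id")

# Left join: keep every row of `species`; unmatched rows get NA for `n`
left <- merge(species, counts, by = "id", all.x = TRUE)

# With plyr installed, a left join could be written as:
# plyr::join(species, counts, by = "id", type = "left")
```

In base R the join type is controlled by the `all`, `all.x` and `all.y` flags, whereas plyr's `join` takes a single `type` argument, which is presumably the SQL-like terminology the announcement refers to.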