Blog Archives

Exploring the functions in a package

January 26, 2012

Sometimes it can be useful to list all the functions inside a package. This is done in the same way that you would list variables in your workspace. That is, using ls. The syntax is ls(pos = "package:packagename"), which is easy enough if you can remember it. Unfortunately, I never can, and have to type

Read more »
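
As a rough illustration of the ls(pos = "package:packagename") idiom from the excerpt above, here is a minimal sketch. The helper name list_package_functions is hypothetical and not taken from the post, and it assumes the package is already attached to the search path.

```r
# Hypothetical helper (not from the original post): list the functions
# that an attached package makes visible on the search path.
list_package_functions <- function(pkg) {
  pos <- paste0("package:", pkg)
  # ls() lists every object in that search path entry, functions or not.
  nms <- ls(pos = pos)
  # Keep only the names that are bound to functions.
  is_fn <- vapply(
    nms,
    function(nm) is.function(get(nm, pos = pos)),
    logical(1)
  )
  nms[is_fn]
}

# stats is attached by default in a standard R session.
stats_fns <- list_package_functions("stats")
head(stats_fns)
```

For a package that is not attached, call library(packagename) first; otherwise the "package:packagename" entry does not exist on the search path and ls will throw an error.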

A quick primer on split-apply-combine problems

December 16, 2011

I’ve just answered my hundred billionth question on Stack Overflow that goes something like “I want to calculate some statistic for lots of different groups”. Although these questions provide a steady stream of easy points, it’s such a common and basic data analysis concept that I thought it would be useful to have a document

Read more »
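
The full post is not reproduced in this listing, so the following is only a sketch of what "calculate some statistic for lots of different groups" can look like in base R. The choice of the built-in mtcars dataset and of the mean as the statistic are assumptions made for illustration.

```r
# Illustrative data: the built-in mtcars dataset (an assumption, not from the post).

# Split: one vector of mpg values per number of cylinders.
mpg_by_cyl <- split(mtcars$mpg, mtcars$cyl)

# Apply and combine: compute the mean of each group and collect the
# results into a single named numeric vector.
mean_mpg <- sapply(mpg_by_cyl, mean)

# tapply performs the same split, apply, and combine steps in one call.
mean_mpg_tapply <- tapply(mtcars$mpg, mtcars$cyl, mean)

print(mean_mpg)
```

Both approaches return one value per group, labelled by the grouping variable; the explicit split and sapply version makes the three stages visible, while tapply is the more compact form.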

Interactive graphics for data analysis

September 1, 2011

I got a copy of Martin Theus and Simon Urbanek’s Interactive Graphics for Data Analysis a couple of years ago, and it has been sitting on my bookshelf ever since. As I’ve recently become a self-proclaimed expert on interactive graphics, I thought it was about time I read the thing. Which is exactly what I did last weekend

Read more »

Nomograms everywhere!

August 30, 2011

At useR!, Jonty Rougier talked about nomograms, a once-popular visualisation that has fallen by the wayside with the rise of computers. I’d seen a few before, but hadn’t understood how they worked or why you’d want to use them. Anyway, since that talk I’ve been digging around in biology books from the 60s and

Read more »

Anonymising data

August 23, 2011

There are only three known jokes about statistics in the whole universe, so to complete the trilogy (see here and here for the other two), listen up: Three statisticians are on a train journey to a conference, and they get chatting to three epidemiologists who are also going to the same place. The epidemiologists are

Read more »

More useless statistics

August 22, 2011

Over at the ExploringDataBlog, Ron Pearson just wrote a post about the cases when means are useless. In fact, it’s possible to calculate a whole load of stats on your data and still not really understand it. The canonical dataset for demonstrating this (spoiler alert: if you are doing an intro to stats course, you

Read more »

useR2011 highlights

August 18, 2011

useR has been exhilarating and exhausting. Now that it’s finished, I wanted to share my highlights. 10. My inner twelve-year-old schoolgirl swooning and fainting with excitement every time I chatted with a member of R-core. 9. Patrick Burns declaring that his company consists of himself and his two cats. And that one of the

Read more »

useR2011 Easy interactive ggplots talk

August 17, 2011

I’m talking tomorrow at useR! on making ggplots interactive with the gWidgets GUI framework. For those of you at useR, here is the code and data, so you can play along on your laptops. For everyone else, I’ll make the slides available in the next few days so you can see what you missed. Note

Read more »

Stop! (In the name of a sensible interface)

August 12, 2011

In my last post I talked about using the number of lines in a function as a guide to whether you need to break it down into smaller pieces. There are many other useful metrics for the complexity of a function, most notably cyclomatic complexity, which tracks the number of different routes that code can

Read more »