Articles by richierocks

Adding metadata to variables

January 6, 2012 | richierocks

There are only really two ways to preserve your statistical analyses. You either save the variables that you create, or you save the code that you used to create them. In general the latter is much preferred because at some point you’ll realise that your model was wrong, or ... [Read more...]

A quick primer on split-apply-combine problems

December 16, 2011 | richierocks

I’ve just answered my hundred billionth question on Stack Overflow that goes something like “I want to calculate some statistic for lots of different groups”. Although these questions provide a steady stream of easy points, it’s such a common and basic data analysis concept that I thought it would ... [Read more...]

Interactive graphics for data analysis

September 1, 2011 | richierocks

I got a copy of Martin Theus and Simon Urbanek’s Interactive Graphics for Data Analysis a couple of years ago, whence it’s been sat on my bookshelf. Since I’ve recently become a self-proclaimed expert on interactive graphics I thought it was about time I read the thing. ... [Read more...]

Nomograms everywhere!

August 30, 2011 | richierocks

At useR!, Jonty Rougier talked about nomograms, a once-popular visualisation that has fallen by the wayside with the rise of computers. I’d seen a few before, but hadn’t understood how they worked or why you’d want to use them. Anyway, since that talk I’ve been ... [Read more...]

Anonymising data

August 23, 2011 | richierocks

There are only three known jokes about statistics in the whole universe, so to complete the trilogy (see here and here for the other two), listen up: Three statisticians are on a train journey to a conference, and they get chatting to three epidemiologists who are also going to the ... [Read more...]

More useless statistics

August 22, 2011 | richierocks

Over at the ExploringDataBlog, Ron Pearson just wrote a post about the cases when means are useless. In fact, it’s possible to calculate a whole load of stats on your data and still not really understand it. The canonical dataset for demonstrating this (spoiler alert: if you are doing ... [Read more...]

useR2011 highlights

August 18, 2011 | richierocks

useR has been exhilarating and exhausting. Now it’s finished, I wanted to share my highlights. 10. My inner twelve-year-old schoolgirl swooning and fainting with excitement every time I chatted with a member of R-core. 9. Patrick Burns declaring that his company consists of himself and his two cats. And ... [Read more...]

useR2011 Easy interactive ggplots talk

August 17, 2011 | richierocks

I’m talking tomorrow at useR! on making ggplots interactive with the gWidgets GUI framework. For those of you at useR, here is the code and data, so you can play along on your laptops. For everyone else, I’ll make the slides available in the next few days so ... [Read more...]

Stop! (In the name of a sensible interface)

August 12, 2011 | richierocks

In my last post I talked about using the number of lines in a function as a guide to whether you need to break it down into smaller pieces. There are many other useful metrics for the complexity of a function, most notably cyclomatic complexity, which tracks the number of ... [Read more...]

Monster functions (Raaargh!)

August 12, 2011 | richierocks

It’s widely considered good programming practice to have lots of little functions rather than a few big functions. The reasons behind this are simple. When your program breaks, it’s much nicer to debug a five line function than a five hundred line function. Additionally, by breaking up your ... [Read more...]

The Stats Clinic

July 27, 2011 | richierocks

Here at HSL we have a lot of smart kinda-numerate people who have access to a lot of data. On a bad day, kinda-numerate includes myself, but in general I’m talking about scientists who have done an introductory stats course, but not much else. When all you have ... [Read more...]

The method in the mirror: reflection in R

July 17, 2011 | richierocks

Reflection is a programming concept that sounds scarier than it is. There are three related concepts that fall under the umbrella of reflection, and I’ll be surprised if you haven’t come across most of these code ideas already, even if you didn’t know it was called reflection. ... [Read more...]

Testing for valid variable names

July 3, 2011 | richierocks

I have something of a fondness for ridiculous variable names, so it’s useful to be able to check whether my latest concoction is legitimate. More so if it is automatically generated. Not having an is_valid_variable_name function is one of those odd omissions from R, and the assign ... [Read more...]

Tracking execution paths

June 18, 2011 | richierocks

Earlier this week, I was trying to figure out the path of execution through a big chunk of code. Once you reach a certain size of codebase, tracking which function gets called when can be tricky. My first thought for dealing with this was to add a message line at ... [Read more...]

A clock utility, via console hackery

May 11, 2011 | richierocks

A discussion on Stack Overflow today shows an interesting use of special characters inside the cat function. The most common special characters that you may have come across are the tab and newline characters, represented by \t and \n respectively. Try them for yourself. cat("Red\tlorry\nYellow\tlorry\n") cat ... [Read more...]

Friday Function: nclass

May 6, 2011 | richierocks

When you draw a histogram, an important question is “how many bars should I draw?”. This should inspire an indignant response. You didn’t become a programmer to answer questions, did you? No. The whole point of programming is to let your computer do your thinking for you, giving you ... [Read more...]

(Almost) Friday Function: alarm

April 21, 2011 | richierocks

Last week I decided to start a weekly column detailing an interesting function each Friday, entirely forgetting that I would be on holiday, without internet access (shock horror!), tomorrow. So here’s your column a little early. The alarm function is something of a novelty, in that all it does ... [Read more...]

supercalifragilisticexpialidocious = 1

April 21, 2011 | richierocks

I notice that the latest version of R has upped the maximum length of variable names from 256 characters to a whopping 10 000! (See ?name.) It makes the 63 character limit in MATLAB look rather pitiful by comparison. Come on MathWorks! Let’s have the ability to be stupidly verbose in our variable ... [Read more...]

Non-standard assignment with getSymbols

April 21, 2011 | richierocks

I recently came across a rather interesting investment blog, Timely Portfolio. I have a certain soft spot for that sort of thing, because using my data analysis skills to make a fortune is casually on my to-do list. This blog makes regular use of a function getSymbols in the quantmod ... [Read more...]
