Recreational Data

Posted on March 21, 2014 by Tony Hirst in R bloggers | 0 Comments

[This article was first published on OUseful.Info, the blog... » Rstats, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Part of my weekend ritual is to buy the weekend papers and have a go at the recreational maths problems that are Sudoku and Killer. I also look for news stories with a data angle that might prompt a bit of recreational data activity…

In a paper that may or may not have been presented at the First European Congress of Mathematics in Paris, July, 1992, Prof. David Singmaster reflected on “The Unreasonable Utility of Recreational Mathematics”.

To begin with, it is worth considering what is meant by recreational
mathematics.
…
First, recreational mathematics is mathematics that is fun and popular – that is, the problems should be understandable to the interested layman, though the solutions may be harder. (However, if the solution is too hard, this may shift the topic from recreational toward the serious – e.g. Fermat’s Last Theorem, the Four Colour Theorem or the Mandelbrot Set.)

Secondly, recreational mathematics is mathematics that is fun and used as either as a diversion from serious mathematics or as a way of making serious mathematics understandable or palatable. These are the pedagogic uses of recreational mathematics. They are already present in the oldest known mathematics and continue to the present day.
…
These two aspects of recreational mathematics – the popular and the pedagogic – overlap considerably and there is no clear boundary between them and “serious” mathematics.
…
How is recreational mathematics useful?

Firstly, recreational problems are often the basis of serious mathematics. The most obvious fields are probability and graph theory where popular problems have been a major (or the dominant) stimulus to the creation and evolution of the subject. …

Secondly, recreational mathematics has frequently turned up ideas of genuine but non-obvious utility. …

Anyone who has tried to do anything with “real world” data knows how much of a puzzle it can represent: from finding the data, to getting hold of it, to getting it into a state and a shape where you can actually work with it, to analysing it, charting it, looking for pattern and structure within it, having a conversation with it, getting it to tell you one of the many stories it may represent, there are tricks to be learned and problems to be solved. And they’re fun.

An obvious definition [of recreational mathematics] is that it is mathematics that is fun, but almost any mathematician will say that he enjoys his work, even if he is studying eigenvalues of elliptic differential operators, so this definition would encompass almost all mathematics and hence is too general. There are two, somewhat overlapping, definitions that cover most of what is meant by recreational mathematics.

…the two definitions described above.

So how might we define “recreational data”. For me, recreational data activities are, in who or in part, data investigations, involving one or more steps of the data lifecycle (discovery, acquisition, cleaning, analysis, visualisation, storytelling). They are the activities I engage in when I look for, or behind, the numbers that appear in a news story. They’re the stories I read on FullFact, or listen to on the OU/BBC co-pro More or Less; they’re at the heart of the beautiful little book that is The Tiger That Isn’t; recreational data is what I do in the “Diary of a Data Sleuth” posts on OpenLearn.

Recreational data is about the joy of trying to find stories in data.

Recreational data is, or can be, the data journalism you do for yourself or the sense you make of the stats in the sports pages.

Recreational data is a safe place to practice – I tinker with Twitter and formulate charts around Formula One. But remember this: “recreational problems are often the basis of serious [practice]“. The “work” I did around Visualising Twitter User Timeline Activity in R? I can (and do) reuse that code as the basis of other timeline analyses. The puzzle of plotting connected concepts on Wikipedia I described in Visualising Related Entries in Wikipedia Using Gephi? It’s a pattern I can keep on playing with.

If you think you might like to do some doodle of your own with some data, why not check out the School Of Data. Or watch out on OpenLearn for some follow up stories from the OU/BBC co-pro of Hans Rosling’s award winning Don’t Panic

To leave a comment for the author, please follow the link and comment on their blog: OUseful.Info, the blog... » Rstats.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Related

Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts. (You will not see this message again.)

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)