Merging Data Sets Based on Partially Matched Data Elements

September 26, 2012

A tweet from @coneee yesterday about merging two datasets using columns of data that don’t quite match got me wondering about a possible R recipe for handling partial matching. The data in question related to country names in a datafile that needed fusing with country names in a listing ... [Read more...]

More Dabblings With Local Sentencing Data

December 1, 2011

In Accessing and Visualising Sentencing Data for Local Courts I posted a couple of quick ways in to playing with Ministry of Justice sentencing data for the period July 2010-June 2011 at the local court level. At the end of the post, I wondered about how to wrangle the data in ... [Read more...]

Accessing and Visualising Sentencing Data for Local Courts

November 29, 2011

A recent provisional data release from the Ministry of Justice contains sentencing data from English(?) courts, at the offence level, for the period July 2010-June 2011: “Published for the first time every sentence handed down at each court in the country between July 2010 and June 2011, along with the age and ethnicity ... [Read more...]

Getting Started With Twitter Analysis in R

November 9, 2011

Earlier today, via the aggregating R-Bloggers service, I saw a post on Using Text Mining to Find Out What @RDataMining Tweets are About. The post provides a walkthrough of how to grab tweets into an R session using the twitteR library, and then do some text mining on ... [Read more...]

How Might Data Journalists Show Their Working? Sweave

November 1, 2011

If part of the role of data journalism is to make transparent the justification behind claims that are, or aren’t, backed up by data, there’s good reason to suppose that the journalists should be able to back up their own data-based claims with evidence about how they made ... [Read more...]

Power Tools for Aspiring Data Journalists: R

October 31, 2011

Picking up on Paul Bradshaw’s post A quick exercise for aspiring data journalists which hints at how you can use Google Spreadsheets to grab – and explore – a mortality dataset highlighted by Ben Goldacre in DIY statistical analysis: experience the thrill of touching real data, I thought I’d describe ... [Read more...]

Using Google Spreadsheets as a Database Source for R

September 2, 2011

I couldn’t contain myself (other more pressing things to do, but…), so I just took a quick time out and a coffee to put together a quick and dirty R function that will let me run queries over Google spreadsheet data sources and essentially treat them as database tables (... [Read more...]

