Posts Tagged ‘ onlinejournalismblog ’

Merging Data Sets Based on Partially Matched Data Elements

September 26, 2012

A tweet from @coneee yesterday about merging two datasets using columns of data that don’t quite match got me wondering about a possible R recipe for handling partial matching. The data in question related to country names in a datafile that needed fusing with country names in a listing of ISO country codes. The original

Read more »

Interest Differencing: Folk Commonly Followed by Tweeting MPs of Different Parties

July 6, 2012

Earlier this year I doodled a recipe for comparing the folk commonly followed by users of a couple of BBC programme hashtags (Social Media Interest Maps of Newsnight and BBCQT Twitterers). Prompted in part by a tweet from Michael Smethurst/@fantasticlife about generating an ESP map for UK politicians (something I’ve also doodled before – Sketching

Read more »

More Dabblings With Local Sentencing Data

December 1, 2011

In Accessing and Visualising Sentencing Data for Local Courts I posted a couple of quick ways in to playing with Ministry of Justice sentencing data for the period July 2010-June 2011 at the local court level. At the end of the post, I wondered about how to wrangle the data in R so that I

Read more »

Accessing and Visualising Sentencing Data for Local Courts

November 29, 2011

A recent provisional data release from the Ministry of Justice contains sentencing data from English(?) courts, at the offence level, for the period July 2010-June 2011: “Published for the first time every sentence handed down at each court in the country between July 2010 and June 2011, along with the age and ethnicity of each

Read more »

Getting Started With Twitter Analysis in R

November 9, 2011

Earlier today, via the aggregating R-Bloggers service, I saw a post on Using Text Mining to Find Out What @RDataMining Tweets are About. The post provides a walkthrough of how to grab tweets into an R session using the twitteR library, and then do some text mining on them. I’ve been meaning to

Read more »

Data Referenced Journalism and the Media – Still a Long Way to Go Yet?

November 4, 2011

Reading our local weekly press this evening (the Isle of Wight County Press), I noticed a page 5 headline declaring “Alarm over death rates at St Mary’s”, St Mary’s being the local general hospital. It seems a Department of Health report on hospital mortality rates came out earlier this week, and the Island’s hospital, it

Read more »

How Might Data Journalists Show Their Working? Sweave

November 1, 2011

If part of the role of data journalism is to make transparent the justification behind claims that are, or aren’t, backed up by data, there’s good reason to suppose that the journalists should be able to back up their own data-based claims with evidence about how they made use of the data. Posting links to

Read more »

Power Tools for Aspiring Data Journalists: R

October 31, 2011

Picking up on Paul Bradshaw’s post A quick exercise for aspiring data journalists which hints at how you can use Google Spreadsheets to grab – and explore – a mortality dataset highlighted by Ben Goldacre in DIY statistical analysis: experience the thrill of touching real data, I thought I’d describe a quick way of analysing

Read more »

Using Google Spreadsheets as a Database Source for R

September 2, 2011

I couldn’t contain myself (other more pressing things to do, but…), so I just took a quick time out and a coffee to put together a quick and dirty R function that will let me run queries over Google spreadsheet data sources and essentially treat them as database tables (e.g. Using Google Spreadsheets as a

Read more »

Merging Two Different Datasets Containing a Common Column With R and R-Studio

August 2, 2011

Another way for the database challenged (such as myself!) to merge two datasets that share at least one common column… This recipe uses the cross-platform stats analysis package, R. I use R via the R-Studio client, which provides an IDE wrapper around the R environment. So for example, here’s how to merge a couple of

Read more »