Blog Archives

Negative Payments in Local Spending Data

August 17, 2013

In anticipation of a new R library from School of Data data diva @mihi_tr that will wrap the OpenSpending API and provide access to OpenSpending.org data directly from within R, I thought I’d start doodling around some ideas raised in Identifying Pieces in the Spending Data Jigsaw. In particular, common payment values, repayments/refunds and “balanced

Read more »

Generating Sankey Diagrams from rCharts

July 23, 2013

A couple of weeks or so ago, I picked up an inlink from an OCLC blog post about Visualizing Network Flows: Library Inter-lending. The post made use of Sankey diagrams to represent borrowing flows, and by implication suggested that the creation of such diagrams is not as easy as it could be… Around the same

Read more »

Generating Alerts From Guardian University Tables Data

June 23, 2013

One of the things I’ve been pondering with respect to the whole data journalism process is how journalists without a lot of statistical training can quickly get a feel for whether there may be interesting story leads in a dataset, or how they might be able to fashion “alerts” that bring attention to data elements

Read more »

Disposable Visual Data Explorers with Shiny – Guardian University Tables 2014

June 21, 2013

Have data – now what? Building your own interactive data explorer need not be a chore with the R shiny library… Here’s a quick walkthrough… In Datagrabbing Commonly Formatted Sheets from a Google Spreadsheet – Guardian 2014 University Guide Data, I showed how to grab some data from several dozen commonly formatted sheets in a

Read more »

Datagrabbing Commonly Formatted Sheets from a Google Spreadsheet – Guardian 2014 University Guide Data

June 20, 2013

So it seems like it’s that time of year when the Guardian publish their university rankings data (Datablog: University guide 2014), which means another opportunity to have a tinker and see what I’ve learned since last year… (Last year’s hack was a Filtering Guardian University Data Every Which Way You Can…, where I had a

Read more »

Evaluating Event Impact Through Social Media Follower Histories, With Possible Relevance to cMOOC Learning Analytics

April 21, 2013

Last year I sat on a couple of panels organised by I’m a Scientist’s Shane McCracken at various science communication conferences. A couple of days ago, I noticed Shane had popped up a post asking Who are you Twitter?, a quick review of a social media mapping exercise carried out on the followers of the

Read more »

Estimated Follower Accession Charts for Twitter

April 5, 2013

Just over a year or so ago, Mat Morrison/@mediaczar introduced me to a visualisation he’d been working on (How should Page Admins deal with Flame Wars?) that I started to refer to as an accession chart (Visualising Activity Around a Twitter Hashtag or Search Term Using R). The idea is that we provide each entrant

Read more »

Splitting a Large CSV File into Separate Smaller Files Based on Values Within a Specific Column

April 3, 2013

One of the problems with working with data files containing tens of thousands (or more) rows is that they can become unwieldy, if not impossible, to use with “everyday” desktop tools. When I was Revisiting MPs’ Expenses, the expenses data I downloaded from IPSA (the Independent Parliamentary Standards Authority) came in one large CSV file

Read more »

Revisiting MPs’ Expenses

April 2, 2013

I couldn’t but notice the chatter about Iain Duncan Smith claiming he’d have no problem “living on 53 pounds a week”, which made me wonder not only how many meal catered events he attends each week (and how many of his scheduled meetings also have complimentary tea and biscuits (a bellwether of the extent of

Read more »

Publishing Stats for Analytic Reuse – FAOStat Website and R Package

March 8, 2013

How can stats and data publishers, from NGOs and (inter)national statistics agencies to scientific researchers, publish their data in a way that supports its analysis directly, as well as in combination with other datasets? Here’s one approach I learned about from Michael Kao of the UN Food and Agriculture Organisation statistics division, FAOStat. At first

Read more »