Show me the data! Or how to digitize plots

February 27, 2012

(This article was first published on mages' blog, and kindly contributed to R-bloggers)

I mentioned the Guardian’s data blog, and the need for more data journalism, in an earlier post. What I particularly like about the Guardian’s approach is that they share the data behind their articles and encourage readers to use it.

Of course there are perfectly valid reasons for displaying only a chart and not making the underlying data available. For example, you may want to generate leads, as potential customers may get in touch to ask for the underlying data, or technology issues may prevent you from uploading data.

I personally believe that when I show a chart I should also make the underlying data available. Pretty pictures get you attention, but the underlying data offers you an opportunity to engage with your reader on a different level. This is similar to open source software: in most cases users don’t want to read the code, but knowing that they could lends it more credibility.

Screenshot of plot digitizer using Guy Carpenter’s global property catastrophe rate on line index

Here is another reason why I should make the data available: it is easy to extract the data from a chart anyhow, thanks to digitizing software like the Java application plot digitizer. While in the past I may have used graph paper and a ruler, nowadays it only takes a few minutes to extract the information.
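To illustrate why this is so quick, here is a minimal Python sketch of the kind of calculation a digitizing tool might perform. It assumes linear axes and that you have clicked two known reference points on each axis. The function names, calibration values and pixel coordinates below are all hypothetical and are not taken from plot digitizer itself.

```python
def make_axis_mapper(pixel_a, value_a, pixel_b, value_b):
    """Return a function that maps a pixel coordinate to a data value.

    Assumes a linear axis calibrated by two reference points:
    pixel_a corresponds to value_a and pixel_b to value_b.
    """
    if pixel_a == pixel_b:
        raise ValueError("calibration pixels must differ")

    def to_value(pixel):
        # Linear interpolation (or extrapolation) between the two references
        return value_a + (pixel - pixel_a) * (value_b - value_a) / (pixel_b - pixel_a)

    return to_value


def digitize(points, x_calib, y_calib):
    """Convert clicked (pixel_x, pixel_y) pairs into (x, y) data values."""
    to_x = make_axis_mapper(*x_calib)
    to_y = make_axis_mapper(*y_calib)
    return [(to_x(px), to_y(py)) for px, py in points]


# Hypothetical calibration: pixel column 100 is the year 1990, column 500 is 2010
x_calib = (100, 1990.0, 500, 2010.0)
# Pixel rows grow downwards in images: row 400 is index value 0, row 50 is 350
y_calib = (400, 0.0, 50, 350.0)

# Hypothetical pixel positions clicked along the plotted line
clicked = [(100, 300), (300, 200), (500, 50)]
data = digitize(clicked, x_calib, y_calib)
print(data)
```

A logarithmic axis would need the same interpolation applied to the logarithm of the reference values, which this sketch does not cover.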
