Converting plots to data

January 18, 2014

(This article was first published on Wiekvoet, and kindly contributed to R-bloggers)

A problem that comes up every so often in applied work: you have a plot, but you want the data. At least two programs can help here: PlotDigitizer and Engauge Digitizer. I have both on my openSUSE machine. Both are available for Windows; for Mac there are only older versions of Engauge.

I tried these programs on a relatively simple problem. I saw a plot in a book and wanted to calculate that line myself. So I took my camera, photographed the plot and got to work.



Engauge Digitizer

Engauge has been around for quite a while. It has many features, but looks a bit outdated. It was not able to import my original figure (2992*2992 pixels, 694 KB), but had no problems after I resized it to 500*500 pixels (55.9 KB).
It is clearly the program that can handle the more exotic plots. For me it is not intuitive. For instance, it took me quite some time to figure out how to export the results. Initially I copy-pasted the results to a spreadsheet; later I managed to create a .csv after all. Engauge comes with a manual, so everything can be resolved. Engauge can also detect points automatically. To use that, it is probably best to crop the figure as much as possible, since Engauge has no qualms about finding points in text, black blobs, axis labels and such. Automatic detection would probably work better in a colored plot, and there are some settings to guide it.
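
Once Engauge has written a .csv, getting it into R takes little work. The lines below are only a minimal sketch: the file is a temporary stand-in, and both the header names (x, Curve1) and the values are made up for illustration, so check them against your own export.

```r
# Stand-in for a file exported from Engauge; the header names and the
# values are invented for illustration only.
csv_file <- tempfile(fileext = ".csv")
writeLines(c("x,Curve1",
             "0.0,1.02",
             "0.5,1.48",
             "1.0,2.01"), csv_file)

# Read the export; check.names = FALSE keeps the header text unchanged
engauge <- read.csv(csv_file, check.names = FALSE)

# Rename the columns to x and y for further work
names(engauge) <- c("x", "y")
str(engauge)
```

If the export uses a different separator, the sep argument of read.csv can be adjusted accordingly.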

PlotDigitizer

PlotDigitizer looks much more modern. It had no problems with the large photo, except that it could not scale the photo down enough to fit on the screen. The interface allows adding, removing and moving points manually. It is also possible to trace a line on screen, and it will add the points it detects along that line. PlotDigitizer exports to .xml, and the results can also be copy-pasted. While I see the advantage of a file that includes documentation, it would also be nice to get the data out of the file more directly.

The file I got needed some extra processing before I had the data.frame.
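library(XML)
# parse the PlotDigitizer export and turn the tree into a nested list
mytree <- xmlTreeParse('test12.xml')
mylist <- xmlToList(mytree)
# skip the first three entries, which precede the points in this file
mylist2 <- mylist[4:length(mylist)]
# stack the remaining entries as rows of one matrix
mydf <- do.call(rbind, mylist2)
# dx and dy are the columns with the digitized values
convert <- data.frame(x = as.numeric(mydf[, 'dx']),
                      y = as.numeric(mydf[, 'dy']))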
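
The code above relies on the points starting at the fourth entry of the list. An alternative that does not depend on position is to select the point elements by name with XPath. This is only a sketch under assumptions: the miniature file below is made up, and the element names (data, image, point) are guesses; only the dx and dy attributes are taken from the code above. Inspect your own .xml before using it.

```r
library(XML)

# Made-up miniature of an export file. The element names are assumptions;
# only the dx and dy attribute names come from the code above.
xml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<data>
  <image file="plot.jpg"/>
  <point n="0" x="101" y="402" dx="0.0" dy="1.02"/>
  <point n="1" x="205" y="310" dx="0.5" dy="1.48"/>
  <point n="2" x="309" y="198" dx="1.0" dy="2.01"/>
</data>'

doc <- xmlParse(xml_text, asText = TRUE)

# Select every <point> element by name, wherever it sits in the file,
# and pull out one attribute at a time as a character vector
dx <- xpathSApply(doc, "//point", xmlGetAttr, "dx")
dy <- xpathSApply(doc, "//point", xmlGetAttr, "dy")

convert2 <- data.frame(x = as.numeric(dx), y = as.numeric(dy))
```

For a real file, pass the file name to xmlParse and drop the asText argument.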

Conclusion

The programs complement each other. Engauge is great for automated extraction and complex plots, but it is not so easy for occasional use. PlotDigitizer is easy to use and great if you want to select your points manually.
