Blog Archives

When Documents Become Databases – Tabulizer R Wrapper for Tabula PDF Table Extractor

May 2, 2016

Although not necessarily the best way of publishing data, data tables in PDF documents can often be extracted quite easily, particularly if the tables are regular and the cell contents reasonably spaced. For example, official timing sheets for F1 races are published by the FIA as event and timing information in a set of PDF

First Thoughts on Automatically Generating Accessible Text Descriptions of ggplot Charts in R

April 29, 2016

In a course team accessibility briefing last week, Richard Walker briefly mentioned a tool for automatically generating text descriptions of Statistics Canada charts to support accessibility. On further probing, the tool, created by Leo Ferres, turned out to be called iGraph-Lite: … an extensible system that generates natural language descriptions of statistical graphs, particularly those

Accessing a Neo4j Graph Database Server from RStudio and Jupyter R Notebooks Using Docker Containers

April 12, 2016

In Getting Started With the Neo4j Graph Database – Linking Neo4j and Jupyter SciPy Docker Containers Using Docker Compose I posted a recipe demonstrating how to link a Jupyter notebook container with a neo4j container to provide a quick way to get up and running with neo4j from a Python environment. It struck me that

Visualising F1 Stint Strategies

April 6, 2016

With the new F1 season upon us, I’ve started tinkering with bits of code from the Wrangling F1 Data With R book and looking at the data in some new ways. For example, I started wondering whether we might be able to learn something interesting about the race strategies by looking at laptimes on a

Another Route to Jupyter Notebooks – Azure Machine Learning

March 31, 2016

In much the same way that the IBM DataScientist Workbench seeks to provide some level of integration between analysis tools such as Jupyter notebooks and data access and storage, Azure Machine Learning studio also provides a suite of tools for accessing and working with data in one location. Microsoft’s offering is new to me, but

New Version of “Wrangling F1 Data With R” Just Released…

February 5, 2016

So I finally got round to pushing a revised (and typo corrected!) version of Wrangling F1 Data With R: A Data Junkie’s Guide, that also includes a handful of new sections and chapters, including descriptions of how to detect undercuts, the new style race history chart that shows the on-track position of each driver for

Using Jupyter Notebooks to Define Literate APIs

February 2, 2016

Part of the vision behind the Jupyter notebook ecosystem seems to be the desire to create a literate computing infrastructure that supports “the weaving of a narrative directly into a live computation, interleaving text with code and results to construct a complete piece that relies equally on the textual explanations and the computational components” (Fernando

The Rise of Transparent Data Journalism – The BuzzFeed Tennis Match Fixing Data Analysis Notebook

January 18, 2016

The news today was led in part by a story broken by the BBC and BuzzFeed News – The Tennis Racket – about match fixing in Grand Slam tennis tournaments. (The BBC contribution seems to have been done under the ever listenable File on Four: Tennis: Game, Set and Fix?) One interesting feature of this

IBM DataScientistWorkBench = OpenRefine + RStudio + Jupyter Notebooks in the Cloud, Via Your Browser

December 18, 2015

One of the many things on my “to do” list is to put together a blogged script that wires together RStudio, Jupyter notebook server, Shiny server, OpenRefine, PostgreSQL and MongoDB containers, and perhaps data extraction services like Apache Tika or Tabula and a few OpenRefine style reconciliation services, along with a common shared data container,

RStudio Clone for Python – Rodeo

December 17, 2015

So have you been looking for something like RStudio, but for Python? It’s been out for some time, but a recently updated release of Rodeo gives an increasingly workable RStudio-like environment for Python users. The layout resembles the RStudio layout – file editor top left, interactive console bottom left, variable inspector and history top right,
