The jsonlite package implements a robust, high-performance JSON parser and generator for R, optimized for statistical data and the web. This week version 0.9.13 appeared on CRAN, the third release in a relatively short period to focus on performance optimization.
Fast number formatting
Versions 0.9.11 and 0.9.12 had already introduced major speedups by porting critical bottlenecks to C code and switching to a better JSON parser. The current release focuses on number formatting and incorporates C code from modp_numtoa, which is several times faster than sprintf for converting doubles and integers to strings (your mileage may vary depending on platform and precision).
    library(ggplot2)
    nrow(diamonds)
    # [1] 53940

    system.time(jsonlite::toJSON(diamonds, dataframe = "row"))
    #    user  system elapsed
    #   0.319   0.007   0.325

    system.time(jsonlite::toJSON(diamonds, dataframe = "col"))
    #    user  system elapsed
    #   0.073   0.002   0.075
Using the same benchmark as in previous posts, the time to convert the diamonds data to row-based JSON has gone down from 0.619s to 0.325s on my machine (about a 2x speedup over jsonlite 0.9.12), and converting to column-based JSON has gone down from 0.330s to 0.075s (about a 4x speedup).
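To get a feel for how much of the encoding time goes into number formatting, the following is a minimal sketch that formats a numeric vector by hand with sprintf and compares it against jsonlite::toJSON on the same values. The vector x, its length, the helper sprintf_json and the choice of 4 digits are all assumptions made for illustration. The sketch does not call modp_numtoa directly, and the timings will differ by machine.

```r
library(jsonlite)

# Hypothetical input: 100,000 random doubles (size chosen arbitrarily)
set.seed(1)
x <- runif(1e5)

# Hand-rolled baseline (illustrative helper, not part of jsonlite):
# format every number with sprintf, then join into a JSON array
sprintf_json <- function(v, digits = 4) {
  fmt <- paste0("%.", digits, "f")
  paste0("[", paste(sprintf(fmt, v), collapse = ","), "]")
}

# Time both approaches; absolute numbers depend on platform and precision
system.time(json_a <- sprintf_json(x, digits = 4))
system.time(json_b <- toJSON(x, digits = 4))

# Both strings should parse back to the same values, up to rounding
all.equal(fromJSON(json_a), fromJSON(json_b), tolerance = 1e-4)
```

Raising or lowering digits changes how much work each number requires, which is one reason the observed speedup varies with precision.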
Comparing to other JSON packages
When comparing JSON packages, it should be noted that the comparison is never entirely fair, because different packages use different settings and defaults for missing values, number of digits, etc. Both rjson and RJSONIO only support the column-based format for encoding data frames. Using their default settings:
    system.time(rjson::toJSON(diamonds))
    #    user  system elapsed
    #   0.279   0.004   0.281

    system.time(RJSONIO::toJSON(diamonds))
    #    user  system elapsed
    #   0.918   0.027   0.944
For this particular dataset, jsonlite is about 3.5x faster than rjson and about 12x faster than RJSONIO (on my machine) to generate column-based JSON. These differences are relatively large because 7 out of the 10 columns in the diamonds dataset are numeric.
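For readers unfamiliar with the two encodings compared above, here is a small sketch of what row-based and column-based output look like in jsonlite. The two-row data frame df is a made-up stand-in for a larger dataset such as diamonds. The last line computes the share of numeric columns, which is the quantity that determines how much faster number formatting can help.

```r
library(jsonlite)

# Hypothetical two-row data frame standing in for a larger dataset
df <- data.frame(
  carat = c(0.23, 0.21),
  cut = c("Ideal", "Premium"),
  stringsAsFactors = FALSE
)

# Row-based: an array of records, one JSON object per row
rows_json <- toJSON(df, dataframe = "rows")
rows_json
# [{"carat":0.23,"cut":"Ideal"},{"carat":0.21,"cut":"Premium"}]

# Column-based: a single JSON object with one array per column
cols_json <- toJSON(df, dataframe = "columns")
cols_json
# {"carat":[0.23,0.21],"cut":["Ideal","Premium"]}

# Fraction of columns that are numeric (here 1 out of 2)
numeric_share <- mean(vapply(df, is.numeric, logical(1)))
numeric_share
# [1] 0.5
```

The column-based form writes each field name once rather than once per row, which is consistent with it being the faster of the two encodings in the timings above.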