jsonlite 0.9.22: distinguish between double and integer

[This article was first published on OpenCPU, and kindly contributed to R-bloggers.]

Today a new version of the jsonlite package was released to CRAN. This update includes a few internal enhancements and one new feature.

Doubles vs integers

The new always_decimal parameter forces doubles to be formatted in decimal notation, that is, with at least one digit to the right of the decimal point. This makes it possible to distinguish them from integers, should you need to.

    x <- 1:5
    y <- as.numeric(x)
    (json_x <- jsonlite::toJSON(x, always_decimal = TRUE))
    # [1,2,3,4,5]

    (json_y <- jsonlite::toJSON(y, always_decimal = TRUE))
    # [1.0,2.0,3.0,4.0,5.0]

Doubles formatted this way are naturally parsed back into doubles, so we can round-trip numbers between R and JSON without losing their type:

    identical(x, jsonlite::fromJSON(json_x))
    # TRUE

    identical(y, jsonlite::fromJSON(json_y))
    # TRUE
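
For contrast, here is a small sketch of what happens to whole-valued doubles when always_decimal is left at its default. The variable names (json_default, back_default and so on) are made up for illustration, and the snippet assumes jsonlite 0.9.22 or later is installed:

```r
library(jsonlite)

y <- as.numeric(1:5)   # whole-valued doubles

# Default: whole-valued doubles are written without a decimal part
json_default <- toJSON(y)
# With always_decimal = TRUE the decimal part is kept
json_decimal <- toJSON(y, always_decimal = TRUE)

back_default <- fromJSON(json_default)
back_decimal <- fromJSON(json_decimal)

typeof(back_default)
# "integer"
typeof(back_decimal)
# "double"

identical(y, back_default)
# FALSE
identical(y, back_decimal)
# TRUE
```

Without the parameter the values survive the round trip but the storage type does not, which is the gap this feature closes.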

You should only use this if you really need it. The JSON format itself does not specify number types, so there is no guarantee that an arbitrary JSON parser will distinguish between integers and doubles. Indeed, many JSON parsers simply parse every number into a double, which is entirely correct as well.
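
Because the formatting is only a hint, code that consumes JSON from other producers may prefer to coerce types explicitly instead of relying on a decimal point being present. A minimal sketch, where the field names count and weight and the sample payload are hypothetical:

```r
library(jsonlite)

# JSON as another producer might emit it: no decimal part on whole numbers
incoming <- '{"count": [1, 2, 3], "weight": [1, 2, 3]}'
parsed <- fromJSON(incoming)

# Coerce each field to the type the application expects,
# regardless of how the numbers happened to be written
parsed$count  <- as.integer(parsed$count)
parsed$weight <- as.numeric(parsed$weight)

str(parsed)
```

This keeps the reading side correct even when the writing side is not jsonlite with always_decimal = TRUE.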

Also, setting always_decimal = TRUE introduces some performance overhead.
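
If the overhead matters for your workload, one rough way to gauge it is to time both variants on data of a realistic size. The sketch below uses an arbitrary vector of 100,000 random doubles (a made-up test case, not a benchmark from the release); the timings depend on your machine and jsonlite version, so none are shown here:

```r
library(jsonlite)

set.seed(1)
big <- runif(1e5)   # hypothetical test vector

# Time serialization with and without always_decimal
t_default <- system.time(json_a <- toJSON(big))
t_decimal <- system.time(json_b <- toJSON(big, always_decimal = TRUE))

print(t_default[["elapsed"]])
print(t_decimal[["elapsed"]])
```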

Numbers in MongoDB and Mongolite

The main motivation for this feature was inserting data from R into MongoDB using the mongolite package. Several mongolite users had asked for number types to be retained, especially when the data is later read back from MongoDB in a strongly typed language such as C++.

The latest version of mongolite automatically takes advantage of this feature:

    # Get latest mongolite
    devtools::install_github("jeroenooms/mongolite")

    # Assuming you have a local `mongod` running
    library(mongolite)
    df <- data.frame(x = 1:5, y = as.numeric(1:5))
    m <- mongo("testnum")
    m$insert(df)
    out <- m$find()
    identical(out, df)
    # TRUE

This makes it even more seamless to use MongoDB as a backend for storing data frames in R!
