Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

You may know that MonetDBLite was removed from CRAN.
DuckDB comming up.

# Breaking change

People who based their works on MonetDBLite may ask what happened, what to do. Not to play a risky game with database and tools choices for future works… (“It’s really fast but we may waste some time if we have to replace it by another solution”).

It’s the game with open source. Remember big changes in dplyr 0.7.
Sometimes we want better tools, and most of the time they become better. It’s really great.
And sometimes we don’t have time and energy to adapt our work to tools that became better in a too iterative way. Or in a too subjective way.
We want it to work, not break.
Keeping code as simple as possible (and avoid nebulous dependencies, so, tidy?) is one of the key point.
Stocking data in a database is another one.

All that we can say is that “we’re walking on works in progress”. Like number of eggshells, more works in progress here probably means more breaking changes.

Works in progress for packages, also for (embedded) databases!

# From Monet to Duck

MonetDBLite philosophy is to be like a “very very fast SQLite”. But it’s time for change (or it seems to be).
Then we can thanks MonetDBLite developers as it was a nice adventure to play/work with MonetDB speed!
As a question, is there another person, some volunteers, possibilities to maintain MonetDBLite (somewhere a nice tool)?
There are not so many informations for the moment about what happened and that’s why I write this post.

Here, I read that they are now working on a new solution, under MIT License, named DuckDB, see here for more details.

As I’m just a R user and haven’t collaborate to the project, I would just say for short: DuckDB takes good parts from SQLite and PostGreSQL (Parser), see here for complete list, it looks promising. As in MonetDB, philosophy is focused on columns and speed. And dates for instance are handled correctly, not having to convert them in “ISO-8601 – like” character strings.

It can be called from C/C++, Python and R.

Here is a post about python binding.

I also put a link at the bottom of this page which give some explanations about the name of this new tool and DuckDB developers point’s of view about data manipulation and storage1.

# Beginning with duckDB in R

### Create / connect to the db




### Put some rows and columns in db

Sampling it to get more rows, then duplicating columns, two time.

Write in db

It take some times compared to MonetDBLite (no benchmark here, I just run this several times and it was consistent).

The three are pretty fast.
Most importantly if queries are fast, and they are, most of the time we’re allwright.

I want to say here that’s for now it’s a work in progress, we have to wait more communication from DuckDB developers. I just write this to share the news.

# Dates

Compared to SQLite it handles dates/times correctly. No need to convert in character.

# Some querying

Running some queries

### dplyr

It already works nicely with dplyr.

### sql

Run query as a string

Like for all data sources with DBI, if the query is more complex, we can write it comfortably in an external file and launch it like this for example:

# “Little” benchmarks

### Collecting this big data frame

This has no sense but give some idea of read speed. We collect df_test in memory, from duckdb, monetdb and sqlite.

Really good !

### Simple count

Count then collect aggregate rows.

Faster !

It remains to test joins, filters, sorts, etc.

# Informations

I find that there are not so many communications for the moment about this work and binding for R, so I made this post to highlight it.

MonetDBLite speed is amazing, do you will give DuckDB a try ?

In any case thanks to DuckDB developers and welcome to the new duck.

See here https://github.com/cwida/duckdb.

DuckDB developers point’s of view on data management and explanations about “duck” can be found here.

Here we can read more informations on ALTREP, MonetDBLite and DuckDB, and reasons why MonetDB was finally abandoned (“RIP MonetDBLite”).