# Announcing rquery

**R – Win-Vector Blog**

We are excited to announce the `rquery` `R` package.

`rquery` is Win-Vector LLC’s big data query tool for `R`, currently in development.

`rquery` supplies a set of operators inspired by Edgar F. Codd’s relational algebra (updated to reflect lessons learned from working with `R`, `SQL`, and `dplyr` at big data scale in production).

As an example: `rquery` operators allow us to write our earlier “treatment and control” example as follows.

    dQ <- d %.>%
      extend_se(.,
                if_else_block(
                  testexpr = "rand()>=0.5",
                  thenexprs = qae(a_1 := 'treatment',
                                  a_2 := 'control'),
                  elseexprs = qae(a_1 := 'control',
                                  a_2 := 'treatment'))) %.>%
      select_columns(., c("rowNum", "a_1", "a_2"))

`rquery` pipelines are first-class objects, so we can extend them, save them, and even print them.

    cat(format(dQ))

    table('d') %.>%
     extend(.,
      ifebtest_1 := rand() >= 0.5) %.>%
     extend(.,
      a_1 := ifelse(ifebtest_1,"treatment",a_1),
      a_2 := ifelse(ifebtest_1,"control",a_2)) %.>%
     extend(.,
      a_1 := ifelse(!( ifebtest_1 ),"control",a_1),
      a_2 := ifelse(!( ifebtest_1 ),"treatment",a_2)) %.>%
     select_columns(., rowNum, a_1, a_2)

`rquery` targets only databases, and right now primarily `SparkSQL` and `PostgreSQL`. `rquery` is primarily a `SQL` generator, allowing it to avoid some of the trade-offs required to directly support in-memory `data.frame`s. We demonstrate converting the above `rquery` pipeline into `SQL` and executing it here.
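To make the "`SQL` generator" idea concrete, here is a minimal sketch of turning a small pipeline into `SQL` and running it. It rests on several assumptions that are not part of this announcement: it uses function names from later CRAN releases of `rquery` (`rq_copy_to()`, `extend()`, `to_sql()`), which may differ from the development version described here; it uses an in-memory `RSQLite` database as a stand-in for `SparkSQL` or `PostgreSQL`; and the table `d` with columns `rowNum` and `x` is hypothetical example data.

```r
library(rquery)  # assumed to attach wrapr, which supplies the %.>% pipe

# Assumption: in-memory SQLite stands in for a SparkSQL/PostgreSQL connection.
db <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical example data, copied to the database as table "d".
d_local <- data.frame(rowNum = 1:5, x = c(1, 2, 3, 4, 5))
d <- rq_copy_to(db, "d", d_local, temporary = TRUE, overwrite = TRUE)

# Build the operator pipeline; nothing is executed on the database yet.
ops <- d %.>%
  extend(., y := x * 2) %.>%
  select_columns(., c("rowNum", "y"))

# Translate the pipeline into a SQL string for this connection.
sql <- to_sql(ops, db)
cat(sql)

# Hand the generated SQL to the database and fetch the result.
res <- DBI::dbGetQuery(db, sql)
print(res)

DBI::dbDisconnect(db)
```

The pipeline `ops` is only a description of the work. The database does the computation when the generated `SQL` is submitted, which is why the same pipeline object can be printed, saved, or pointed at a different connection.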

`rquery` itself is still in early development (and not yet ready for extensive use in production), but it is maturing fast, and we expect more `rquery` announcements going forward. Our current intent is to bring in sponsors, partners, and `R` community voices to help develop and steer `rquery`.


