# CouchDB and R

October 2, 2010
Here are some quick crib notes on getting R talking to CouchDB using Couch's ReSTful HTTP API. We'll do it in two different ways. First, we'll construct HTTP calls with RCurl, then move on to the R4CouchDB package for a higher-level interface. I'll assume you've already gotten started with CouchDB and are familiar with the basic ReST actions: GET, PUT, POST, and DELETE.

First, install RCurl and RJSONIO. You'll have to download the tar.gz files if you're on a Mac. For the second part, we'll need to install R4CouchDB, which depends on the previous two. I checked it out from GitHub and used R CMD INSTALL.

### ReST with RCurl

#### Ping server

getURL("http://localhost:5984/")
[1] "{\"couchdb\":\"Welcome\",\"version\":\"1.0.1\"}\n"


That's nice, but we want to get the result back as a real R data structure. Try this:

welcome <- fromJSON(getURL("http://localhost:5984/"))
)
reader$value()

#### GET

Now that there's something in there, how do we get it back? That's super easy.

bozo2 <- fromJSON(getURL("http://localhost:5984/testing123/bozo"))
bozo2
$_id
[1] "bozo"

$_rev
[1] "1-646331b58ee010e8df39b5874b196c02"

$name
[1] "Bozo"

$occupation
[1] "clown"

$shoe.size
[1] 100


#### PUT again for updating

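Updating is done by using PUT on an existing document. The request body must carry the document's current _rev, which bozo2 already has because it came from a GET. For example, let's give Bozo some mad skillz: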

getURL(
"http://localhost:5984/testing123/bozo",
customrequest="PUT",
postfields=toJSON(bozo2))
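
The PUT above sends bozo2 back as it was fetched, so the step that actually adds the skills is not shown. Here is a minimal sketch of that step. The helper names `merge_fields` and `update_doc` are made up for illustration, and the two skills are placeholder values borrowed from the R4CouchDB update later in these notes. `update_doc` is only defined, not called, since it needs RCurl, RJSONIO, and a running CouchDB.

```r
# Merge new or changed fields into a document fetched from CouchDB.
# Fields that CouchDB manages (_id, _rev) are carried over untouched.
merge_fields <- function(doc, changes) {
  # CouchDB rejects an update to an existing document that lacks its current _rev
  stopifnot(!is.null(doc[["_rev"]]))
  modifyList(doc, changes)
}

# Hypothetical wrapper: fetch the document, merge the changes, PUT it back.
# Packages are referenced with :: so that defining the function needs nothing installed.
update_doc <- function(url, changes) {
  doc <- as.list(RJSONIO::fromJSON(RCurl::getURL(url)))
  RJSONIO::fromJSON(RCurl::getURL(
    url,
    customrequest = "PUT",
    postfields = RJSONIO::toJSON(merge_fields(doc, changes))))
}

# Offline demonstration using the document shown in the GET section.
bozo2 <- list(
  `_id` = "bozo",
  `_rev` = "1-646331b58ee010e8df39b5874b196c02",
  name = "Bozo",
  occupation = "clown",
  shoe.size = 100)
updated <- merge_fields(bozo2, list(skills = c("pranks", "honking nose")))
names(updated)
```

With a server running, the same change would be something like update_doc("http://localhost:5984/testing123/bozo", list(skills = c("pranks", "honking nose"))).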


#### POST

If you POST to the database, you're adding a document and letting CouchDB assign its _id field.

bender = list(
name='Bender',
occupation='bending',
species='robot')
response <- fromJSON(getURL(
'http://localhost:5984/testing123/',
customrequest='POST',
postfields=toJSON(bender)))
response
$ok
[1] TRUE

$id
[1] "2700b1428455d2d822f855e5fc0013fb"

$rev
[1] "1-d6ab7a690acd3204e0839e1aac01ec7a"

#### DELETE

For DELETE, you pass the doc's revision number in the query string. Sorry, Bender.

response <- fromJSON(getURL("http://localhost:5984/testing123/2700b1428455d2d822f855e5fc0013fb?rev=1-d6ab7a690acd3204e0839e1aac01ec7a", customrequest="DELETE"))

### CRUD with R4CouchDB

R4CouchDB provides a layer on top of the techniques we've just described.

R4CouchDB uses a slightly strange idiom. You pass a cdb object, really just a list of parameters, into every R4CouchDB call, and every call returns that object again, maybe modified. Results are returned in cdb$res. Maybe they did this because R uses pass by value. Here's how you would initialize the object.

cdb <- cdbIni()
cdb$serverName <- "localhost"
cdb$port <- 5984
cdb$DBName="testing123"

#### Create

fake.data <- list(
state='WA',
population=6664195,
state.bird='Lady GaGa')
cdb$dataList <- fake.data
cdb$id <- 'fake.data'  ## optional, otherwise an ID is generated
cdb <- cdbAddDoc(cdb)
cdb$res
$ok
[1] TRUE

$id
[1] "fake.data"

$rev
[1] "1-14bc025a194e310e79ac20127507185f"

#### Read

cdb$id <- 'bozo'
cdb <- cdbGetDoc(cdb)

bozo <- cdb$res
bozo
$_id
[1] "bozo"
... etc.
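
The habit of passing cdb in and assigning the return value back to cdb follows from R's copy semantics: a function works on its own copy of a list, so anything it stores there is lost unless the caller keeps the returned object. Here is a small sketch with a plain list and a made-up function, `fake_call`, standing in for an R4CouchDB call. No package or server is needed.

```r
# Stand-in for an R4CouchDB-style call: stores a result on its local copy of cdb
# and returns that copy.
fake_call <- function(cdb) {
  cdb$res <- paste("fetched", cdb$id)
  cdb
}

cdb <- list(id = "bozo", res = NULL)

invisible(fake_call(cdb))  # return value discarded
is.null(cdb$res)           # TRUE: the caller's copy is unchanged

cdb <- fake_call(cdb)      # reassigning keeps the result
cdb$res                    # "fetched bozo"
```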


#### Update

First we take the document id and rev from the existing document. Then, save our revised document back to the DB.

cdb$id <- bozo[["_id"]]    ## names starting with an underscore need [[ ]] or backticks
cdb$rev <- bozo[["_rev"]]
bozo = list(
name="Bozo",
occupation="assassin",
shoe.size=100,
skills=c(
'pranks',
'honking nose',
'kung fu',
'high explosives',
'sniper',
'lock picking',
'safe cracking'))
cdb$dataList <- bozo
cdb <- cdbUpdateDoc(cdb)


#### Delete

Shortly thereafter, Bozo mysteriously disappeared.

cdb$id <- 'bozo'  ## the replacement bozo list carries no _id field
cdb <- cdbDeleteDoc(cdb)


### More on ReST and CouchDB

• One issue you'll probably run into is that JSON has no representation for NaN or Infinity. And, of course, only R knows about NAs.
• One-off ReST calls are easy using curl from the command line, as described in REST-esting with cURL.
• I flailed about quite a bit trying to figure out the best way to do HTTP with R.
• I originally thought R4CouchDB was part of a Google Summer of Code project to support NoSQL DBs in R. Dirk Eddelbuettel clarified that R4CouchDB was developed independently. In any case, the schema-less approach fits nicely with R's philosophy of exploratory data analysis.
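
On the NaN, Infinity, and NA point above, one possible workaround is to swap those values for string tags before a document is serialized. This is only a sketch in base R: `encode_special` is a made-up helper, and the tags ("NaN", "Inf", "-Inf", "NA") are an arbitrary convention that whatever reads the document back would have to agree on. It only looks at numeric values; logical and character NAs pass through untouched.

```r
# Replace numeric values that JSON cannot represent with string tags.
# Lists are walked recursively. A numeric vector containing a special value
# becomes a list so that ordinary numbers keep their numeric type.
encode_special <- function(x) {
  if (is.list(x)) {
    return(lapply(x, encode_special))
  }
  if (is.numeric(x) && any(!is.finite(x))) {
    tagged <- as.character(x)               # "NaN", "Inf", "-Inf", NA for NA
    tagged[is.na(x) & !is.nan(x)] <- "NA"   # give true NAs an explicit tag
    out <- lapply(seq_along(x), function(i) {
      if (is.finite(x[i])) x[i] else tagged[i]
    })
    # keep scalars as scalars so they do not serialize as one-element arrays
    return(if (length(out) == 1) out[[1]] else out)
  }
  x
}

doc <- list(name = "Bozo", shoe.size = Inf, scores = c(1, NaN, NA))
safe_doc <- encode_special(doc)
str(safe_doc)
```

The result can then be handed to toJSON() as in the examples above. Decoding the tags on the way back out is left to the reader.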