clientsdb – A Docker image with client comments

[This article was first published on Colin Fay, and kindly contributed to R-bloggers].

Have you ever looked for a ready-to-use database while running a training session? Search no more: this Docker image ships a client review database loaded into PostgreSQL, to be used for teaching.

About the dataset

The titles and comments are extracted from this Google Drive link, which contains “amazon_review_full_csv.tar.gz” and which I discovered on the Amazon review database Kaggle page. The two columns date and name were then randomly generated in R.

Here is the code used to generate the full table:

library(data.table)

# Read the raw reviews; the file has no header row
dataset <- fread("data/train.csv", header = FALSE, sep = ",")
names(dataset) <- c("score", "title", "comment")

# One random "first last" name per row
nms <- paste(
  sample(charlatan:::person_en_us$first_names, nrow(dataset), TRUE),
  sample(charlatan:::person_en_us$last_names, nrow(dataset), TRUE)
)

# One random timestamp per row, between 1970-01-01 and 2010-01-01
date <- sample(0:as.numeric(as.POSIXct("2010-01-01")), nrow(dataset), TRUE)
date <- as.POSIXct(date, origin = "1970-01-01")

# Update the table by reference
dataset[
  , `:=`(
    score = NULL,
    name = nms,
    date = date
  )
]

data.table::fwrite(dataset, "datasetwithusers.csv")
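To make the date step easier to follow, here is a small base R sketch of the same idea on five values: draw whole seconds between the Unix epoch and 2010-01-01, then convert them to POSIXct. The seed, the sample size, the explicit UTC time zone, the variable names, and the use of sample.int() (which avoids building the full 0:upper vector) are assumptions added for illustration; they are not part of the original script.

```r
# Illustrative only: the same sampling idea as the script above, on 5 values.
set.seed(42)  # assumed seed, only for reproducibility

# Upper bound: seconds between 1970-01-01 and 2010-01-01 (UTC assumed)
upper <- as.numeric(as.POSIXct("2010-01-01", tz = "UTC"))

# Draw 5 offsets in [0, upper], with replacement
secs <- sample.int(upper + 1, 5, replace = TRUE) - 1

# Convert seconds since the epoch back to date-times
toy_dates <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

range(toy_dates)
```

Every value falls between 1970 and 2010, which is why the query results further down show dates such as 1976 or 1989.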

Launch and use

The main purpose of this image is to provide a “real life” tool for teaching database use.

Info:

- the POSTGRES_DB used is clients
- the POSTGRES_PASSWORD is verysecretwow
- the POSTGRES_USER is superduperuser

To launch the db, do:

# Might take some time to warm up
docker run --rm -d -p 5432:5432 --name clientsdb colinfay/clientsdb:latest
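Because the database may need a moment to warm up, a script that connects right after docker run can fail. One possible way to handle this is to poll until a connection succeeds. The sketch below is an assumption, not part of the image or the original post: wait_until() and db_ready() are hypothetical helpers, and db_ready() assumes the DBI and RPostgres packages are installed.

```r
# Hypothetical helper: call `check()` until it returns TRUE
# or the number of attempts is exhausted.
wait_until <- function(check, attempts = 30, delay = 1) {
  for (i in seq_len(attempts)) {
    if (isTRUE(check())) {
      return(TRUE)
    }
    Sys.sleep(delay)  # pause before the next attempt
  }
  FALSE
}

# Hypothetical readiness check, using the credentials listed above.
# DBI::dbCanConnect() returns TRUE when a connection can be opened.
db_ready <- function() {
  DBI::dbCanConnect(
    RPostgres::Postgres(),  # assumed driver
    dbname = "clients",
    host = "localhost",
    port = 5432,
    user = "superduperuser",
    password = "verysecretwow"
  )
}

# With the container started, one would call:
# wait_until(db_ready)
```

The polling logic is kept separate from the database check so that it can be tried out with any function returning TRUE or FALSE.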

Then, for example from R:


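library(DBI)

con <- dbConnect(
  RPostgres::Postgres(),  # driver assumed here; provided by the RPostgres package
  dbname = 'clients',
  host = 'localhost',
  port = 5432,
  user = 'superduperuser',
  password = 'verysecretwow'
)

# List the tables available in the database
dbListTables(con)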

[1] "clients"

res <- dbSendQuery(con, "SELECT score, title, name, date FROM clients LIMIT 5")
dbFetch(res)

  score                               title             name       date
1     5                      Fantastic Item   Karolyn Wunsch 2003-01-02
2     4            Easy to Use, Great Value  Wayland Langosh 2008-06-06
3     4                      It works well!  Fed Oberbrunner 1989-07-22
4     3                        Meets a need        Lora Yost 1976-04-13
5     4 Ion TTUSB Turntable with USB Record Valentina Harvey 1986-01-28


dbClearResult(res)

res <- dbSendQuery(con, "SELECT title, name, date FROM clients WHERE date = '1998-05-12' LIMIT 10")
dbFetch(res)

                                title             name       date
1                      underrated art  Burgess Kuhlman 1998-05-12
2                      Great product!   Madelyn Bailey 1998-05-12
3         Disappointing, not correct.      Scott Walsh 1998-05-12
4             The turtle says it all! Magdalen Strosin 1998-05-12
5            I took the TEAS and NLN.    Rosie Bradtke 1998-05-12
6                           Terrible.   Lita Marquardt 1998-05-12
7               Harlan County History   Clemon Effertz 1998-05-12
8  A flawed book, but a good subject.  Amey Rutherford 1998-05-12
9                   Worth every penny   Debrah Keebler 1998-05-12
10             A light enjoyable read Dwaine Schneider 1998-05-12
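The date in the last query is written directly into the SQL string. When the value comes from user input, for instance a learner typing a date, one option is to let DBI quote it. This is a minimal sketch, assuming the DBI package is installed. DBI::ANSI() stands in for a live connection so the snippet runs without the container, and the placeholder name day is made up for the example.

```r
library(DBI)

# SQL template with a named placeholder (?day is a hypothetical name)
template <- "SELECT title, name, date FROM clients WHERE date = ?day LIMIT 10"

# ANSI() is a dummy connection; against the running db, pass `con` instead
query <- sqlInterpolate(ANSI(), template, day = "1998-05-12")

query
```

The resulting string can then be sent with dbGetQuery(con, query), which runs the query and fetches the rows in one call.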



Once you are done, clear the result and close the connection:

dbClearResult(res)
dbDisconnect(con)

And then stop the db:

docker stop clientsdb 
