Statistics Sunday: What Should I Read Next?

September 16, 2018

(This article was first published on Deeply Trivial, and kindly contributed to R-bloggers)

When You Need a New Book to Read

I log all of my books on Goodreads. On top of that, whenever I hear about a new book I have to read, I add it on Goodreads, so I remember it. Of course, this means my Goodreads bookshelves are a little out of control. Fortunately, I can use R to dig through my Goodreads to-read shelf and figure out the next book to read and/or buy.

If you’re on Goodreads, you can easily download your entire bookshelf, including your to-read books, by going to “My Books” then clicking “Import and Export”. On the right side of the screen will be a link for “Export Library”. Click that and give it a minute (or several). Soon, a link will appear to download your entire library in a CSV file. You can then bring that into R.

If I’m ever stuck for the next book to read, I can use this file to randomly select a book from my to-read list to check out next. Because I own a lot of books on my to-read list, I’d like to filter that dataset to only include books I own. (Note: You can add books to your “owned” list by clicking on “My Books” then “Owned Books” to select which books you already have in your library. Otherwise, you can keep running the sample function until you get a book you already own or have ready access to. In that case, you’d just want to drop the `Owned Copies` condition from the “reading_list” filter below, so you sample from the whole to-read shelf.)

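# Point R at the folder that holds the Goodreads export
setwd("~/Dropbox")
library(tidyverse)
# Read the exported library; the first row of the CSV holds the column names
books <- read_csv("goodreads_library_export.csv", col_names = TRUE)
# Keep books I own one copy of and have not read yet
reading_list <- books %>%
  filter(`Owned Copies` == 1, `Exclusive Shelf` == "to-read")
head(reading_list)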
## # A tibble: 6 x 31
## `Book Id` Title Author `Author l-f` `Additional Auth… ISBN ISBN13
##
## 1 27877138 It Stephe… King, Steph… 1501… 9.78e12
## 2 10611 The Eye… Stephe… King, Steph… 0751… 9.78e12
## 3 11570 Dreamca… Stephe… King, Steph… William Olivier … 2226… 9.78e12
## 4 36452674 The Squ… Kevin … Hearne, Kev… NA
## 5 38193271 Bickeri… Mildre… Abbott, Mil… NA
## 6 20873740 Sapiens… Yuval … Harari, Yuv… NA
## # ... with 24 more variables: `My Rating` , `Average Rating` ,
## # Publisher , Binding , `Number of Pages` , `Year
## # Published` , `Original Publication Year` , `Date
## # Read` , `Date Added` , Bookshelves , `Bookshelves
## # with positions` , `Exclusive Shelf` , `My Review` ,
## # Spoiler , `Private Notes` , `Read Count` , `Recommended
## # For` , `Recommended By` , `Owned Copies` , `Original
## # Purchase Date` , `Original Purchase Location` ,
## # Condition , `Condition Description` , BCID
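Before sampling, a quick tally of shelf by ownership shows how many rows each filter will keep. The sketch below is only an illustration: `toy_books` is a made-up data frame (the titles and counts are not from my export) that borrows the two column names from the Goodreads file. It also shows one thing worth checking, assuming your library could contain a book with more than one owned copy: `== 1` would leave that book out, while `>= 1` keeps it.

```r
# Toy stand-in for the Goodreads export; only the column names match the real file.
toy_books <- data.frame(
  Title = c("Book A", "Book B", "Book C", "Book D", "Book E"),
  `Owned Copies` = c(1, 0, 2, 1, 0),
  `Exclusive Shelf` = c("to-read", "to-read", "to-read", "read", "to-read"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Cross-tabulate shelf against number of owned copies.
shelf_counts <- table(
  shelf = toy_books$`Exclusive Shelf`,
  owned = toy_books$`Owned Copies`
)
print(shelf_counts)

# Compare the two ownership conditions on the to-read shelf.
is_to_read <- toy_books$`Exclusive Shelf` == "to-read"
exactly_one <- toy_books[is_to_read & toy_books$`Owned Copies` == 1, ]
at_least_one <- toy_books[is_to_read & toy_books$`Owned Copies` >= 1, ]
nrow(exactly_one)
nrow(at_least_one)
```

On the real data, the same calls would take `books` in place of `toy_books`.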

Now I have a data frame of books that I own and have not read. This data frame contains 55 books. Drawing a random sample of 1 book is quite easy.

# seq_len() avoids the 1:0 pitfall if the filtered list is ever empty
reading_list[sample(seq_len(nrow(reading_list)), 1), ]
## # A tibble: 1 x 31
## `Book Id` Title Author `Author l-f` `Additional Auth… ISBN ISBN13
##
## 1 14201 Jonatha… Susann… Clarke, Susa… 0765… 9.78e12
## # ... with 24 more variables: `My Rating` , `Average Rating` ,
## # Publisher , Binding , `Number of Pages` , `Year
## # Published` , `Original Publication Year` , `Date
## # Read` , `Date Added` , Bookshelves , `Bookshelves
## # with positions` , `Exclusive Shelf` , `My Review` ,
## # Spoiler , `Private Notes` , `Read Count` , `Recommended
## # For` , `Recommended By` , `Owned Copies` , `Original
## # Purchase Date` , `Original Purchase Location` ,
## # Condition , `Condition Description` , BCID

According to this random sample, the next book I should read is Jonathan Strange & Mr Norrell. Now if I’m ever stuck for a book to read, I can use this code to find one. And if I’m in a bookstore, picking up something new – as is often the case, since bookstores are one of my happy places – I can update the code to tell me which book I should buy next.

# Same idea, but for to-read books I don't own yet
to_buy <- books %>%
  filter(`Owned Copies` == 0, `Exclusive Shelf` == "to-read")
to_buy[sample(seq_len(nrow(to_buy)), 1), ]
## # A tibble: 1 x 31
## `Book Id` Title Author `Author l-f` `Additional Aut… ISBN ISBN13
##
## 1 2906039 Just Af… Stephen… King, Steph… 1416… 9.78e12
## # ... with 24 more variables: `My Rating` , `Average Rating` ,
## # Publisher , Binding , `Number of Pages` , `Year
## # Published` , `Original Publication Year` , `Date
## # Read` , `Date Added` , Bookshelves , `Bookshelves
## # with positions` , `Exclusive Shelf` , `My Review` ,
## # Spoiler , `Private Notes` , `Read Count` , `Recommended
## # For` , `Recommended By` , `Owned Copies` , `Original
## # Purchase Date` , `Original Purchase Location` ,
## # Condition , `Condition Description` , BCID
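The two filters above differ only in the `Owned Copies` condition, so they can be folded into one small helper. This is a minimal sketch, not the code I actually ran: `pick_book()` and its arguments are hypothetical names, `toy_books` is made-up data with the same column names as the export, and the helper treats any book with one or more copies as owned (an assumption, since the code above uses `== 1`). It also stops with a message when nothing matches, rather than returning an empty or NA row.

```r
# Draw up to `n` random rows from the to-read shelf, split by ownership.
# `pick_book` is a hypothetical helper, not part of any package.
pick_book <- function(books, owned = TRUE, n = 1) {
  on_shelf <- books$`Exclusive Shelf` == "to-read"
  has_copy <- books$`Owned Copies` >= 1  # assumption: 2+ copies still counts as owned
  keep <- which(on_shelf & (if (owned) has_copy else !has_copy))
  if (length(keep) == 0) {
    stop("No matching books on the to-read shelf.")
  }
  # sample.int() draws positions, so a single matching row is handled correctly.
  picked <- keep[sample.int(length(keep), min(n, length(keep)))]
  books[picked, , drop = FALSE]
}

# Toy stand-in for the Goodreads export; titles and counts are invented.
toy_books <- data.frame(
  Title = c("Book A", "Book B", "Book C", "Book D", "Book E"),
  `Owned Copies` = c(1, 0, 2, 1, 0),
  `Exclusive Shelf` = c("to-read", "to-read", "to-read", "read", "to-read"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

set.seed(2018)  # fixed seed only so the example is repeatable
next_read <- pick_book(toy_books, owned = TRUE)
next_buy <- pick_book(toy_books, owned = FALSE)
next_read$Title
next_buy$Title
```

With the real export, `pick_book(books)` would stand in for the reading-list draw and `pick_book(books, owned = FALSE)` for the bookstore draw.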

So next time I’m at a bookstore, which will be tomorrow (since I’ll be hanging out in Evanston for a class at my dance studio and plan to hit up the local Barnes & Noble), I should pick up a copy of Just After Sunset.

If you’re on Goodreads, feel free to add me!
