Calculating hotel occupancy with R

[This article was first published on Data by John, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I saw a question on reddit today which can be answered using {patientcounter}

The asker wants to know how to work out how many folk are in a hotel, at any given time, based on check in and check out dates.

Here’s the setup:

check_in_date <- c('2010-01-01', '2010-01-02' ,'2010-01-01', 
                   '2010-01-08', '2010-01-08', '2010-01-15', 
                   '2010-01-15', '2010-01-16', '2010-01-19', '2010-01-22') 

check_out_date <- c('2010-01-07', '2010-01-04' ,'2010-01-09', 
                    '2010-01-21', '2010-01-11', '2010-01-22',
                    NA, '2010-01-20', '2010-01-25', '2010-01-29') 

#NB - the original question has a string instead of the NA value. 
# patientcounter can handle the NA, but not the string. 

Person = c("John", "Smith", "Alex", "Peter", "Will", "Matt", "Tim", "Kevin", "Tom", "Adam")

Make the dataframe:

checkin <- as.POSIXct(as.Date(check_in_date)) # make it a datetime
checkout <- as.POSIXct(as.Date(check_out_date))  # make it a datetime
hotel <- data.frame(checkin, checkout, Person) 
hotel

##       checkin   checkout Person
## 1  2010-01-01 2010-01-07   John
## 2  2010-01-02 2010-01-04  Smith
## 3  2010-01-01 2010-01-09   Alex
## 4  2010-01-08 2010-01-21  Peter
## 5  2010-01-08 2010-01-11   Will
## 6  2010-01-15 2010-01-22   Matt
## 7  2010-01-15       <NA>    Tim
## 8  2010-01-16 2010-01-20  Kevin
## 9  2010-01-19 2010-01-25    Tom
## 10 2010-01-22 2010-01-29   Adam

Now for patientcounter to do it’s thing…it’s designed for this exact task - getting a census, or count, of people / goods/ items in or at a place at a point in time.

The interval_census function does the work:

#remotes::install_github("johnmackintosh/patientcounter")
library(patientcounter)
# calculate the number of occupants per day
occupancy <- interval_census(hotel,
                identifier = "Person",
                admit = "checkin",
                discharge  = "checkout", 
                time_unit = '1 day',
                results = 'total')

# grab the date and count columns:
occupancy[,.(base_date, N)]

##      base_date N
##  1: 2010-01-01 2
##  2: 2010-01-02 3
##  3: 2010-01-03 3
##  4: 2010-01-04 2
##  5: 2010-01-05 2
##  6: 2010-01-06 2
##  7: 2010-01-07 1
##  8: 2010-01-08 3
##  9: 2010-01-09 2
## 10: 2010-01-10 2
## 11: 2010-01-11 1
## 12: 2010-01-12 1
## 13: 2010-01-13 1
## 14: 2010-01-14 1
## 15: 2010-01-15 3
## 16: 2010-01-16 4
## 17: 2010-01-17 4
## 18: 2010-01-18 4
## 19: 2010-01-19 5
## 20: 2010-01-20 4
## 21: 2010-01-21 3
## 22: 2010-01-22 3
## 23: 2010-01-23 3
## 24: 2010-01-24 3
## 25: 2010-01-25 2
## 26: 2010-01-26 2
## 27: 2010-01-27 2
## 28: 2010-01-28 2
##      base_date N

Job Done 🙂

There were several answers, and maybe a package is overkill, but this function is fast and scales well. It’s also pretty flexible and has applications outside of healthcare.

To leave a comment for the author, please follow the link and comment on their blog: Data by John.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)