Airbnb uses R to scale data science

April 5, 2016

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

Airbnb, the property-rental marketplace that helps you find a place to stay when you're travelling, uses R to scale data science. Airbnb is a famously data-driven company, and has recently gone through a period of rapid growth. To accommodate the influx of data scientists (80% of whom are proficient in R, and 64% of whom use R as their primary data analysis language), Airbnb organizes monthly week-long data bootcamps for new hires and current team members.

But just as important as the training program is the engineering process Airbnb uses to scale data science with R. Rather than have data scientists write R functions independently (which not only is likely to duplicate work, but also inhibits transparency and slows productivity), Airbnb has invested in building an internal R package called Rbnb that implements collaborative solutions to common problems, standardizes visual presentations, and avoids reinventing the wheel. (Incidentally, the development and use of internal R packages is a common pattern I've seen at many companies with large data science teams.)

The Rbnb package used at Airbnb includes more than 60 functions and is still growing under the guidance of several active developers. It's actively used by Airbnb's engineering, data science, analytics and user experience teams, to do things like move aggregated or filtered data from a Hadoop or SQL environment into R, impute missing values, compute year-over-year trends, and perform common data aggregations. It has been used to create more than 500 research reports and to solve problems like automating the detection of host preferences and using guest ratings to predict rebooking rates.
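To make the idea concrete, here is a minimal sketch of the kind of shared helpers an internal package might collect, using base R only. Rbnb is not public, so the function names (`impute_median`, `yoy_change`), the column names, and the numbers below are all assumptions made up for illustration; this is not Rbnb's actual API.

```r
# Hypothetical helpers; names and data are illustrative, not Rbnb's real API.

# Replace missing values in a numeric vector with the median of the observed values.
impute_median <- function(x) {
  stopifnot(is.numeric(x))
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
}

# Add a year-over-year relative change column to a data frame
# that has a numeric `year` column and a numeric `value` column.
yoy_change <- function(df) {
  df <- df[order(df$year), ]
  previous <- c(NA, head(df$value, -1))
  # Only compare a row against the immediately preceding calendar year.
  consecutive <- c(FALSE, diff(df$year) == 1)
  df$yoy <- ifelse(consecutive, (df$value - previous) / previous, NA_real_)
  df
}

# Made-up example data with one missing value.
bookings <- data.frame(
  year  = c(2013, 2014, 2015, 2016),
  value = c(100, NA, 150, 180)
)
bookings$value <- impute_median(bookings$value)
result <- yoy_change(bookings)
print(result)
```

Putting small, tested functions like these in one package means every analyst imputes and computes trends the same way, which is the duplication-avoiding benefit described above.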


The package is also widely used to visualize data using a standard Airbnb "look". The package includes custom themes, scales, and geoms for ggplot2; CSS templates for htmlwidgets and Shiny; and custom R Markdown templates for different types of reports. You can see several examples in the blog post by Ricardo Bion linked below, including a gorgeous visualization of the 500,000 top Airbnb trips.


Medium (AirbnbEng): Using R packages and education to scale Data Science at Airbnb
