Where are the cars?
After I published an article on how to scrape advanced shooting data from stat.nba.com, a friend of mine contacted me to ask whether I could scrape some data from zipcar's or car2go's API. I looked into it and found it quite straightforward to do. Here is an example of scraping data from car2go.
Car2go is a popular car sharing program in North America and Europe. Here is a short introduction from Wiki if you haven't heard of it: The company offers exclusively Smart Fortwo vehicles and features one-way point-to-point rentals. Users are charged by the minute, with hourly and daily rates available. As of May 2015, car2go is the largest carsharing company in the world with over 1,000,000 members.
A typical URL looks like this. Anything after json is not needed. After reading it into R (using jsonlite), you get a list containing one data frame. The coordinates column needs special attention, however, because it is itself a list. It is therefore processed separately from the other columns, and the two pieces are then combined with cbind.
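To illustrate the flattening step, here is a minimal sketch that parses an inline JSON string instead of the live API. The field names (placemarks, name, fuel, interior) and the longitude, latitude, altitude order of coordinates are assumptions made for this example, not something the text above confirms.

```r
library(jsonlite)

# Made-up sample with the shape described above: one top-level element
# holding the vehicle records, each with a nested coordinates array.
sample_json <- '{
  "placemarks": [
    {"name": "CAR001", "fuel": 80, "interior": "GOOD",
     "coordinates": [-97.74, 30.27, 0]},
    {"name": "CAR002", "fuel": 45, "interior": "UNACCEPTABLE",
     "coordinates": [-97.75, 30.28, 0]}
  ]
}'

raw  <- fromJSON(sample_json)  # a list of 1 data frame
cars <- raw[[1]]

# The coordinates column holds one vector per car; stack them into rows.
coords <- cars$coordinates
if (is.list(coords)) coords <- do.call(rbind, coords)
coords <- as.data.frame(coords)
names(coords) <- c("lon", "lat", "alt")  # assumed order

# Drop the nested column and bind the flat coordinates back on.
other_cols <- setdiff(names(cars), "coordinates")
cars_flat  <- cbind(cars[, other_cols, drop = FALSE], coords)
print(cars_flat)
```

The result is an ordinary data frame with one row per car, which is easier to combine across cities or write to a csv.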
Also note that cities with EVs have a 'charging' attribute while others don't. To get a consistent data structure, I first identify those cities based on the length() of the data, then insert a charging column into the data sets that lack one and assign it a value of NA.
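A minimal sketch of that padding step follows. It checks for the column by name rather than by length(), which is a swapped-in variation on the approach described above, and the two data frames are made-up stand-ins for parsed city data.

```r
# Made-up stand-ins: one city with EVs (has charging), one without.
city_ev  <- data.frame(name = c("A1", "A2"), fuel = c(90, 60),
                       charging = c(TRUE, FALSE))
city_gas <- data.frame(name = "B1", fuel = 70)

# Hypothetical helper: add a charging column of NA when it is missing.
add_charging <- function(df) {
  if (!"charging" %in% names(df)) df$charging <- NA
  df
}

# Both data frames now share the same columns and can be stacked.
combined <- rbind(add_charging(city_ev), add_charging(city_gas))
print(combined)
```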
A for loop can be used to scrape data from different cities, with an if statement choosing which data processing code to apply to each one. Finally, all the data can be combined with rbind and written out as a csv file.
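The loop might look roughly like the sketch below. To keep it runnable offline, fetch_city() is a hypothetical stand-in that returns made-up data instead of calling the API, and the city names and values are placeholders.

```r
# Hypothetical stand-in for the download-and-flatten step of one city.
fetch_city <- function(city) {
  mock <- list(
    cityA = data.frame(name = "A1", fuel = 80),
    cityB = data.frame(name = c("B1", "B2"), fuel = c(55, 95),
                       charging = c(TRUE, FALSE))
  )
  mock[[city]]
}

cities  <- c("cityA", "cityB")
results <- vector("list", length(cities))

for (i in seq_along(cities)) {
  df <- fetch_city(cities[i])
  # Cities without EVs lack the charging column, so pad it with NA.
  if (!"charging" %in% names(df)) df$charging <- NA
  df$city <- cities[i]
  results[[i]] <- df
}

# Stack all cities and write the result to a csv file.
all_cars <- do.call(rbind, results)
out_file <- tempfile(fileext = ".csv")
write.csv(all_cars, out_file, row.names = FALSE)
print(all_cars)
```

Collecting the per-city data frames in a list and binding them once at the end avoids growing a data frame inside the loop.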
It is also quite easy to display all the cars on a map at a given time. Below is one example, made with the ggmap library. The color of each vehicle shows how much fuel it has left. The result is the plot at the beginning of this post.
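As a rough offline approximation of such a plot, the sketch below uses plain ggplot2 without the map tiles that ggmap would add underneath. The coordinates and fuel values are made up, and the column names lon, lat and fuel are assumptions.

```r
library(ggplot2)

# Made-up car positions and fuel levels for illustration only.
cars <- data.frame(
  lon  = c(-97.74, -97.75, -97.73, -97.76),
  lat  = c(30.27, 30.28, 30.26, 30.29),
  fuel = c(80, 45, 20, 100)
)

# One point per car, colored by remaining fuel.
p <- ggplot(cars, aes(x = lon, y = lat, color = fuel)) +
  geom_point(size = 3) +
  scale_color_gradient(low = "red", high = "darkgreen") +
  coord_quickmap() +
  labs(x = "Longitude", y = "Latitude", color = "Fuel (%)")

print(p)
```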
How about the average cleanliness of cars in each city? Or the fuel level?
Based on a limited sample, German cities seem to have the cleanest cars and Italian cities the least clean, with US cities in the middle.
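A per-city summary of this kind can be sketched with aggregate(). The data below is made up and does not reproduce the result above. Scoring cleanliness as the share of cars with an interior rating of "GOOD" is an assumption for this example, as are the column names.

```r
# Made-up sample; city labels and ratings are placeholders.
cars <- data.frame(
  city     = c("cityA", "cityA", "cityB", "cityB", "cityC"),
  interior = c("GOOD", "GOOD", "UNACCEPTABLE", "GOOD", "GOOD"),
  fuel     = c(80, 60, 40, 70, 90)
)

# Assumed scoring: 1 if the interior is rated GOOD, 0 otherwise.
cars$clean <- as.numeric(cars$interior == "GOOD")

# Mean cleanliness score and mean fuel level per city.
summary_by_city <- aggregate(cbind(clean, fuel) ~ city, data = cars, FUN = mean)
print(summary_by_city)
```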
The entire code is published here if you are interested. Enjoy!
To leave a comment for the author, please follow the link and comment on their blog: Jun Ma - Data Blog