[This article was first published on Jun Ma - Data Blog, and kindly contributed to R-bloggers].
Part 1 demonstrated how to grab the location of each car (plus other status fields such as fuel level and cleanliness) at a given time from the Car2go API. To show how the cars moved over an extended period, however, we need a timed loop that grabs the data repeatedly, and then build the analysis on top of that. The plot on the right shows the car routes over 24 hours in Austin, TX.
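A timed loop of this kind could be sketched roughly as follows. This is a minimal sketch, not the original code: `fetch_snapshot` is a hypothetical stand-in for whatever function pulls the vehicle list from the API (as in Part 1), and the column names `car_id`, `lon` and `lat` are assumptions.

```r
# Poll a data source at a fixed interval and stack the snapshots.
# `fetch_snapshot` is a hypothetical function supplied by the caller; it is
# assumed to return a data frame with one row per car currently visible.
collect_snapshots <- function(fetch_snapshot, n_polls, interval_sec = 60) {
  snapshots <- vector("list", n_polls)
  for (i in seq_len(n_polls)) {
    snap <- fetch_snapshot()
    # Stamp every row with the time at which this snapshot was taken.
    snap$time <- rep(Sys.time(), nrow(snap))
    snapshots[[i]] <- snap
    if (i < n_polls) Sys.sleep(interval_sec)
  }
  do.call(rbind, snapshots)
}

# Stand-in fetcher, used only to show the shape of the result.
fake_fetch <- function() {
  data.frame(car_id = c("A", "B"),
             lon = c(-97.74, -97.75),
             lat = c(30.27, 30.28),
             stringsAsFactors = FALSE)
}

log_df <- collect_snapshots(fake_fetch, n_polls = 3, interval_sec = 0)
```

The result is one long table with a row per car per poll, which is the shape the later steps work on.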
Identify moved cars
First, we need to identify whether a car has moved. To do this, I removed rows with duplicated locations and kept only those where the location changed. Next, I took the rows where each car was first and last seen at each location. These give the time stamps of a trip: the last time a car is seen at one location is the start of a trip, and the first time it is seen at another location is the end of that trip.
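The idea can be sketched in base R as below. This is an illustration under assumptions rather than the original code: the log is assumed to have columns `car_id`, `lon`, `lat` and `time`, a car is treated as parked while its coordinates stay exactly the same, and `find_trips` is a made-up name.

```r
# Turn a polling log into a table of trips.
find_trips <- function(log_df) {
  log_df <- log_df[order(log_df$car_id, log_df$time), ]
  per_car <- lapply(split(log_df, log_df$car_id), function(d) {
    loc <- paste(d$lon, d$lat)
    # A new "stay" starts whenever the location differs from the previous row.
    stay_id <- cumsum(c(TRUE, loc[-1] != loc[-length(loc)]))
    first_idx <- which(!duplicated(stay_id))                  # first seen at each stay
    last_idx  <- which(!duplicated(stay_id, fromLast = TRUE)) # last seen at each stay
    n <- length(first_idx)
    if (n < 2) return(NULL)  # the car never moved
    data.frame(
      car_id     = d$car_id[1],
      start_time = d$time[last_idx[-n]],   # last seen at the old location
      end_time   = d$time[first_idx[-1]],  # first seen at the new location
      start_lon  = d$lon[last_idx[-n]],
      start_lat  = d$lat[last_idx[-n]],
      end_lon    = d$lon[first_idx[-1]],
      end_lat    = d$lat[first_idx[-1]],
      stringsAsFactors = FALSE
    )
  })
  trips <- do.call(rbind, Filter(Negate(is.null), per_car))
  if (!is.null(trips)) rownames(trips) <- NULL
  trips
}

# Toy log: car A moves once, car B stays parked.
times <- as.POSIXct("2015-12-07 10:00:00", tz = "UTC") + 60 * c(0, 5, 30, 35)
log_df <- data.frame(
  car_id = c("A", "A", "A", "A", "B", "B"),
  lon    = c(-97.74, -97.74, -97.70, -97.70, -97.75, -97.75),
  lat    = c(30.27, 30.27, 30.30, 30.30, 30.28, 30.28),
  time   = c(times, times[1:2]),
  stringsAsFactors = FALSE
)
trips <- find_trips(log_df)
```

Grouping consecutive rows into stays, instead of dropping every repeated location, is a small variation that keeps a car returning to an earlier spot from being merged with its first visit.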
Find and accurately plot the route
Then I use a for loop over the cars that have moved and, for each car, loop over its trips (most likely, a car will have multiple trips). For each trip, I get the route from the Google Maps API (library(ggmap) does this), which also gives the distance between turns. The route is an approximation of the car's actual movement: Car2go doesn’t supply real-time GPS data while a car is moving; it only records when a customer checks out a trip. However, I think the approximated route is close enough to real life, assuming that most Car2go customers use the service for transportation rather than for leisure and recreation.
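The nested loop might look something like the sketch below. To keep it runnable without an API key, the routing call is passed in as `route_fn`; in practice this could be a thin wrapper around `ggmap::route()`, which needs a registered Google API key. The trip table columns and the `straight_line` stand-in are assumptions made for illustration.

```r
# Loop over cars, then over each car's trips, and collect one route per trip.
# `route_fn(from, to)` is supplied by the caller and is assumed to return a
# data frame of points along the route.
get_trip_routes <- function(trips, route_fn) {
  routes <- list()
  for (car in unique(trips$car_id)) {
    car_trips <- trips[trips$car_id == car, ]
    for (j in seq_len(nrow(car_trips))) {
      # Build "lat,lon" strings for the origin and destination.
      from <- paste(car_trips$start_lat[j], car_trips$start_lon[j], sep = ",")
      to   <- paste(car_trips$end_lat[j], car_trips$end_lon[j], sep = ",")
      r <- route_fn(from, to)
      r$car_id  <- car
      r$trip_no <- j
      routes[[length(routes) + 1]] <- r
    }
  }
  do.call(rbind, routes)
}

# Stand-in router that just joins origin and destination with a straight line.
straight_line <- function(from, to) {
  pts <- do.call(rbind, lapply(strsplit(c(from, to), ","), as.numeric))
  data.frame(lat = pts[, 1], lon = pts[, 2])
}

trips <- data.frame(
  car_id    = c("A", "A", "B"),
  start_lon = c(-97.74, -97.70, -97.75),
  start_lat = c(30.27, 30.30, 30.28),
  end_lon   = c(-97.70, -97.72, -97.73),
  end_lat   = c(30.30, 30.25, 30.26),
  stringsAsFactors = FALSE
)
routes <- get_trip_routes(trips, straight_line)
```

Keeping `car_id` and `trip_no` on every point makes it easy to draw each trip as its own path later.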
Left: route plotted from turn-by-turn instructions. Right: the same route plotted from the detailed polyline from the Google Maps API.
Then I was able to plot the route for each car using the route data returned by Google Maps. The result is shown in the left plot. While it roughly represents the route on the map, it fails for curvy roads with few turns. The reason is that route(output = "simple") only returns one point per turn, and geom_path connects consecutive points with a straight line.
To solve this problem, I found this article, which decodes the polyline returned by the Google Maps API via route(output = "all") into (lon, lat) coordinates. Now the path follows the actual route on the road, as shown in the right plot.
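For reference, the decoding step can be sketched in a few lines of base R. This is my own minimal version of Google's encoded polyline format, not the code from the linked article, and `decode_polyline` is a made-up name. In the Directions API response the encoded string normally sits under `overview_polyline$points` of a route; where exactly it appears in the list returned by ggmap is an assumption to check against your own output.

```r
# Decode a Google encoded polyline string into a data frame of (lon, lat).
decode_polyline <- function(encoded) {
  if (!nzchar(encoded)) return(data.frame(lon = numeric(0), lat = numeric(0)))
  codes <- utf8ToInt(encoded) - 63L
  # Each number is spread over 5-bit chunks; a chunk below 32 ends the number.
  is_last  <- codes < 32L
  chunk_id <- cumsum(c(TRUE, is_last[-length(is_last)]))
  bits     <- bitwAnd(codes, 31L)
  # Position of each chunk within its number (0 = least significant).
  pos      <- ave(seq_along(codes), chunk_id, FUN = seq_along) - 1
  values   <- as.numeric(tapply(bits * 2^(5 * pos), chunk_id, sum))
  # Undo the zig-zag encoding: odd values are negative numbers.
  deltas   <- ifelse(values %% 2 == 1, -(values + 1) / 2, values / 2)
  # Values alternate lat, lon and are stored as differences from the last point.
  lat <- cumsum(deltas[c(TRUE, FALSE)]) / 1e5
  lon <- cumsum(deltas[c(FALSE, TRUE)]) / 1e5
  data.frame(lon = lon, lat = lat)
}

# Sample string from Google's polyline documentation (three points).
pts <- decode_polyline("_p~iF~ps|U_ulLnnqC_mqNvxq`@")
```

The resulting data frame can be passed straight to geom_path, so the line follows every bend in the road rather than jumping from turn to turn.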
All trips during a 24hr span
Next, I plot all the trips that took place between Dec 07 13:40 and Dec 08 13:36 (Monday to Tuesday). The covered area is, as expected, similar to the Car2go Austin service area.
Car2go trips in Austin, TX in 24hrs
Surprisingly, no trips took place on the UT campus during this period (except a very few on north campus). There could be several reasons: 1: limited parking space, 2: students are studying at home for final exams rather than attending classes, so there are significantly fewer people on campus, 3: Car2go is less popular than public transport among students. The actual reason is unknown from this data set. More data (taken during the normal semester, on weekends when more parking is available, etc.) is needed.
Update: I just found out that the UT campus is a stop-over area only, so this is not surprising at all.
Next let’s take a look at trip statistics.
Most trips are shorter than 5 miles and 50 minutes. Note that there is a significant number of trips under 1 mile. While some of them are actual customer trips, the rest could be noise in the data or cars being moved by Car2go.
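Numbers like these can be read off a trip table with a few lines of R. The sketch below uses made-up values and assumes columns `start_time`, `end_time` and `distance_miles` (for example, the summed route distance per trip); none of the figures are from the real data.

```r
# Toy trip table; the values are invented for illustration only.
t0 <- as.POSIXct("2015-12-07 08:00:00", tz = "UTC")
trips <- data.frame(
  start_time     = t0 + 3600 * c(0, 1, 10),
  end_time       = t0 + 3600 * c(0, 1, 10) + 60 * c(12, 45, 90),
  distance_miles = c(0.6, 4.2, 7.5)
)

# Trip duration in minutes.
trips$duration_min <- as.numeric(
  difftime(trips$end_time, trips$start_time, units = "mins")
)

# Share of trips below each threshold mentioned in the text.
share_under_5_miles <- mean(trips$distance_miles < 5)
share_under_50_min  <- mean(trips$duration_min < 50)
share_under_1_mile  <- mean(trips$distance_miles < 1)
```

Very short trips can then be pulled out with `trips[trips$distance_miles < 1, ]` and inspected by hand to judge whether they look like real rides or noise.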
Now, we can take a look at the starting time of a trip during a day.
As shown in the above plot, most trips are for commuting (around 8am and 6pm), and very few trips took place around midnight.
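Counting trips by starting hour takes only a couple of lines. The sketch below uses invented start times and assumes they are already expressed in local time; with real data the time zone of the POSIXct column would need to be set accordingly.

```r
# Invented trip start times, for illustration only.
start_times <- as.POSIXct(
  c("2015-12-07 08:05:00", "2015-12-07 08:40:00",
    "2015-12-07 18:10:00", "2015-12-08 00:30:00"),
  tz = "UTC"
)

# Extract the hour of day (0-23) and count trips per hour.
# Using a factor with all 24 levels keeps hours with zero trips in the table.
start_hour     <- as.integer(format(start_times, "%H"))
trips_per_hour <- table(factor(start_hour, levels = 0:23))
```

Calling `barplot(trips_per_hour)` on the result gives a quick view of the morning and evening peaks.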