(This article was first published on - R, and kindly contributed to R-bloggers)
It’s no secret that phone manufacturers log your geolocation at all times, but at least they’re open about it these days. I had no doubts that my Google Nexus 5 phone logged every move I made, and I was interested to see how I could map these geolocations.
If you’re using Android, you can download your location history from your Google account. Once in the location history menu, you can select and view at most 30 days’ worth of history at a time. This is slightly annoying if you want an overview of, say, the past six months.
To work around this limitation, you can simply download the data one month at a time. Each download is a ‘KML’ file. Chances are that you (like me) are not that familiar with KML files. However, a quick look shows that KML is an XML format, which means that we can use BeautifulSoup, a Python library for parsing HTML and XML, to extract the data. You can download the script I used for this process here.
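For readers who would rather stay in R, here is a minimal sketch of the same extraction step. It swaps BeautifulSoup for the xml2 package, so it is not the script mentioned above. It also assumes the file stores each fix as a `when` timestamp followed by a `gx:coord` element holding "longitude latitude altitude", which is how I understand Google's track layout; the sample KML and its values are made up.

```r
library(xml2)

# Made-up KML snippet standing in for a downloaded file.
kml_text <- '<kml xmlns="http://www.opengis.net/kml/2.2"
     xmlns:gx="http://www.google.com/kml/ext/2.2">
  <Document>
    <Placemark>
      <gx:Track>
        <when>2015-04-05T08:15:00Z</when>
        <gx:coord>4.9003 52.3791 0</gx:coord>
        <when>2015-04-05T08:16:00Z</when>
        <gx:coord>4.9010 52.3780 0</gx:coord>
      </gx:Track>
    </Placemark>
  </Document>
</kml>'

# For a real file, pass its path instead: read_xml("history.kml")
doc <- read_xml(kml_text)

# Matching on local-name() sidesteps the KML namespaces entirely.
when  <- xml_text(xml_find_all(doc, "//*[local-name()='when']"))
coord <- xml_text(xml_find_all(doc, "//*[local-name()='coord']"))

# Each coord string is "lon lat alt"; split it into a character matrix.
parts <- do.call(rbind, strsplit(coord, " ", fixed = TRUE))

locs <- data.frame(
  time = when,
  lon  = as.numeric(parts[, 1]),
  lat  = as.numeric(parts[, 2]),
  stringsAsFactors = FALSE
)
print(locs)
```

Note that the longitude comes first in each coordinate string, which is easy to get the wrong way round when plotting.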
Once you have extracted the data from the KML files, you are ready to load the data into R:
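As a rough sketch of the loading step, assuming the extracted data was saved as a CSV with the columns `time`, `lon` and `lat` (these names, and the values below, are my own placeholders rather than the output of the script above):

```r
# Inline stand-in for the extracted file, so the sketch runs on its own.
csv_text <- "time,lon,lat
2015-04-05T08:15:00Z,4.9003,52.3791
2015-04-05T08:16:00Z,4.9010,52.3780
2015-04-06T09:00:00Z,4.8952,52.3702"

# For a real file: read.csv("locations.csv", stringsAsFactors = FALSE)
locs <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Timestamps stay as character strings; coordinates are read as numbers.
str(locs)
```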
First, I would like to see roughly how many locations are logged daily on my phone. The lubridate package allows us to easily convert raw dates into more workable formats. We then use dplyr to group and summarize the data by date. Essentially, this simply leaves us with counts of logged locations for each day [1]. We can then plot these counts using ggplot2.
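The counting step could look roughly like this. It is a sketch on a handful of made-up timestamps, and the column name `time` and the choice of a line chart are assumptions on my part:

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Made-up timestamps standing in for the real location log.
locs <- data.frame(
  time = c("2015-04-05T08:15:00Z", "2015-04-05T08:16:00Z",
           "2015-04-05T21:40:00Z", "2015-04-06T09:00:00Z",
           "2015-04-07T07:30:00Z", "2015-04-07T07:31:00Z"),
  stringsAsFactors = FALSE
)

daily <- locs %>%
  mutate(date = as_date(ymd_hms(time))) %>%  # parse the timestamp, keep the day
  group_by(date) %>%
  summarise(n = n())                         # logged locations per day

# Average number of logged locations per day.
mean(daily$n)

# Daily counts over time.
p <- ggplot(daily, aes(x = date, y = n)) +
  geom_line() +
  labs(x = "Date", y = "Logged locations")
print(p)
```

A day with very few rows in `daily` is what shows up as a dip in the plot.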
In the image below, you see the number of logged locations (y-axis) for each given day (x-axis). On average, my phone logs 556 locations per day. The exception is April 6th, when I may have switched off my mobile data service.
We’re going to use ggplot2 and ggmap (a separate package built on top of ggplot2) to plot our data. First, we fetch a static Google map via the ggmap package. Then, we download a shapefile with administrative regions, and transform its coordinate system to WGS84 so that we can put the layer on top of the Google map. In order for ggplot2 to be able to draw the shapefile, we have to convert it into a plain data frame with a function called ‘fortify’. After that, we’re all ready to go!
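To make the layering concrete, here is a sketch under a few assumptions. The helper `plot_locations()` is hypothetical, the region and points are made up, and the base map is optional so the code runs without a Google API key. The commented lines show how the real inputs might be produced, assuming `regions` is an sp polygons object you have already loaded. Recent ggplot2 releases deprecate `fortify()` for sp classes, so that part may need adapting.

```r
library(ggplot2)

# Layer administrative regions (blue outlines) and logged locations (red
# points) on a base plot. `basemap` is expected to be the object returned by
# ggmap::get_map(); when it is NULL an empty ggplot is used instead, so the
# layering can be tried offline.
plot_locations <- function(locs, regions_df, basemap = NULL) {
  base <- if (is.null(basemap)) ggplot() else ggmap::ggmap(basemap)
  base +
    geom_polygon(data = regions_df,
                 aes(x = long, y = lat, group = group),
                 fill = NA, colour = "blue") +
    geom_point(data = locs, aes(x = lon, y = lat),
               colour = "red", alpha = 0.3, size = 1)
}

# Made-up stand-ins: one square "region" shaped like fortify() output
# (columns long, lat, group) and three logged points.
regions_df <- data.frame(
  long  = c(4.7, 5.1, 5.1, 4.7, 4.7),
  lat   = c(52.2, 52.2, 52.5, 52.5, 52.2),
  group = "region.1"
)
locs <- data.frame(lon = c(4.90, 4.95, 5.00),
                   lat = c(52.38, 52.35, 52.30))

p <- plot_locations(locs, regions_df)
print(p)

# With real data (needs network access and a Google Maps API key):
# ggmap::register_google(key = "YOUR_KEY")   # placeholder, not a real key
# basemap    <- ggmap::get_map(location = "Netherlands", zoom = 7)
# regions    <- sp::spTransform(regions, sp::CRS("+proj=longlat +datum=WGS84"))
# regions_df <- fortify(regions)
# plot_locations(locs, regions_df, basemap)
```

Passing the region and point data to each layer explicitly keeps them independent of whatever data the base map carries.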
The image below plots the latitude against the longitude values, and adds the Google map, the shapefile with administrative regions (blue), and my geolocations (red). I mostly travel by train, so you can actually make out the train tracks on this image.
What if we want to take a closer look at, say, locations in a specific municipality? Because ggmap works well with Google Maps, you can specify any location name the same way you would on Google Maps. In the example below, we’re zooming in and plotting locations in Amsterdam only. Unfortunately, we cannot easily download shapefiles for administrative regions at the municipality level through R. However, the package rgdal allows us to import shapefiles downloaded from the internet. Luckily, there are many places where we can download such shapefiles, and a quick Google search will turn up even more.
This is one of the many ways we can plot geospatial data in R. Although you could go even further and analyze the places you visit most often, it’s mainly just a fun exercise. If you would like more information about plotting geospatial data in R, I would recommend going through this short paper co-authored by Hadley Wickham, the developer of ggplot2 and dplyr.
Part 2 will focus on creating more appealing visuals.
[1] I’ll be using the chaining operator (‘%>%’) from the dplyr package. This way, we can get to the summarized results in one go. If you’re not familiar with dplyr, check out its vignette on CRAN. It is definitely worth it.