Convert IP addresses to geolocation, latitude and longitude etc etc

[This article was first published on Robert Grant's stats blog » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Whoa! Now this is cool. It turns out there is a database at freegeoip.net which you can query for the location of a particular IP address. And as it has a neat little API for batches of IPs, you can get R to fetch them en masse. That’s exactly what Heuristic Andrew has just done with a function that uses the rjson package to pull down data from the JSON version of the database. If you wanted to run a lot of these, you could compile your own local version of the database provided you have a Python interpreter, as described on the GitHub page.

Image

You know, I’m not sure that really is Kingston University’s servers. But it is within a few miles. Hang on, “Milk Marketing Board Sports Ground”? Is it still 1973?

I couldn’t resist playing around with it. As you see above, the Google Map on the freegeoip.net home page looks up your own IP. In my case, I was lucky enough to be able to spot a problem. This location is a little residential street, and certainly doesn’t look like the location of some big university server. In fact, the database returns the “city” as Kingston (sounds good so far) and the “region_name” as St Helens (errr, not really, that’s a few hundred miles north west from here). But Google Maps has found an apartment block called St Helens, near Kingston. Oh dear.

So, lesson 1: don’t trust the Google Map search on the home page.

However, there are also latitudes and longitudes, so I went to my website at www.animatedgraphs.co.uk (WordPress jealously guard the IP addresses of visitors; as ever, data=$$$) and got all the recent IPs of visitors. I shoved these into Heuristic Andrew’s function and found four that returned an Error 404 from the database. That’s not Andrew’s fault, it seems the database just doesn’t know them. As he says, the function is very new and needs some better error handling. I’m sure that will come in time. For now, I just ditched those four and carried on. The country names supplied to me by my ISP matched the freegeoip ones perfectly, so I took the latitudes and longitudes and put them in a map:

How did I get the data into a JavaScript map so quickly? That's another story - coming soon.

How did I get the data into a JavaScript map so quickly? That’s another story – coming soon. Sorry this is a fixed image but that’s the price of easy blogging.

Now the Kingston University server is apparently located near Stoke-on-Trent, in fact near the village of Kingstone (hmmm…), at a location which is not exactly an internet hub:

Much nicer than a server room

Much nicer than a server room

So, lesson number 2: don’t trust the latitude and longitude too much, although most of them seem fine.

To take a less pathological example: a lot of traffic has come my way from Benton, Kentucky. (Thank you, whoever you are. I hope you found the website useful.) The (latitude,longitude) is given by freegeoip as (36.8596,-88.3367) which is in a field a mile out of town. The 4 decimal place precision is clearly rather spurious.

To leave a comment for the author, please follow the link and comment on their blog: Robert Grant's stats blog » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)