What’s the one thing that helps you add value to your company’s raw geospatial data? GEOCODING.
Geocoding is the process of converting raw physical addresses to latitude and longitude geospatial points that can be viewed on a map and used for geospatial calculations. Heck, geocoding has been known to increase my machine learning model performance by up to 10%!
Today I’m going to show you how to do Geocoding in R for FREE using tidygeocoder. Here’s what you’re learning today:
- Tutorial Part 1: How to use tidygeocoder to effortlessly geocode addresses (convert your company addresses to Lat/Long)
- Tutorial Part 2: And I’m going to show you how to do Reverse Geocoding (go from Lat/Long to Physical Addresses)
- Bonus: I’m going to show you how to Map lat/long data using Simple Features + Mapview!
This article is part of R-Tips Weekly, a weekly video tutorial that shows you step-by-step how to do common R coding tasks. Pretty cool, right?
Here are the links to get set up. 👇
This Tutorial is Available in Video
I have a companion video tutorial that gives you the bonus Mapview Shortcuts shown in this video (plus walks you through how to use it). And, I’m finding that a lot of my students prefer the dialogue that goes along with coding. So check out this video to see me running the code in this tutorial. 👇
Why Geocoding is a Must
Look, I’ve been working with customer data for a long time…
And one of the RICHEST sources of data is raw company addresses!
Think about it. If you know where a company is located, do you think that might be important to their purchasing behavior?
Well it was for me. In fact I found out that just simply adding the Latitude and Longitude information to my customer churn prediction models…
Gave my models a 10% increase in performance!
Lots of Value to Machine Learning in Raw Customer Addresses
The Latitude and Longitude were key!
And that’s just one of the benefits of working with geospatial data (and geocoding).
But you’re probably thinking geospatial data is really tough.
Listen, I get it. Geospatial data is a little weird.
But, you have good ole Matt Dancho to help you out.
And my promise is today, I’m going to get you on the right track.
So let’s fix that geospatial problem, and make one small step today. And it starts with geocoding.
Thank You to the Developer (and Community).
Before we do our deep-dive into tidygeocoder, I want to take a brief moment to thank the developers working on the tidygeocoder project, Jesse Cambon, Diego Hernangómez, Christopher Belanger and Daniel Possenriede. Without their hard work, this tutorial (and easy Geocoding) wouldn’t be possible. Thank you!
Free Gift: Cheat Sheet for my Top 100 R Packages (Special Geospatial Analysis Topics Included)
Before we dive in…
You’re going to need R packages to complete the geospatial analysis that helps your company. So why not speed up the process?
To help, I’m going to share my secret weapon…
Even I forget which R packages to use from time to time. And this cheat sheet saves me so much time. Instead of googling through 20,000 R packages to find a needle in a haystack, I keep my cheat sheet handy so I know which to use and when to use them. Seriously. This cheat sheet is my bible.
Once you download it, head over to page 3 and you’ll see several R packages I use frequently just for Data Analysis.
Which is important when you want to work in these fields:
- Machine Learning
- Time Series
- Financial Analysis
- Geospatial Analysis
- Text Analysis and NLP
- Shiny Web App Development
So steal my cheat sheet. It will save you a ton of time.
Tutorial: How to Geocode in R for Free with tidygeocoder
Time for geocoding with tidygeocoder. Let’s have some fun!
Step 1: Load the Libraries
Load the following libraries.
- tidygeocoder is the main library.
- But my bonus lat/long map hack uses Simple Features (sf) and mapview.
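A minimal sketch of the setup might look like this. The tidygeocoder, sf, and mapview packages come from this tutorial; loading tidyverse for data wrangling and CSV reading is an assumption, and any equivalent will do.

```r
# Assumption: tidyverse is used for reading and wrangling the data
library(tidyverse)

# Geocoding and reverse geocoding
library(tidygeocoder)

# Bonus map: Simple Features (sf) for spatial data, mapview for the interactive map
library(sf)
library(mapview)
```

If any of these are missing, install.packages() with the package name gets them from CRAN.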
Step 2: Get My Pittsburgh Pharmacies Dataset
Next, you can steal my Pittsburgh Pharmacies dataset. This dataset is a great way to test your skills with Geocoding.
We’ll use the Pittsburgh Pharmacies dataset (171 geocoded pharmacies) throughout the rest of this tutorial.
Get it here. It’s in the
Next, read the data set into R.
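Reading the file could be sketched like this. The file path and the object name pittsburgh_pharmacies_tbl are hypothetical, so swap in wherever you saved the download and whatever name you prefer.

```r
library(tidyverse)

# Hypothetical file location: replace with wherever you saved the download
pharmacies_path <- "data/pittsburgh_pharmacies.csv"

if (file.exists(pharmacies_path)) {
  pittsburgh_pharmacies_tbl <- read_csv(pharmacies_path)

  # Quick look at the columns; look for the one holding the street address
  glimpse(pittsburgh_pharmacies_tbl)
} else {
  message("File not found: ", pharmacies_path)
}
```

The glimpse() call is just a sanity check so you know which column to hand to the geocoder in the next step.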
Step 3: Geocode the Address Column to get Latitude and Longitude
Next, use the geocode() function to convert a company’s physical address to a Latitude / Longitude.
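Here’s what happens: geocode() looks up each address with a geocoding service and adds latitude and longitude columns to your data.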
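As a rough sketch of the geocoding step, here is geocode() on a tiny stand-in table. The two addresses are well-known Pittsburgh landmarks used purely for illustration (they are not rows from the pharmacies dataset), and the column name address is an assumption. The "osm" method uses the free Nominatim (OpenStreetMap) service, which needs no API key.

```r
library(tidyverse)
library(tidygeocoder)

# Stand-in data for illustration only, NOT rows from the pharmacies dataset.
# The column name `address` is an assumption; use your own address column.
addresses_tbl <- tibble(
  name    = c("U.S. Steel Tower", "Carnegie Museum of Natural History"),
  address = c("600 Grant St, Pittsburgh, PA 15219",
              "4400 Forbes Ave, Pittsburgh, PA 15213")
)

# geocode() queries the service for each address and appends two new columns,
# named here via the `lat` and `long` arguments.
geocoded_tbl <- addresses_tbl %>%
  geocode(address = address, method = "osm", lat = latitude, long = longitude)

geocoded_tbl
```

This needs an internet connection. Free services like Nominatim are rate limited, so a few hundred addresses can take several minutes, and addresses the service cannot match come back as NA.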
Step 4: Reverse Geocode to go from Lat/Long to Physical Address
Sometimes you have a latitude and longitude and want a physical address. For example, your salesperson needs to know which addresses to visit. (You wouldn’t send them a Lat/Long… or else they’d think you’re nuts!)
Did you know that you can reverse geocode?
You can! Here’s how to go from Latitude / Longitude to a Physical Address. (And save your inter-office reputation)
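A minimal sketch of reverse geocoding with tidygeocoder’s reverse_geocode() might look like this. The coordinates are an approximate stand-in for downtown Pittsburgh (not taken from the dataset), and the output column name address_found is just a label chosen for this example.

```r
library(tidyverse)
library(tidygeocoder)

# Stand-in coordinates (approximately downtown Pittsburgh), for illustration only
coords_tbl <- tibble(
  latitude  = 40.4406,
  longitude = -79.9959
)

# reverse_geocode() looks up each lat/long pair and appends the matched
# address in a new column, named here via the `address` argument.
reverse_tbl <- coords_tbl %>%
  reverse_geocode(
    lat     = latitude,
    long    = longitude,
    method  = "osm",
    address = address_found
  )

reverse_tbl
```

Like geocode(), this calls a web service, so it needs an internet connection and is subject to the same rate limits.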
And you can see that reverse geocoding creates an address from Lat/Long coordinates.
Bonus: Steal My Map Hack to Visualize Lat/Long Data
Want to visualize the geocoded data?
Steal my bonus script here. (It’s in the
Here’s what it does in 2 lines of code:
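The bonus script itself isn’t reproduced here, but the idea can be sketched roughly as follows: convert the lat/long columns to a Simple Features object, then hand it to mapview(). The points and column names below are approximate stand-ins for illustration, not the pharmacies data.

```r
library(tidyverse)
library(sf)
library(mapview)

# Stand-in points with approximate coordinates, for illustration only
points_tbl <- tibble(
  name      = c("Downtown Pittsburgh (approx.)", "Oakland (approx.)"),
  latitude  = c(40.4406, 40.4434),
  longitude = c(-79.9959, -79.9496)
)

# Line 1: turn the table into an sf object.
# coords takes x (longitude) first, then y (latitude); EPSG 4326 is WGS84 lat/long.
points_sf <- points_tbl %>%
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326)

# Line 2: draw the interactive map
mapview(points_sf)
```

The most common slip here is swapping the coordinate order. If your points land in the wrong hemisphere, check that longitude comes first in coords.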
Now you can visualize all 171 Pittsburgh Pharmacies in an interactive map!
You learned how to use the tidygeocoder library to geocode and reverse geocode. Great work! But, there’s a lot more to becoming a data scientist.
If you’d like to become a Business Data Scientist (and have an awesome career, improve your quality of life, enjoy your job, and all the fun that comes along), then I can help with that.
Do You Need Help Becoming A Business Data Scientist Right Now?
YOU know the feeling. Being unhappy with your current job.
Promotions aren’t happening. You’re stuck. Hopeless. Confused…
And you’re praying that the next data science interview will go better than the last 12…
… But you know it won’t. Not unless you take control of your career.
The good news is…
I Can Help You Speed It Up.
I’ve helped 5,897+ students learn data science for business from an elite business consultant’s perspective.
I’ve worked with Fortune 500 companies like S&P Global, Apple, MRM McCann, and more.
And I built a training program that gets my students life-changing data science careers (don’t believe me? see my testimonials here):
6-Figure Data Science Job at CVS Health ($125K)
Senior VP Of Analytics At JP Morgan ($200K)
50%+ Raises & Promotions ($150K)
Lead Data Scientist at Northwestern Mutual ($175K)
2X-ed Salary (From $60K to $120K)
2 Competing ML Job Offers ($150K)
Promotion to Lead Data Scientist ($175K)
Data Scientist Job at Verizon ($125K+)
Data Scientist Job at CitiBank ($100K + Bonus)
Whenever you are ready, here’s how I can help you:
Here’s the system that has gotten aspiring data scientists, career transitioners, and lifelong learners data science jobs and promotions…
P.S. – Samantha landed her NEW Data Science R Developer job at CVS Health (Fortune 500). This could be you.