Download shapefiles from ESRI ArcGIS Online Story Maps

[This article was first published on Jonathan Chang, and kindly contributed to R-bloggers].

Liz recently needed some shapefiles from an ArcGIS Online map. Checking out the linked page, it’s immediately clear that there’s a lot of data, and no obvious download or share link anywhere on the app page. Ideally I’d avoid taking a screenshot and tracing it in ImageJ, which is an absolute last resort.

In this post, I’ll walk through how I managed to get those shapefiles downloaded, and hopefully provide some easy tips to do the same for other ArcGIS Online maps.

The power of the web inspector

This is fundamentally a web scraping task, and I’ll start with opening the web developer tools in Firefox, by right-clicking a promising bit on the page (the map itself) and selecting “Inspect”. Looking through the HTML tree in the web inspector panel that pops up, I can see that while the shapefiles do appear to exist locally, these are parsed into a gnarly embedded SVG object. This could be used to reconstruct the shapefile, but it seems like a big pain that I don’t want to deal with, so I move on from this avenue.

Screenshot of an HTML source code tree, showing a complex SVG object.

Next, I’ll check out the network tab. I’ll need to refresh the page, and I can see that there are a ton of requests that go to a lot of different places. But I suspect that any shapefile that’s loaded will likely be downloaded via XHR, initiated from JavaScript, and quite possibly hitting some API endpoint that probably speaks JSON. I filter by JS and XHR and immediately see a request that pops out at me: one to an endpoint at services.arcgis.com called data, with a query payload of f=json. Inspecting that response object leads me to another API endpoint that appears to be what I want!

Screenshot of the network panel of the web developer console, showing a JSON response object with interesting URL fields.
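
The same hunt can be done in code once a JSON response has been saved or fetched. Here is a minimal sketch that walks any parsed JSON value and collects strings that look like ArcGIS REST service URLs. The function name collectServiceUrls, the regular expression, and the sample object are all made up for illustration; the real data response has its own layout, which is not reproduced here.

```javascript
// Recursively collect every string in a parsed JSON value that looks like
// an ArcGIS REST FeatureServer or MapServer URL.
// The regular expression is a heuristic, not an official URL grammar.
const SERVICE_URL =
  /^https?:\/\/\S+\/rest\/services\/\S+\/(FeatureServer|MapServer)(\/\d+)?\/?$/i;

function collectServiceUrls(value, found = new Set()) {
  if (typeof value === "string") {
    if (SERVICE_URL.test(value)) found.add(value);
  } else if (Array.isArray(value)) {
    for (const item of value) collectServiceUrls(item, found);
  } else if (value !== null && typeof value === "object") {
    for (const item of Object.values(value)) collectServiceUrls(item, found);
  }
  return found;
}

// Hypothetical response shape, for illustration only.
const sampleResponse = {
  title: "Example map",
  operationalLayers: [
    {
      id: "layer_1",
      url: "https://services.arcgis.com/EXAMPLE/arcgis/rest/services/Some_Layer/FeatureServer/0",
    },
    { id: "basemap", url: "https://example.com/tiles/{z}/{x}/{y}.png" },
  ],
};

console.log([...collectServiceUrls(sampleResponse)]);
```

Using a Set means an endpoint that shows up in several places in the response is only reported once.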

ESRI API endpoints

I’m fairly familiar with ESRI’s REST APIs, and I know that I can navigate to the API endpoint in a browser and it’ll provide a fairly good description of its data. I can also interactively query it in the browser, without having to muck about with cURL in Terminal or anything like that. ESRI is quite humane in this respect, but again, there doesn’t seem to be an easy way to download the full shapefile directly from this endpoint, and I don’t feel quite up to the task of writing out a shapefile by copying and pasting a bunch of stuff.

Screenshot of the ESRI REST API query tool, showing the result of a query with complex shapefile geometries.
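
For reference, the interactive query form boils down to a GET request against the layer’s query operation. The sketch below assembles such a URL. The helper name buildQueryUrl and its defaults are my own; where, outFields, returnGeometry, and f are standard query parameters, but whether a given server accepts f=geojson is an assumption that needs checking against the service itself. The layer URL is the one used later in this post.

```javascript
// Build a query URL for one layer of an ArcGIS REST feature service.
// The defaults ask for every feature ("1=1") and every field ("*").
function buildQueryUrl(layerUrl, { where = "1=1", outFields = "*", format = "geojson" } = {}) {
  // Drop trailing slashes so the path does not end up with "//query".
  const url = new URL(layerUrl.replace(/\/+$/, "") + "/query");
  url.searchParams.set("where", where);
  url.searchParams.set("outFields", outFields);
  url.searchParams.set("returnGeometry", "true");
  url.searchParams.set("f", format); // assumed to be supported by the server
  return url.toString();
}

const layerUrl =
  "https://services.arcgis.com/8df8p0NlLFEShl0r/ArcGIS/rest/services/FHA_Grades/FeatureServer/0";

console.log(buildQueryUrl(layerUrl));
console.log(buildQueryUrl(layerUrl, { format: "json" }));
```

A single request like this only returns as many features as the server allows per query, which is one reason a dedicated tool is the more comfortable option.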

A quick Google sojourn leads me to pyesridump, a wonderful tool by the folks over at OpenAddresses. This is exactly what I needed! Install the esri2geojson command with pipx, then point it at the layer endpoint:

% pipx install esridump
  installed package esridump 1.11.0, installed using Python 3.10.8
  These apps are now globally available
    - esri2geojson
done! ✨ 🌟 ✨
% esri2geojson "https://services.arcgis.com/8df8p0NlLFEShl0r/ArcGIS/rest/services/FHA_Grades/FeatureServer/0" fha.geojson
2022-11-03 23:42:54,990 - cli.esridump - INFO - Built 1 requests using resultOffset method
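
The log line mentions a resultOffset method. As a rough illustration of what offset-based paging means (this is not esridump’s actual code), the sketch below computes the resultOffset values a client would request for a given feature count. The page size of 1000 is an assumed placeholder, since the real limit is whatever the server advertises; 74 is the feature count that R reports further down.

```javascript
// Compute the resultOffset values needed to page through `total` features,
// `pageSize` at a time. Each offset would be paired with
// resultRecordCount=pageSize in a query request.
function resultOffsets(total, pageSize) {
  if (!Number.isInteger(total) || total < 0) {
    throw new RangeError("total must be a non-negative integer");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be a positive integer");
  }
  const offsets = [];
  for (let offset = 0; offset < total; offset += pageSize) {
    offsets.push(offset);
  }
  return offsets;
}

// 74 features with an assumed page size of 1000 fit in a single request.
console.log(resultOffsets(74, 1000)); // [ 0 ]
console.log(resultOffsets(2500, 1000)); // [ 0, 1000, 2000 ]
```

With a dataset this small, one request covers everything, which is consistent with the single request reported in the log.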

Now to fire up R and see that everything looks right by plotting it.

> library(sf)
Linking to GEOS 3.10.2, GDAL 3.4.2, PROJ 8.2.1; sf_use_s2() is TRUE
> library(ggplot2)

> xx <- read_sf("fha.geojson")

> xx
Simple feature collection with 74 features and 4 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -77.188 ymin: 38.79005 xmax: -76.8772 ymax: 39.0666
Geodetic CRS:  WGS 84
# A tibble: 74 × 5
     FID Grade Shape__Area Shape__Length                         geometry
   <int> <chr>       <dbl>         <dbl>                    <POLYGON [°]>
 1     1 E5       2268576.         5849. ((-76.90432 38.85715, -76.9018 …
 2     2 G7      13563378.        19322. ((-76.93371 38.87391, -76.90942…
 3     3 H2       7772476.        12002. ((-76.88671 38.90218, -76.89007…
 4     4 H1      12964128.        20269. ((-76.90942 38.89269, -76.93095…
 5     5 G1       6516531.        19844. ((-76.93428 38.88311, -76.93574…
 6     6 C4       7199183.        16914. ((-76.93371 38.87391, -76.96229…
 7     7 H2       7328078.        14489. ((-76.96229 38.85169, -76.97798…
 8     8 E2       9790479.        21957. ((-76.98859 38.8399, -76.9885 3…
 9     9 F2       5253684.        15352. ((-76.99618 38.85609, -77.00305…
10    10 H1       1810343.         8218. ((-76.97203 38.89815, -76.9833 …
# … with 64 more rows
ℹ Use print(n = ...) to see more rows

> ggplot(xx) + geom_sf(aes(fill = Grade)) + theme_minimal()

Plot of the FHA shapefile dataset of Washington, DC.

It looks fantastic, and is ready for further data analysis now!

A more direct route

After all of this, I was curious whether there was a better way, so I did some digging. This ArcGIS Online tool is called ESRI Story Map Series, and the source code is actually available on GitHub. Looking through the repository, we can see it’s a JavaScript app with a fairly rich library API, intended for ESRI’s customers to develop “story maps” with deep integrations to justify their hefty enterprise contracts. In the README, one of the code suggestions points in an interesting direction, and I reopened the web inspector console to check it out.

Based on the README example, I learned that the top-level object is called app, and that layers can be obtained through a method on the app.map object. I grubbed around in the app’s internal data structures using the JavaScript console, and discovered an interesting _layers key inside this object, which seems to have the relevant data that I’m interested in.

Screenshot of the console panel of the web developer console, showing a JavaScript data object corresponding to the ESRI map being shown in the map app.

The full invocation in the JavaScript console to get the ESRI REST API endpoint is therefore:

app.map._layers.FHA_Grades_4159.url
// "https://services.arcgis.com/8df8p0NlLFEShl0r/arcgis/rest/services/FHA_Grades/FeatureServer/0" 
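
When the layer key is not known ahead of time, the same lookup can be generalized. This is a small sketch that lists every layer carrying a string url property. The helper name layerUrls and the mock object are hypothetical, and since _layers is an internal property, its shape could differ between versions of the app.

```javascript
// List the REST endpoint of every layer that exposes a string `url`.
// `layers` is expected to look like `app.map._layers`: an object keyed by
// layer id. Entries without a usable `url` are skipped.
function layerUrls(layers) {
  return Object.entries(layers)
    .filter(([, layer]) => layer && typeof layer.url === "string")
    .map(([id, layer]) => ({ id, url: layer.url }));
}

// Stand-in object for trying this outside the browser; in the console,
// call layerUrls(app.map._layers) instead.
const mockLayers = {
  FHA_Grades_4159: {
    url: "https://services.arcgis.com/8df8p0NlLFEShl0r/arcgis/rest/services/FHA_Grades/FeatureServer/0",
  },
  map_graphics: {}, // hypothetical layer with no URL
};

console.log(layerUrls(mockLayers));
```

Any URL found this way can be handed straight to esri2geojson, as in the earlier section.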