
Choroplethr v4.0.0 is now on CRAN

[This article was first published on R – Ari Lamstein, and kindly contributed to R-bloggers.]

choroplethr version 4.0.0 is now on CRAN. You can install it like this:


install.packages("choroplethr")

packageVersion("choroplethr") # [1] ‘4.0.0’

With this version, I have transferred the maintenance of choroplethr to Zhaochen He, an economics professor at Christopher Newport University. Zhao addressed the issues that led to choroplethr being archived from CRAN in February. Please join me in thanking Zhao for his contribution!

Changes in v4.0.0

The primary changes in this version are:

Future Work

Future development of choroplethr will be decided by Zhao. That’s the best part of being a maintainer, and I don’t want to take it away from him. But I also want to note three issues I see facing the project: outdated maps, simple features and backwards compatibility.

Outdated Maps

When Choroplethr was released in 2014, one of its contributions was packaging contemporary maps and making them accessible to the entire R ecosystem. (For reference, at that time, the world map that shipped with the maps package still included the USSR!)

Unfortunately, these maps are now outdated, which is causing problems in some cases. For example, in 2022 the Census Bureau changed the names and boundaries of Connecticut’s county-equivalents (link). Because choroplethr’s county map is from 2010, attempting to map Connecticut data from 2022 or later generates an error. Here is a code snippet that demonstrates the issue:

library(choroplethr)

# Works: 2021 data matches the 2010 county map
df = get_county_demographics(2021)
df$value = df$population
county_choropleth(df, state_zoom='connecticut')

# Error: 2022 data uses Connecticut's new county-equivalents
df = get_county_demographics(2022)
df$value = df$population
county_choropleth(df, state_zoom='connecticut')
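
The mismatch behind this error can be illustrated without choroplethr at all. The sketch below hard-codes the FIPS codes of Connecticut's eight legacy counties (what a 2010-vintage map contains) and of the nine planning regions that Census data uses as county-equivalents from 2022 onward, then compares the two sets with base R. The variable names are made up for illustration, and this is not necessarily how choroplethr itself detects the problem.

```r
# FIPS codes of Connecticut's eight legacy counties (present on a 2010 map)
ct_map_2010 = c(9001, 9003, 9005, 9007, 9009, 9011, 9013, 9015)

# FIPS codes of the nine planning regions used in data from 2022 onward
ct_data_2022 = c(9110, 9120, 9130, 9140, 9150, 9160, 9170, 9180, 9190)

# Regions in the data that have no polygon on the map
unmatched_data = setdiff(ct_data_2022, ct_map_2010)

# Polygons on the map that have no value in the data
unmatched_map = setdiff(ct_map_2010, ct_data_2022)

length(unmatched_data) # 9
length(unmatched_map)  # 8
```

None of the codes overlap, so a join between 2022 data and the 2010 map has nothing to match on for Connecticut.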

In addition to the county map, I believe that the following maps should also be updated:

Simple Features

Choroplethr was published when the best way to make a map with ggplot2 was by “fortifying” a shapefile. Towards the end of my active development on choroplethr, the “Simple Features” (sf) package was created. My impression is that there might be advantages to migrating choroplethr to use simple features. Unfortunately, I never got around to researching this.

I think it would be useful if someone researched Simple Features and formed an informed opinion on whether (and how) to incorporate it into choroplethr.
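
As a possible starting point for that research, here is a minimal sketch of a choropleth made with sf and ggplot2 directly. It assumes both packages are installed and uses the North Carolina demo shapefile that ships with sf. It is not based on choroplethr's code, and the BIR74 column (a births count in the demo data) is chosen only for illustration.

```r
library(sf)
library(ggplot2)

# Read the demo shapefile bundled with sf. The result is a data frame with
# one row per county plus a geometry column, so no fortify() step is needed.
nc = st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# geom_sf() draws the geometry column; fill maps a data column to color
p = ggplot(nc) +
    geom_sf(aes(fill = BIR74)) +
    labs(title = 'BIR74 by County, North Carolina', fill = 'BIR74')
print(p)
```

The point of the sketch is that the map data and the values live in one data frame, which is the main workflow difference from the fortify approach.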

Backwards Compatibility

One of the highlights of developing choroplethr was when the US Census Bureau commissioned a video course on it. That course is still on their website (link), and it would be awkward if the instructions in it somehow stopped working.

That said, the course was published in 2016 and should not prevent genuine innovation. I am not sure of the best way to balance this, and it’s possible that I became too conservative about breaking backwards compatibility after the course was released.

I hope that Zhao can find a balance between innovation and respecting backwards compatibility.

Example

Back in choroplethr’s heyday I would include an example in each post. This might be my last post about choroplethr, and I thought it would be nice to include an example here as well.

Since the output of the functions that get state, county and tract demographics has changed, let’s use them. We can use those functions to explore how the median household income in the US changed between 2009 (the first 5-year ACS) and 2023 (the most recent 5-year ACS).

Change in State Median Household Income

Here’s how to map the percent change in median income in each state between 2009 and 2023:

library(choroplethr)
stopifnot(packageVersion("choroplethr") >= '4.0.0')

#Get data from 2009 and 2023
df_2009 = get_state_demographics(2009, 5)
df_2009$value = df_2009$median_hh_income
df_2023 = get_state_demographics(2023, 5)
df_2023$value = df_2023$median_hh_income

#Calculate and map percent change
df_final = calculate_percent_change(df_2009, df_2023)
state_choropleth(df_final,
    title = 'Change in Median Household Income: 2009 to 2023',
    legend = 'Percent Change',
    num_colors = 4)

This map really surprised me. Virtually all the states in the top quartile are in the western half of the country. And several of the western states that are not in the top quartile (Nevada, Wyoming and New Mexico) are in the lowest quartile.
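
For readers curious about the arithmetic, the percent change and the four-color binning can be sketched in base R. This is an illustration with made-up numbers, not choroplethr's implementation: the helper percent_change_sketch and the toy data frames are hypothetical, and the sketch assumes the usual definition of percent change, (new - old) / old * 100, and equal-sized bins.

```r
# Toy data in the region/value format used above (values are made up)
old = data.frame(region = c('a', 'b', 'c', 'd'), value = c(100, 200, 50, 80))
new = data.frame(region = c('a', 'b', 'c', 'd'), value = c(150, 220, 45, 160))

# Hypothetical helper: join on region, then compute (new - old) / old * 100
percent_change_sketch = function(df_old, df_new) {
    df = merge(df_old, df_new, by = 'region', suffixes = c('_old', '_new'))
    df$value = (df$value_new - df$value_old) / df$value_old * 100
    df[, c('region', 'value')]
}

df_change = percent_change_sketch(old, new)
df_change$value # 50 10 -10 100

# Split the values into four equal-sized groups (quartiles)
breaks = quantile(df_change$value, probs = seq(0, 1, 0.25))
df_change$bin = cut(df_change$value, breaks = breaks, include.lowest = TRUE)
```

With four bins, each state lands in one quartile of the percent change, which is why the discussion above can talk about the "top quartile" and the "lowest quartile".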

Change in County Median Household Income

For the county map, let’s zoom in on the five counties that make up New York City. Using a continuous scale will help us see the magnitude of the difference between each county. (Recall that when working with counties you need to use FIPS codes.)

df_2009 = get_county_demographics(2009)
df_2009$value = df_2009$median_hh_income
df_2023 = get_county_demographics(2023)
df_2023$value = df_2023$median_hh_income

df_final = calculate_percent_change(df_2009, df_2023)

nyc_counties = c(36005, 36047, 36061, 36081, 36085)
county_choropleth(df_final,
    title = 'Change in Median Household Income: 2009 to 2023\nCounties in New York City',
    legend = 'Percent Change',
    num_colors = 1,
    county_zoom = nyc_counties)

Unfortunately, county_choropleth doesn’t print the names of the counties, so you need to have some familiarity with New York City to understand this map. The darkest county is Brooklyn (Kings County), which had an increase in median household income of 83% in just 14 years!
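
Since the map has no labels, a small lookup table can make the result easier to read. The sketch below builds each FIPS code in nyc_counties from its state part (36 for New York) and its county part, then pairs it with the county and borough name. The data frame and its column names are hypothetical and not part of choroplethr.

```r
# County FIPS = state FIPS (36 for New York) * 1000 + three-digit county code
county_codes = c(5, 47, 61, 81, 85)
nyc_counties = 36 * 1000 + county_codes

# Hypothetical lookup table for reading the map
nyc_lookup = data.frame(
    region  = nyc_counties,
    county  = c('Bronx', 'Kings', 'New York', 'Queens', 'Richmond'),
    borough = c('The Bronx', 'Brooklyn', 'Manhattan', 'Queens', 'Staten Island'))

nyc_lookup$borough[nyc_lookup$region == 36047] # "Brooklyn"
```

Merging a table like this with df_final on the region column would put names next to the percent changes.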

Change in Tract Median Household Income

I wondered if all of Brooklyn had a large increase in income, or just parts of it. We can answer that question by analyzing the census tracts in Brooklyn:

df_2009 = get_tract_demographics("new york", 36047, 2009)
df_2009$value = df_2009$median_hh_income
df_2023 = get_tract_demographics("new york", 36047, 2023)
df_2023$value = df_2023$median_hh_income

df_final = calculate_percent_change(df_2009, df_2023)

tract_choropleth(df_final,
    "new york", 
    county_zoom=36047,
    legend="Percent Change",
    title="Change in Median Household Income: 2009-2023\nCensus Tracts in Brooklyn, NY")

It appears that the northern half of Brooklyn experienced a much larger increase in median household income than the southern half. This might be due to its close proximity to Manhattan (northern Brooklyn is connected to Manhattan by three bridges and a tunnel). Also note the range of the scale: it goes from -40.5% to 362.9%!

Conclusion

Thank you again to Zhaochen He for becoming the new maintainer of choroplethr. As I mentioned here, choroplethr was downloaded over 1,500 times in the month before it was archived. Archiving it impacted a significant number of users. Thanks to Zhaochen’s efforts, the package is now back on CRAN.

If you would like to learn more about choroplethr, then please visit this page. There you will find a recording of a webinar I gave about choroplethr at the CDC, as well as links to three (now free) courses I created on how to use choroplethr.

While I have disabled comments on my blog, I welcome hearing from readers. Use this form to contact me.
