census 2020: some quick visuals of demographic change
Intro
A quick/simple post: using the PL94171 package to access Census 2020 counts. Census data won’t be API-accessible until ~late September; these data are already available, however, for redistricting purposes, albeit in a funky format. The PL94171 package can be used to download and re-structure these files for super convenient use. Below, some quick visualizations of demographic change in the state of New Mexico.
Get data
Using several of the functions included in the PL94171 package, we build a simple wrapper below to extract census counts for multiple census years.
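library(tidyverse)

# x = census year, state = state abbreviation, level = summary level code
get_pl_data <- function(x, state, level){
  # download and read the PL 94-171 files for one state and year
  y <- PL94171::pl_read(PL94171::pl_url(state, x))
  # keep rows for a single summary level (e.g. '050' = county)
  pl <- PL94171::pl_subset(y, sumlev = level)
  # return the standard population columns with readable names
  PL94171::pl_select_standard(pl, clean_names = TRUE)
}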
Then we apply the function, collecting county-level data for the state of New Mexico for the last three decennial censuses.
yrs <- c(2000, 2010, 2020)
x0 <- lapply(yrs, get_pl_data, state = 'NM', level = '050')
names(x0) <- yrs
x1 <- data.table::rbindlist(x0, idcol = 'year')
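To see what the naming step buys us, here is a minimal sketch on made-up data (the `toy_*` objects and their counts are invented for illustration, not Census figures). Because `rbindlist()` takes the `idcol` values from the list names, `year` comes back as character, which is why it gets passed through `as.integer()` before plotting.

```r
# Toy stand-ins for two per-year tables; values are invented, not Census counts
toy_list <- list(
  data.frame(GEOID = c('35001', '35003'), pop = c(100, 50)),
  data.frame(GEOID = c('35001', '35003'), pop = c(120, 45))
)
names(toy_list) <- c(2000, 2010)   # names are stored as character

# idcol pulls the list names into a new 'year' column
toy_stacked <- data.table::rbindlist(toy_list, idcol = 'year')

class(toy_stacked$year)            # character, not numeric
year_int <- as.integer(toy_stacked$year)
```

Without the `names(x0) <- yrs` step, the id column would just hold the list positions (1, 2, 3) rather than the census years.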
County level change by sub-group: 2000-2020
Below we re-structure the data some, and download New Mexico county geometries via the tigris package.
x2 <- x1 %>%
  select(year, GEOID, pop:pop_two) %>%
  gather(key = 'race', value = 'count', -year:-pop) %>%
  mutate(prop = count/pop)

head(x2) %>% knitr::kable()
year | GEOID | pop | race | count | prop |
---|---|---|---|---|---|
2000 | 35001 | 556678 | pop_hisp | 233565 | 0.4195693 |
2000 | 35003 | 3543 | pop_hisp | 679 | 0.1916455 |
2000 | 35005 | 61382 | pop_hisp | 26904 | 0.4383044 |
2000 | 35006 | 25595 | pop_hisp | 8555 | 0.3342450 |
2000 | 35007 | 14189 | pop_hisp | 6739 | 0.4749454 |
2000 | 35009 | 45044 | pop_hisp | 13685 | 0.3038140 |
nm <- tigris::counties(state = 'NM', cb = TRUE)
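For readers on a recent tidyr, where `gather()` is superseded, the same wide-to-long step can be sketched with `pivot_longer()`. The example below is only an illustration: `toy_wide` and its counts are invented, and it uses three sub-group columns rather than the full set in the real data.

```r
library(dplyr)
library(tidyr)

# Invented counts for two counties in one year (not Census figures)
toy_wide <- tibble(
  year = c('2020', '2020'),
  GEOID = c('35001', '35003'),
  pop = c(200, 80),
  pop_hisp = c(100, 20),
  pop_white = c(60, 50),
  pop_two = c(40, 10)
)

# One row per county x sub-group, then each sub-group's share of total pop
toy_long <- toy_wide %>%
  pivot_longer(cols = pop_hisp:pop_two,
               names_to = 'race',
               values_to = 'count') %>%
  mutate(prop = count / pop)

toy_long
```

The values match what `gather()` would give; only the row order differs (`pivot_longer()` keeps each county's rows together, while `gather()` stacks one sub-group at a time).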
A homemade palette:
gen_pal <- c('#ead8c3', '#eeeeee', '#437193', '#7da6aa', '#b0bcc1', '#55752f', '#dae2ba', '#eb7f6b' )
Population change by sub-group for a sample of counties in New Mexico:
set.seed(999)

x2 %>%
  inner_join(nm %>% sample_n(12)) %>%
  ggplot(aes(x = as.integer(year), y = prop, fill = race)) +
  geom_area(stat = "identity", color = 'white', alpha = 0.85) +
  geom_hline(yintercept = .5, linetype = 4, color = 'white') +
  scale_fill_manual(values = gen_pal) +
  scale_x_continuous(breaks = seq(2000, 2020, 10)) +
  xlab('') + ylab('') +
  theme_minimal() +
  theme(legend.position = "bottom",
        legend.title = element_blank()) +
  facet_wrap(~NAME, ncol = 4) +
  ggtitle('Population composition by county: 2000-2020')
A quick map
x3 <- x2 %>%
  filter(race == 'pop_hisp' & year != 2000) %>%
  select(year, GEOID, prop) %>%
  spread(year, prop) %>%
  mutate(delta = round(`2020` - `2010`, 3))

head(x3) %>% knitr::kable()
GEOID | 2010 | 2020 | delta |
---|---|---|---|
35001 | 0.4785787 | 0.4870780 | 0.008 |
35003 | 0.1903356 | 0.1682034 | -0.022 |
35005 | 0.5200548 | 0.5693479 | 0.049 |
35006 | 0.3650461 | 0.3181216 | -0.047 |
35007 | 0.4718545 | 0.4745297 | 0.003 |
35009 | 0.3951753 | 0.4500516 | 0.055 |
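A note on reading `delta`: it is a difference between two proportions, so 0.049 means a gain of 4.9 percentage points in the Hispanic share, not 4.9% growth in the Hispanic population. A small sketch with invented proportions makes this explicit; `toy_props` and `delta_pp` are hypothetical names, and `pivot_wider()` stands in for `spread()`.

```r
library(dplyr)
library(tidyr)

# Invented Hispanic shares for two counties (not Census figures)
toy_props <- tibble(
  year = c('2010', '2020', '2010', '2020'),
  GEOID = c('35001', '35001', '35003', '35003'),
  prop = c(0.40, 0.45, 0.30, 0.25)
)

toy_delta <- toy_props %>%
  pivot_wider(names_from = year, values_from = prop) %>%
  mutate(delta = round(`2020` - `2010`, 3),  # difference in proportions
         delta_pp = delta * 100)             # same change in percentage points

toy_delta
```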
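The map below details the change in the Hispanic share of each county's population in New Mexico from 2010 to 2020, expressed as a difference in proportions (so 0.049 is a gain of 4.9 percentage points).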
nm %>%
  left_join(x3) %>%
  ggplot() +
  geom_sf(aes(fill = delta),
          color = 'darkgray',
          alpha = .85,
          lwd = .2) +
  # symmetric limits keep zero change at the midpoint of the palette
  scale_fill_distiller(palette = "BrBG",
                       limits = max(abs(x3$delta)) * c(-1, 1)) +
  theme_minimal() +
  theme(axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.y = element_blank(),
        legend.title = element_blank(),
        legend.position = 'bottom',
        complete = FALSE) +
  labs(title = 'Percent Change Hispanic Population in New Mexico',
       subtitle = 'by County: 2010 to 2020')
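The `max(abs(...)) * c(-1, 1)` expression in the fill scale is what centers the diverging palette on zero. A tiny sketch with hypothetical deltas (the `toy_d` values are invented) shows the limits it produces; the `na.rm = TRUE` is an addition here, assuming some counties might lack a value.

```r
# Hypothetical deltas, skewed toward the positive side
toy_d <- c(-0.02, 0.01, 0.055, NA)

# Largest absolute change, mirrored around zero
lims <- max(abs(toy_d), na.rm = TRUE) * c(-1, 1)
lims
```

With the default limits the scale would instead run from the smallest to the largest observed delta, and the neutral middle color would no longer correspond to "no change".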