Visualizing international demographic indicators with idbr and Plotly

January 28, 2016

(This article was first published on Kyle Walker, and kindly contributed to R-bloggers)

It’s been a while since I last posted here – but I’ve been working on a new R package that I’m quite excited about, and I thought this would be the right place to post. My new package, idbr, is an R interface to the United States Census Bureau’s International Data Base API. The IDB includes a host of international demographic indicators – including historical data and projections to 2050. I use IDB data all of the time for my teaching – and idbr makes the process of getting the data much easier! While this product uses the Census Bureau Data API, it is not endorsed or certified by the Census Bureau.

Install from CRAN with the following command:

install.packages('idbr')

To get started, you’ll need a Census API key; if you don’t already have one, you can obtain it from http://api.census.gov/data/key_signup.html. Before downloading data, set your API key for your idbr session with the idb_api_key function:

library(idbr)

idb_api_key('Your API key goes here')

There are two main functions in the idbr package. idb1 fetches population data by one-year age bands for one or more countries in one or more years, optionally by age ranges or by sex. idb5 has a lot more indicators available, including total fertility rate, life expectancy, and population by five-year age ranges. To view all of the variables available in the idb5 function, call idb_variables(). Groups of similar variables, termed concepts, can be fetched at once; see the available concepts with idb_concepts().

Below are some examples of how to use the package with Plotly’s fantastic new R client. Browse the code for idbr at https://github.com/walkerke/idbr, and please let me know if you have any feedback!

Please note: the embedded visualizations are crashing my browser on my mobile device, so I’ve set it so they won’t show up on phones. To view the graphics, take a look at the post on your computer.

World map of infant mortality rates by country for 2016:

library(plotly)
library(viridis)

df <- idb5(country = 'all', year = 2016, variable = 'IMR', country_name = TRUE)

plot_ly(df, z = IMR, text = NAME, locations = NAME, locationmode = 'country names',
        type = 'choropleth', colors = viridis(99), hoverinfo = 'text+z') %>%
  layout(title = 'Infant mortality rate (per 1000 live births), 2016', 
         geo = list(projection = list(type = 'robinson')))

Projected population pyramid of China in 2050:

library(dplyr)

male <- idb1('CH', 2050, sex = 'male') %>%
  mutate(POP = POP * -1,
         SEX = 'Male')

female <- idb1('CH', 2050, sex = 'female') %>%
  mutate(SEX = 'Female')

china <- rbind(male, female) %>%
  mutate(abs_pop = abs(POP))

plot_ly(china, x = POP, y = AGE, color = SEX, type = 'bar', orientation = 'h',
        hoverinfo = 'y+text+name', text = abs_pop, colors = c('red', 'gold')) %>%
  layout(bargap = 0.1, barmode = 'overlay',
         xaxis = list(tickmode = 'array', tickvals = c(-10000000, -5000000, 0, 5000000, 10000000),
                      ticktext = c('10M', '5M', '0', '5M', '10M')), 
         title = 'Projected population structure of China, 2050')
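
The pyramid relies on one trick: male counts are negated so their bars extend left of zero, while abs_pop keeps the unsigned value for the hover text and the tick labels hide the minus signs. Here is a minimal base-R sketch of that preparation, using invented numbers rather than IDB data (the data frame toy and all of its values are hypothetical):

```r
# Hypothetical toy data in the same general shape as the pyramid data (values invented)
toy <- data.frame(
  AGE = rep(0:2, times = 2),
  SEX = rep(c("Male", "Female"), each = 3),
  POP = c(500, 480, 460, 470, 455, 440)
)

# Negate male counts so their bars extend left of zero
toy$POP <- ifelse(toy$SEX == "Male", -toy$POP, toy$POP)

# Keep the unsigned value for display in hover text
toy$abs_pop <- abs(toy$POP)

# Axis ticks: signed positions, unsigned labels
tickvals <- c(-500, -250, 0, 250, 500)
ticktext <- as.character(abs(tickvals))

print(toy)
```

The same idea is what lets the chart above show '10M' on both sides of zero even though the male values are stored as negative numbers.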

Life expectancy at birth by sex, compared in a Shiny app:

First, get the data with idbr, then save it to disk (so you don’t have to call the API each time):

# setup.R

library(idbr)

idb_api_key("Your API key here")

full <- idb5(country = 'all', year = 2016, variables = c('E0_F', 'E0_M'), country_name = TRUE)

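# Note: save() writes R's workspace format regardless of the .rds extension,
# so this file must be read back with load(), not readRDS()
save(full, file = 'idbr_data.rds')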
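
If you would rather not keep a separate setup script, the "fetch once, reuse afterwards" idea can be folded into a small helper. The sketch below is only an illustration: cached_fetch and fake_fetch are hypothetical names, fake_fetch stands in for a real API call such as idb5(), and it uses saveRDS()/readRDS() rather than the save()/load() pair used in this post.

```r
# Hypothetical helper: return cached data if present, otherwise fetch and cache it
cached_fetch <- function(path, fetch_fun) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- fetch_fun()
  saveRDS(result, path)
  result
}

# Stand-in for an API call such as idb5(); counts how often it runs (values invented)
calls <- 0
fake_fetch <- function() {
  calls <<- calls + 1
  data.frame(NAME = c("A", "B"), E0_F = c(80, 75), E0_M = c(76, 70))
}

cache_file <- tempfile(fileext = ".rds")
first  <- cached_fetch(cache_file, fake_fetch)  # runs fake_fetch and writes the file
second <- cached_fetch(cache_file, fake_fetch)  # reads the file; fake_fetch is not called
```

One difference worth knowing: load() restores an object under the name it was saved with, while readRDS() returns the object so you can assign it to any name you like.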

Next, build the app:

# app.R

library(shiny)
library(countrycode)
library(plotly)
library(dplyr)
library(tidyr)

load('idbr_data.rds')

ui <- fluidPage(

  titlePanel("Life expectancy at birth by country and sex"),

  sidebarLayout(
    sidebarPanel(
      selectInput("region",
                  "Select region to plot:",
                  choices = sort(unique(countrycode_data$region)),
                  selected = 'Northern Africa')
    ),

    mainPanel(
      plotlyOutput("dumbbell")
    )
  )
)

server <- function(input, output) {

  regiondf <- reactive({

    reg <- countrycode_data[countrycode_data$region == input$region, ]

    fips <- reg$fips104

    sub <- full %>%
      filter(FIPS %in% fips) %>%
      rename(Male = E0_M, Female = E0_F) %>%
      arrange(Female)

    sub

  })

  output$dumbbell <- renderPlotly({

    regiondf() %>%
      gather(Sex, value, Male, Female) %>%
      plot_ly(x = value, y = NAME, mode = 'lines',
              group = NAME, showlegend = FALSE, line = list(color = 'gray'),
              hovermode = FALSE, hoverinfo = 'none') %>%
      add_trace(x = value, y = NAME, color = Sex, mode = 'markers',
              colors = c('darkred', 'navy'), marker = list(size = 10)) %>%
      layout(xaxis = list(title = 'Life expectancy at birth'),
             yaxis = list(title = ''),
             margin = list(l = 120))

  })

}

shinyApp(ui = ui, server = server)
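
The dumbbell chart depends on the wide-to-long reshape inside renderPlotly: each country ends up with two rows, one per sex, so the gray line trace can connect the pair while the marker trace is colored by sex. Here is a base-R sketch of that reshape on invented values (the country names and numbers are hypothetical; the app itself uses tidyr's gather()):

```r
# Hypothetical wide data: one row per country, one column per sex (values invented)
wide <- data.frame(
  NAME   = c("Country A", "Country B"),
  Male   = c(70.1, 74.3),
  Female = c(75.4, 79.0)
)

# Sort by female life expectancy, as the app does with arrange(Female)
wide <- wide[order(wide$Female), ]

# Stack the two sex columns into long form: one row per country and sex
long <- data.frame(
  NAME  = rep(wide$NAME, times = 2),
  Sex   = rep(c("Male", "Female"), each = nrow(wide)),
  value = c(wide$Male, wide$Female)
)

print(long)
```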
