d3/R Streamgraph on White House Petition Data

April 14, 2016
By

(This article was first published on data_steve, and kindly contributed to R-bloggers)

After last week’s post with the d3-htmlwidget chordplot showing the network of petition categories, in this post I decided to show the time trend aspect of how the petitions came in. I’m using @hrbrmstr’s htmlwidgets wrapper of a d3 streamgraph to help visualize the macro- and tag-specific trends.

Compared to the chordplot, the time trend allowed by the streamgraph gives a whole new angle of data. Something substantial occurred around Nov 2012 to increase petition monthly submissions by 15+ fold. I’ll explore this more in my next post.

If you like what you see, below try it yourself with the code below or follow along with @hrbrmstr’s vignette.

Set up
# load
pacman::p_load(dplyr, magrittr, ggplot2, tidyr)
pacman::p_load_gh("hrbrmstr/streamgraph")
home <- "~/Desktop/whitehouse_gov/"
load(file.path(home,"data/petitions.RData"))


# get data if you need
# data_url <- "https://github.com/yoni/r_we_the_people/blob/master/data/petitions.RData?raw=true"
# curl::curl_download(data_url, destfile=file.path(home,"data/petitions.RData"))
# this behaved much better than download.file() for me


# reshape
trend <- petitions %>%
    select(id, contains(".name"), created_POSIXct) %>%
    mutate(issues1.name= ifelse(is.na(issues1.name), issues.name, issues1.name)) %>%
    mutate_each(funs(ifelse(is.na(.), "", .)), starts_with("issues"))  %>%
    select(-issues.name) %>%
    tidyr::gather(order, tag, -c(id,created_POSIXct )) %>%
    filter(tag!="") %>%
    transmute(
      dates = as.Date(paste(lubridate::year(created_POSIXct)
                                       ,lubridate::month(created_POSIXct)
                                       ,"01", sep="-")) ,
               order = tidyr::extract_numeric(order),
               tag = tag) %>%
    group_by(dates, tag) %>%
    summarize(n=n()) %>% ungroup 
Streamgraph: Basics

Here’s the basic set up for the streamgraph. As last time, I needed more colors
than the basic RColorBrewer sets offered, so I expanded the set with gradients of it.

# set number of colors needed
colorCount <- length(unique(tt2$tag))

# makes function to create palette
getPalette <- grDevices::colorRampPalette(RColorBrewer::brewer.pal(9, "Set3"))

# streams = counts
trend  %>%
   streamgraph("tag", "n", "dates") %>%
   sg_axis_x(2, "month/year", "%m / %Y") %>%
   sg_fill_manual(values=getPalette(colorCount))

This view of the streamgraph shows a dramatic expansion of petitions being submitted
around Nov 2012. Something substantial must have happened to get the word out about it.

Streamgraph: Options

The y-axis on the graph above is hard to interpret, try the offset="zero" option.
You can also experiment with interpolate= to get linear or step trends. @hrbrmstr’s vignette has several examples.

# use other streamgraph options
trend  %>%
   streamgraph("tag", "n", "dates"
       , offset="zero"
           # ,interpolate="linear") %>%
   sg_axis_x(2, "month/year", "%m / %Y") %>%
   sg_fill_manual(values=getPalette(colorCount))

Sometimes I think its actually more interesting to see the streams showing the relative
percentage each category had of the share of petitions coming in then. Though the raw magnitude
is obscured, sometimes it allows for other trends to become more salient. The masthead image is
what you get with this example.

# streams as percentages instead of counts
trend %>%
  group_by(dates) %>%
  mutate(prop = n/sum(n)) %>%
  new_streamgraph("tag", "prop", "dates") %>%
  sg_axis_x(2, "month/year", "%m / %Y")%>%
  sg_fill_manual(values=getPalette(colorCount))  

In the last post with the d3-htmlwidget chordplot I ended by showing you how to save
the html via the RStudio Viewer options. Below I show how to use htmlwidgets::saveWidget to
maintain programmatic control over the process.

# save your streamgraph to html file
trend  %>%
  streamgraph("tag", "n", "dates") %>%
  sg_axis_x(2, "month/year", "%m / %Y") %>%
  sg_fill_manual(values=getPalette(colorCount))%>%
  htmlwidgets::saveWidget(file.path(home, "test.html"))

I personally found that the default text setting on the streamgraph was too small once
its opened in the browser. I tried for some considerable time to programmatically change the font size
natively from the streamgraph package. It appears not to support this in its params suboptions, though I have seen
such elsewhere.

So I created a (ugly) hack on the html. It goes into the css script for streamgraph and edits the font tag. You must have already saved the widget to html to use this function. If anyone has a better way, tweet me [@data_steve][twitter.com/@data_steve]

   change_font <- function(file, font){
    css_ptn = 'svg%20text%20%7B%0Afont%3A%20\d+?px'
    txt <- readLines(file)
    prac <- txt[grep(css_ptn, txt)]
    txt[grep(css_ptn, txt)] <- gsub(css_ptn, paste0('svg%20text%20%7B%0Afont%3A%20',font,'px'), txt[grep(css_ptn, txt)])
    writeLines(txt, file)
  }


  change_font(file.path(home, "test.html"), 15)

To leave a comment for the author, please follow the link and comment on their blog: data_steve.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)