## Ryan Peek on using xts and ggplot for time-series data

February 6, 2013
At Davis R Users’ Group today, Ryan Peek gave a presentation on how he takes data from his field instruments and visualizes it in R. Here are his notes. The original *.Rmd file and data can be found here. Short how-to on using xts and ggplot for time series data: xts is a very helpful package...

## Set operations on more than two sets in R

February 6, 2013
Problem: Set operations are commonplace in R, and the enabling functions in the base package are: `intersect(x, y)`, `union(x, y)`, `setdiff(x, y)`, `setequal(x, y)`. That said, you'll note that each ONLY takes two arguments - i.e. set X and set Y - ...
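
One common way around the two-argument limit is base R's `Reduce()`, which folds a two-argument function over a list. The excerpt is cut off before the post's own solution, so this is only a minimal sketch of one possible approach; the `sets` list and its contents are made up for illustration.

```r
# Hypothetical example sets (not from the original post)
sets <- list(
  a = c(1, 2, 3, 4, 5),
  b = c(3, 4, 5, 6),
  c = c(4, 5, 6, 7)
)

# Reduce() applies a two-argument function left to right, so this is
# equivalent to intersect(intersect(a, b), c)
common <- Reduce(intersect, sets)

# The same pattern works for union()
all_members <- Reduce(union, sets)

print(common)
print(all_members)
```

Note that `setdiff()` is not symmetric, so folding it over a list gives "elements of the first set that appear in none of the others", which may or may not be what is wanted.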

## Modelling memory and news trajectories

February 6, 2013
Modelling memory: In the text below I present two models I've made to quantify and visualise the diverging trajectories of memory and news events, and conclude that linear regression may be used to test which model best describes the story. First, though, I contextualise this with an illustration from the...

## Make building R packages easier with devtools

February 6, 2013
If you're writing any significant amount of R code, you might want to start thinking about bundling it up into packages. An R package combines functions, data, documentation and unit tests, and is a convenient and reliable system to manage and version collections of R content that could otherwise become unwieldy. And if you want to share your code...

February 5, 2013
Some little birds had already been whispering about it, but I didn't want to jinx it and told myself I would wait with an announcement until the booksellers have (at least) placeholder pages. And as I learned from Duncan Murdoch via email earlier toda...

## Collinearity and stepwise VIF selection

February 5, 2013
Collinearity, or excessive correlation among explanatory variables, can complicate or prevent the identification of an optimal set of explanatory variables for a statistical model. For example, forward or backward selection of variables could produce inconsistent results, variance partitioning analyses may be unable to identify unique sources of variation, or parameter estimates may include substantial amounts
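
As a rough illustration of how collinearity is usually quantified, the variance inflation factor for predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing that predictor on all the others. The sketch below uses base R only; the simulated data and the helper name `vif` are assumptions for illustration, not the post's own code.

```r
# Simulated data for illustration only (not from the original post)
set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)  # nearly a copy of x1, so strongly collinear
x3 <- rnorm(n)                 # generated independently of x1 and x2
predictors <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing
# predictor j on all remaining predictors
vif <- function(df) {
  sapply(names(df), function(v) {
    fit <- lm(reformulate(setdiff(names(df), v), response = v), data = df)
    1 / (1 - summary(fit)$r.squared)
  })
}

vifs <- vif(predictors)
print(round(vifs, 2))
```

Here `x1` and `x2` should show large values while `x3` stays close to 1, which is the kind of signal a VIF-based selection procedure would act on.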

## Visualizing networks in R: arc diagrams and hive plots

February 4, 2013
Arc diagrams are an alternate way of representing two-dimensional graphs. Rather than scattering the nodes across the page connected by straight edges, you can instead arrange the nodes along a one-dimensional axis, and replace the straight edges with arcs between the nodes. While an arc diagram might not give as good a sense of the connections between the nodes...
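
The layout described above (nodes on a single axis, edges drawn as arcs) can be sketched in a few lines of base R graphics. This is a minimal sketch under stated assumptions: the toy graph and the helper `arc_points` are hypothetical, and the original post may well use a dedicated package instead.

```r
# Hypothetical toy graph: five nodes placed at x = 1..5, edges as index pairs
nodes <- c("A", "B", "C", "D", "E")
edges <- rbind(c(1, 2), c(1, 4), c(2, 5), c(3, 4))

# Points on the upper semicircle joining x = from and x = to on the axis
arc_points <- function(from, to, n = 50) {
  centre <- (from + to) / 2
  radius <- abs(to - from) / 2
  theta <- seq(0, pi, length.out = n)
  data.frame(x = centre + radius * cos(theta), y = radius * sin(theta))
}

# Empty canvas with equal aspect ratio so the arcs look circular
plot(NULL, xlim = c(0.5, 5.5), ylim = c(-0.5, 2), asp = 1,
     axes = FALSE, xlab = "", ylab = "")

# One arc per edge, then the nodes and their labels along the axis
for (i in seq_len(nrow(edges))) {
  lines(arc_points(edges[i, 1], edges[i, 2]))
}
points(seq_along(nodes), rep(0, length(nodes)), pch = 19)
text(seq_along(nodes), rep(0, length(nodes)), labels = nodes, pos = 1)
```

Because each arc's height is half the distance between its endpoints, long-range connections stand out visually, and the node ordering along the axis determines how cluttered the diagram looks.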

## analyze the survey of income and program participation (sipp) with r

February 4, 2013
if the census bureau's budget was gutted and only one complex sample survey survived, pray it's the survey of income and program participation (sipp).  it's giant.  it's rich with variables.  it's monthly.  it follows households over three, four, now five year panels.  the congressional budget office uses it for their health insurance simulation.  analysts read that sipp has...

## Data Visualization for Education

February 3, 2013
Recently I was invited to give a talk to two cohorts of Strategic Data Project fellows. I was asked to speak about using data visualization to help inform decision-making of policy makers. At the same time, the group had a lot of variation in their int...