Our article (by Yu-Sung, Jennifer, Masanao, and myself, and based also on work with Kobi, Grazia, and Peter Messeri) will be appearing in the Journal of Statistical Software, in a special issue on missing-data imputation. Here's the abstract: ...

Something that's very important to be able to do in data analysis and visualization is to filter out cases. Let's say you want to do identical analyses of two different groups, or of one group and then a subset of it. R can do this a little differently; instead of merely filtering out cases you can create an object...
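
One way to read the idea above is to store each filtered set of cases as its own object and run the same analysis on each. The sketch below assumes a made-up data frame `dat` with hypothetical columns `group` and `score`; it is not taken from the original post.

```r
# Hypothetical data: two groups, one outcome (names are illustrative only)
dat <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  score = c(10, 12, 7, 9, 11)
)

# Keep each filtered set of cases as a separate object
group_a      <- subset(dat, group == "a")
group_b_high <- subset(dat, group == "b" & score > 8)  # a subset of group b

# The identical analysis can then be applied to each object
mean_a      <- mean(group_a$score)
mean_b_high <- mean(group_b_high$score)
```

Because the original `dat` is left untouched, either subset can be redefined later without re-reading the data.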

Development of the R package tikzDevice has been underway for about a month now. This package allows for the output of R graphics as TikZ commands. Charlie Sharpsteen and I have gotten it into an alpha stage. There is no real documentation yet, but there are plenty of comments in the code. We have an R-Forge

Before we delve into slightly more advanced plotting commands I want to talk a little about linear models, specifically, linear regression. In R this is very, very simple. For instance, in our 'states' data frame, we might want to look at median household income as a predictor of state education expenditures. The command lm calculates this for us. We'll...
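
A minimal sketch of such a regression, assuming a small stand-in `states` data frame with hypothetical columns `income` and `education` (the real column names and values are not shown in the excerpt):

```r
# Stand-in data; the values and column names are assumptions for illustration
states <- data.frame(
  income    = c(40, 45, 50, 55, 60),      # median household income
  education = c(1.0, 1.2, 1.3, 1.6, 1.7)  # education expenditures
)

# Regress education expenditures on median household income
fit <- lm(education ~ income, data = states)

# Coefficient table, R-squared, and residual summary
summary(fit)

# Just the intercept and slope
coef(fit)
```

The formula `education ~ income` puts the outcome on the left and the predictor on the right, which is the convention `lm` expects.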

Using Microsoft Excel I'm collecting aggregate data, by state, of various social, political, and economic indicators. I export them into a tab-delimited file called 'states.txt' (pretty clever, I know.) I've got data on education expenditures, firearm deaths per capita, median household income, etc. I'd like to do some analysis and graphing of these data to see if there are...
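
Reading a tab-delimited export into R might look like the sketch below. So that it runs without the real 'states.txt', it first writes a tiny stand-in file to a temporary path; the column names and values are assumptions, not the author's data.

```r
# Write a small stand-in file (hypothetical columns and values)
path <- tempfile(fileext = ".txt")
writeLines(c("state\tincome\teducation",
             "A\t40000\t900",
             "B\t52000\t1200"), path)

# read.delim() assumes tab-separated fields and a header row by default
states <- read.delim(path, header = TRUE, stringsAsFactors = FALSE)

# Check that the columns were parsed with the expected types
str(states)
```

With the real file in the working directory, the call would simply be `read.delim("states.txt")`.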

In R, the way to delete a component in a list object is different from matrix and vector objects.
For a vector, to delete an element:
vec <- c(1, 2, 3)
vec <- vec
For a matrix, to delete a row or a column:
mat <- matrix(c(1,2,3,4), 2, 2)
mat2 <- mat # delete a row
mat3 <- mat # delete a column
For a list,...
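
As a hedged illustration of the three cases, the sketch below uses negative indexing for vectors and matrices and assignment of `NULL` for a list component. Which positions to drop, and the list `lst` itself, are assumptions chosen for the example rather than the original post's values.

```r
# Vector: a negative index drops that position
vec <- c(1, 2, 3)
vec <- vec[-2]                      # drop the second element

# Matrix: a negative row or column index drops that row or column
mat  <- matrix(c(1, 2, 3, 4), 2, 2)
mat2 <- mat[-1, , drop = FALSE]     # delete the first row
mat3 <- mat[, -2, drop = FALSE]     # delete the second column

# List: assigning NULL removes the component (hypothetical list)
lst <- list(a = 1, b = "x", c = TRUE)
lst$b <- NULL
```

`drop = FALSE` keeps the result a matrix even when only one row or column remains.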

I've decided that this summer I will finally break down and force myself to learn a little bit about using R. I currently use Stata, a very good program, but the idea of R is appealing since it's free under the GNU license. It has a large and active us...

For readers at Vanderbilt: At yesterday's R course I found out that Theresa Scott in the Biostatistics department holds a weekly R clinic and encourages new R users who want to learn more to bring any questions about R, or even your own code and data. The R clinic is held weekly on Thursday from 2:00-3:00 in MCN....

It's useful to look at scatterplots even when the "y" variable is dichotomous. For example, this can help determine whether categorization or linear assumptions would be more plausible. However, an unmodified scatterplot is less than helpful, since all of the "y" values are either 0 or 1, and are hard to separate visually. Some jittering...
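
A minimal sketch of a jittered scatterplot for a 0/1 outcome, using simulated data; the variable names, the jitter amount, and the simulation itself are assumptions, and the post's own approach may differ.

```r
set.seed(1)

# Simulated data: a continuous x and a dichotomous y (illustrative only)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(x))

# Add a small amount of vertical noise so overlapping 0s and 1s separate
y_jit <- jitter(y, amount = 0.05)

plot(x, y_jit,
     xlab = "x", ylab = "y (jittered)",
     yaxt = "n")                    # suppress the default y axis
axis(2, at = c(0, 1))               # label only the two real values
```

Only the plotted positions are jittered; the original `y` stays 0/1 for any model fitting.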

Writing from the previously mentioned intro to R course at the Kennedy Center. If you couldn't make it you can download all the course materials from Theresa Scott's website, under the "Current Teaching Material" heading. Here is a direct link to the PDF for the overview materials that we're going over today, along with the R code...

I had been wondering what impact my friending 200 people from my Gmail address book had, so I scraped the dates from the notification emails. The plot shows notifications of friend requests from other people to me in black and confirmations of my requests to other people in red. That sudden and sharp increase at...

I finally have time to try parallel computing in R using snowfall/snow, thanks to this article in the first issue of The R Journal, which replaces R News. I didn’t try it before because I didn’t have a good toy example, and it seemed like a steep learning curve (I only guessed what parallel computing was).
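
For a toy example of the idea, the sketch below uses base R's `parallel` package rather than snowfall, so that it runs without extra installs; `parallel` includes a snow-style cluster interface. The worker count and the squaring task are assumptions for illustration.

```r
library(parallel)

# Start a small local cluster (2 workers is an arbitrary choice)
cl <- makeCluster(2)

# Apply a function to each input, with the calls spread across the workers
squares <- parLapply(cl, 1:4, function(i) i^2)

# Always shut the workers down when finished
stopCluster(cl)

unlist(squares)
```

The pattern is the same as `lapply`, except that the work is sent to separate R processes, so each task should be self-contained.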

What do you use for network analysis? I found the Wikipedia list of network software entirely overwhelming. I wanted to test out some of the introductory tools, but avoid the trap of sinking my time into a dead-end software project. (Remember learning Minitab in freshman statistics? How often do you use Minitab today for anything

bugsparallel is a Metrum Institute project to run BUGS (via R2WinBUGS) in parallel - MCMC is an application where parallel runs can be used very efficiently. Here is the code for one example using bugsparallel. Some useful links: Rosenthal, Parallel c...

Introduction: In Part 1 of this tutorial we introduced the fechell library by extracting all itemized contributions from individuals made to the Obama For America campaign in 2007 and 2008. In Part 2 of the tutorial we will summarize that data set by importing it into a MySQL database and aggregating contributions by week and

Pew Research has found that 79% of Americans believe in The Second Coming of Jesus. What worries me more is not that 4 out of 5 Americans believe in The Second Coming, but that 1 out of 5 believes it will happen in their lifetime. It seems inevitable that such a belief will grossly warp

