PDF tutorial from R course (Introduction to R)

June 23, 2009
By

Writing from the previously mentioned intro to R course at the Kennedy Center. If you couldn't make it you can download all the course materials from Theresa Scott's website, under the "Current Teaching Material" heading. Here is a direct link to the PDF for the overview materials that we're going over today, along with the R code...

Read more »

I had been wondering what impact my friending 200 people from my…

June 22, 2009
By

I had been wondering what impact my friending 200 people from my Gmail address book had, so I scraped the dates from the notification emails. The plot shows notifications of friend requests from other people to me in black and confirmations of my requests to other people in red. That sudden and sharp increase at...

Read more »

Who’s Tweets Do I Read… Magic R Code Says…

June 22, 2009
By

So one glance at my user logs shows the truth: no one gives a rat’s rump that I just quit my job; you just love you some Twitter R code. And I’m nothing but an attention whore, so come get some! So in my last ‘Twitter with R’ post I gave you some code I’d written

Read more »

Parallel computing in R: snowfall/snow

June 20, 2009
By

I finally have time to try parallel computing in R using snowfall/snow, thanks to this article in the first issue of The R Journal, which replaces R News. I didn’t try it before because I didn’t have a good toy example, and it seemed like a steep learning curve (I only guessed what parallel computing was).

Read more »

Network Analysis Software: focus on F/OSS

June 20, 2009
By

What do you use for network analysis? I found the Wikipedia list of network software entirely overwhelming. I wanted to test out some of the introductory tools, but avoid the trap of sinking my time into a dead-end software project. (Remember learning Minitab in freshman statistics? How often do you use Minitab today for anything

Read more »

Analysis of Iran absentee votes

June 20, 2009
By

On http://www.presstv.com/detail.aspx?id=98206&sectionid=351020101 the official Iranian election results outside of Iran are posted. Here is a bit of exploration of the results. The graph shows the number of votes for Ahmadinejad (x-axis) vs. the nu...

Read more »

R: Function to create tables in LaTeX or LyX to display regression model results

June 19, 2009
By

Most people using LaTeX feel that creating tables is no fun. Some days ago I stumbled across a neat function written by Paul Johnson that produces LaTeX code as well as LaTeX code that can be used within LyX. The output can be used for regression models and looks like output from the Stata outreg

Read more »

Iran Election analyzed with R

June 19, 2009
By

Here you can find a very interesting post showing R's strengths in 'real-time statistics'. I'd like to use the occasion to thank David Smith for hosting the best, imho, blog on R! Follow him on Twitter: @revodavid.

Read more »

bugsparallel

June 18, 2009
By

bugsparallel is a Metrum Institute project to run BUGS (via R2WinBUGS) in parallel. MCMC is an application where parallel runs can be used very efficiently. Here is the code for one example using bugsparallel. Some useful links: Rosenthal, Parallel c...

Read more »

open-source campaign finance analysis with R and MySQL

June 18, 2009
By

Introduction: In Part 1 of this tutorial we introduced the fechell library by extracting all itemized contributions from individuals made to the Obama For America campaign in 2007 and 2008. In Part 2 of the tutorial we will summarize that data set by importing it into a MySQL database and aggregating contributions by week and

Read more »

The Second Coming

June 18, 2009
By

Pew Research has found that 79% of Americans believe in The Second Coming of Jesus. What worries me more is not that 4 out of 5 Americans believe in The Second Coming, but that 1 out of 5 believes it will happen in their lifetime. It seems inevitable that such a belief will grossly warp

Read more »

Influence.ME: don’t specify the intercept

June 18, 2009
By

Just recently, I was contacted by a researcher who wanted to use influence.ME to obtain model estimates while iteratively deleting some of the data. In his case, observations were nested within an area, but there were very unequal numbers of ...

Read more »

Hierarchical Clustering in R

June 16, 2009
By

Hierarchical clustering is a technique for grouping samples/data points into categories and subcategories based on a similarity measure. Being the powerful statistical package it is, R has several routines for doing hierarchical clustering. The basic command for doing HC is hclust(d, method = "complete", members = NULL). Nearly all clustering approaches use a concept of distance. Data points
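
As a rough illustration of the hclust() call quoted in this excerpt, here is a minimal sketch, not the original post's code. The toy matrix x, its point labels, and the choice of k = 2 are assumptions made up for the example; dist(), hclust(), and cutree() are standard functions from R's stats package.

```r
# Toy data (illustrative values only): six 2-D points in two obvious groups
x <- rbind(
  c(0.0, 0.0), c(0.1, 0.2), c(0.2, 0.1),
  c(5.0, 5.0), c(5.1, 5.2), c(5.2, 4.9)
)
rownames(x) <- paste0("p", 1:6)

d  <- dist(x, method = "euclidean")   # pairwise distances between rows
hc <- hclust(d, method = "complete")  # complete-linkage hierarchical clustering
groups <- cutree(hc, k = 2)           # cut the tree into two clusters
print(groups)

# plot(hc) would draw the dendrogram
```

Swapping method = "complete" for "average" or "single" changes how the distance between clusters is measured while leaving the rest of the call unchanged.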

Read more »

Not Just Normal… Gaussian

June 16, 2009
By

Dave, over at The Revolutions Blog, posted about the big ‘ol list of graphs created with R that are over at Wikimedia Commons. As I was scrolling through the list I recognized the standard normal distribution from the Wikipedia article on the same topic. Below is the fairly simple source code with lots of comments. Here’s

Read more »

NYT: In Simulation Work, the Demand Is Real

June 16, 2009
By

The New York Times published this interesting article on how the ability to design and perform computer simulations is a highly marketable skill for careers across many disciplines. In methodology development we use simulation nearly every day. We've developed our own specialized genetic data simulation software, genomeSIMLA, that's freely available here by request for PC, Mac, and Linux. But if...

Read more »

One outlier and you’re out: Influential data and racial prejudice

June 16, 2009
By

While preparing a presentation on analyzing influential data in mixed effects models, I came across an article in which important claims on racial prejudice were refuted. An important aspect of the criticism of existing work is that in ...

Read more »

R tips: Determine if function is called from specific package

June 16, 2009
By

I like the "multicore" library for a particular task. I can easily write a combination of if(require("multicore",...)) that means that my function will automatically use the parallel mclapply() instead of lapply() where it is available. Which is grand 99% of the time, except when my function is called from mclapply() (or one of the lower level functions)...
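
The "use mclapply() where available, otherwise lapply()" pattern described in this excerpt might look roughly like the sketch below. It is not the author's code: it assumes the parallel package that ships with current R (which absorbed multicore) as a stand-in, the helper name pick_lapply and the mc.cores = 2L setting are hypothetical, and it does not address the nested-call problem the post goes on to discuss.

```r
# Choose a parallel or serial apply function depending on what is available.
# Assumption: `parallel` (bundled with R) stands in for the older `multicore`.
pick_lapply <- function() {
  can_fork <- requireNamespace("parallel", quietly = TRUE) &&
    .Platform$OS.type == "unix"        # forking is not available on Windows
  if (can_fork) {
    function(X, FUN, ...) parallel::mclapply(X, FUN, ..., mc.cores = 2L)
  } else {
    lapply                             # serial fallback with the same interface
  }
}

my_lapply <- pick_lapply()
squares <- my_lapply(1:5, function(i) i^2)
print(unlist(squares))
```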

Read more »

Who wants school vouchers? Rich whites and poor nonwhites

June 15, 2009
By

As part of our Red State, Blue State research, we developed statistical tools for estimating public opinion among subsets of the population. Recently Yu-Sung Su, Yair Ghitza, and I applied these methods to see where school vouchers are more or...

Read more »

Geography and Data

June 15, 2009
By

The Economist recently ran a fascinating article about the emergence of geographical databases and their uses for presenting and analyzing data. All this has made it much easier to create maps that explain, at a glance, something that might otherwise require pages of tables or verbiage. “A percentage or a table is still abstract for people,” says Dan Newman of MAPLight.org,...

Read more »

Side by side analyses in Stata, SPSS, SAS, and R

June 15, 2009
By

I've linked to UCLA's stat computing resources once before, in a previous post about choosing the right analysis for the questions you're asking and the data types you have. Here's another section of the same website that has code to run an identical analysis in all of these statistical packages, with examples to walk through (as they note...

Read more »

Replacing 0 with NA – an evergreen from the list

June 15, 2009
By

This thread from the R-help list describes an evergreen tip that, at least once, will prove useful in R practice.
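
The excerpt does not reproduce the thread's answer, so the following is only one common way to do the replacement named in the title, using logical indexing. The vector x and the data frame df are made-up illustrative data.

```r
# Vector: every element equal to 0 becomes NA
x <- c(3, 0, 7, 0, 1)
x[x == 0] <- NA
print(x)

# Data frame: the same idiom works with a logical matrix index
df <- data.frame(a = c(1, 0, 2), b = c(0, 5, 0))
df[df == 0] <- NA
print(df)
```

This only makes sense when 0 is being used as a missing-value code; genuine zeros would be lost as well.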

Read more »

Example 7.2: Simulate data from a logistic regression

June 13, 2009
By

It might be useful to be able to simulate data from a logistic regression (section 4.1.1). Our process is to generate the linear predictor, then apply the inverse link, and finally draw from a distribution with this parameter. This approach is useful in that it can easily be applied to other generalized linear models. In this...
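
The three steps named in this excerpt (linear predictor, inverse link, draw from the distribution) can be sketched as follows. This is not the book's code: the sample size, the coefficients beta0 and beta1, and the seed are hypothetical values chosen for illustration.

```r
set.seed(42)            # arbitrary seed, for reproducibility only
n     <- 1000           # hypothetical sample size
beta0 <- -1             # hypothetical intercept
beta1 <- 0.5            # hypothetical slope

x   <- rnorm(n)                       # a single standard-normal covariate
eta <- beta0 + beta1 * x              # step 1: linear predictor
p   <- plogis(eta)                    # step 2: inverse logit, 1 / (1 + exp(-eta))
y   <- rbinom(n, size = 1, prob = p)  # step 3: Bernoulli draw with probability p

# Fitting the model back should give estimates near beta0 and beta1
fit <- glm(y ~ x, family = binomial)
print(coef(fit))
```

For another generalized linear model, only steps 2 and 3 change: for example exp(eta) and rpois() for a Poisson regression with a log link.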

Read more »

Example 7.1: Create a Fibonacci sequence

June 12, 2009
By

The Fibonacci numbers have many mathematical relationships and have been discovered repeatedly in nature. They are constructed as the sum of the previous two values, initialized with the values 1 and 1. A PDF of this example is available here. SAS: In SAS, we use the lag function (section 1.4.17,...
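
The construction described in this excerpt (each value is the sum of the previous two, starting from 1 and 1) can be sketched in R as below. This is an illustrative version rather than the book's code, and the function name fib is an assumption.

```r
# Return the first n Fibonacci numbers, starting 1, 1
fib <- function(n) {
  stopifnot(n >= 1)
  f <- numeric(n)
  f[seq_len(min(n, 2))] <- 1          # initialize with the values 1 and 1
  if (n >= 3) {
    for (i in 3:n) {
      f[i] <- f[i - 1] + f[i - 2]     # sum of the previous two values
    }
  }
  f
}

print(fib(10))
# 1 1 2 3 5 8 13 21 34 55
```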

Read more »

R tips: Installing Rmpi on Fedora Linux

June 12, 2009
By

Somebody on the R-help mailing list asked how to get Rmpi working on his Fedora Linux machine so he could do high-performance computing on a cluster of machines (or a single multicore machine) using the R statistical computing and analysis platform. Since it is unusually painful to get working, I might as well copy the instructions...

Read more »

Simulation of Burning Fire in R

June 11, 2009
By

inlin Yan posted a cool (hot?) simulation of burning fire with R in the COS forum yesterday, which was indeed a warm welcome. I’m not sure whether our forum members will be scared by the “fire” under the title “Welcome to COS Forum”. The fire was mainly created by the function image() with

Read more »