Return

July 6, 2009
By

I'm back from vacation, so I'll post something substantive later today.

Read more »

Using R to Create Misc. Patterns [smocking]

July 4, 2009
By

Premise: My wife asked me to come up with some graph paper for creating smocking patterns. After a couple of minutes playing around with base R graphics functions, it occurred to me that several functions in the sp package...

Read more »

Summarizing Grouped Data in R

July 3, 2009
By

A colleague of mine recently asked about computing basic summary statistics from grouped data in R. These are a couple of examples that I suggested. Additional documentation for the plyr package can be found here.
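As a rough illustration of per-group summaries, here is a minimal base-R sketch. The data frame `d` and its columns are hypothetical, and base functions are used here instead of plyr so that the block runs without extra packages.

```r
# Hypothetical data: two groups with a numeric measurement
d <- data.frame(group = rep(c("a", "b"), each = 5),
                value = c(1:5, 11:15))

# Mean and standard deviation per group with aggregate()
means <- aggregate(value ~ group, data = d, FUN = mean)
sds   <- aggregate(value ~ group, data = d, FUN = sd)

# tapply() returns the same means as a named vector
group_means <- tapply(d$value, d$group, mean)

print(means)
print(sds)
```

With plyr loaded, something along the lines of `ddply(d, .(group), summarise, mean = mean(value))` should give a comparable table.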

Read more »

Remove files with a specific pattern in R

July 3, 2009
By

A quick, basic tip that can come in handy when you need to rapidly remove files from a directory:

    junk <- dir(path="your_path", pattern="your_pattern", full.names=TRUE) # ?dir
    file.remove(junk) # ?file.remove

Clearly, for advanced needs, you can use system() and al...
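A self-contained sketch of the same idea, run against a throwaway directory so nothing real is deleted. The directory and file names are made up for the example. Note that `pattern` is a regular expression rather than a shell glob, and that `full.names = TRUE` is needed unless the files sit in the working directory.

```r
# Work in a throwaway directory so nothing real is deleted
work_dir <- file.path(tempdir(), "cleanup_demo")
dir.create(work_dir, showWarnings = FALSE)
file.create(file.path(work_dir, c("a.tmp", "b.tmp", "keep.csv")))

# 'pattern' is a regular expression, not a shell glob
junk <- dir(path = work_dir, pattern = "\\.tmp$", full.names = TRUE)
removed <- file.remove(junk)   # logical vector, one entry per file

remaining <- dir(work_dir)
print(remaining)
```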

Read more »

OECD Statistics

July 2, 2009
By

I am a sucker for good-quality data. I wrote about data.gov, the US Government data site, before, and now I have found OECD Statistics, which has some 300 data sets, many of which seem to be readily accessible (though some may require a subscription).

Read more »

Example 7.4: A prettier jittered scatterplot

July 2, 2009
By

The plot in section 7.3 has some problems. At the very least, the jittered values ought to be between 0 and 1, so the smoothed lines fit better with them. Once again we use the data generated in section 7.2 as an example. For both SAS and R, we use conditioning (section 1.11.2) to make the jitter happen...
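One way to keep jittered binary values inside [0, 1] is to condition the direction of the noise on the outcome. The sketch below uses simulated placeholder data and a noise width of 0.1, both of which are assumptions and not the values from the original example.

```r
set.seed(42)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(x))   # simulated binary outcome (placeholder data)

# Conditional jitter: push 0s upward and 1s downward so nothing leaves [0, 1]
noise <- runif(n, 0, 0.1)
y_jit <- ifelse(y == 1, y - noise, y + noise)

plot(x, y_jit, pch = 16, col = "grey40", ylim = c(0, 1),
     xlab = "x", ylab = "y (jittered)")
lines(lowess(x, y), lwd = 2)   # smoother is fit to the unjittered 0/1 values
```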

Read more »

R String processing

July 2, 2009
By

Here's a little vignette of data munging using the regular expression facilities of R (aka the R-project for statistical computing). Let's say I have a vector of strings that looks like this:

    > coords
    "chromosome+:157470-158370" "chromosome+:1583...
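A minimal sketch of how strings in this shape could be split into fields with base R regular expressions. The first string comes from the excerpt; the second is a hypothetical stand-in because the original is truncated, and the field names are assumptions.

```r
coords <- c("chromosome+:157470-158370",   # from the excerpt
            "chromosome-:200-300")         # hypothetical second entry

# Groups: name, strand (+ or -), start, end
pattern <- "^(.*)([+-]):([0-9]+)-([0-9]+)$"
parts <- regmatches(coords, regexec(pattern, coords))

# Each element holds the full match followed by the four groups
m <- do.call(rbind, parts)
parsed <- data.frame(name = m[, 2], strand = m[, 3],
                     start = as.integer(m[, 4]), end = as.integer(m[, 5]),
                     stringsAsFactors = FALSE)
print(parsed)
```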

Read more »

Getting help with R

July 2, 2009
By

There's no doubt that by now you've noticed that we're big fans of R around here. It's completely free, has superior graphing capabilities, and with all the extension packages available there isn't much it can't do. One of the problems with R, especially for new users, is that it isn't obvious how to find help when you...
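For orientation, these are some of the standard help tools that ship with R. This is a short sketch and not necessarily the list the original post goes on to give.

```r
# Find functions by name fragment (searches attached packages)
hits <- apropos("^read\\.")
print(head(hits))

# The remaining calls open help pages or search results, so they are
# shown as comments rather than executed here:
# ?read.table                 # help page for a known function
# help.search("regression")   # search installed documentation; same as ??regression
# example(mean)               # run the examples from a help page
# RSiteSearch("jitter")       # search online (needs a browser and network)
```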

Read more »

PDQ 5.0 Test Suite or … How I Spent My Weekend

June 29, 2009
By

I was planning to blog about the amazing time I had at Velocity 2009 last week, when this landed in my mailbox (edited for space and privacy):

    Subject: Seeking help with PDQ-R ...
    Date: Thu, 25 Jun 2009 15:51:21 -0500

    My name is James and I've be...

Read more »

August Guerrilla Class: Using R for Performance Analysis

June 29, 2009
By

Registrations are still open for the Guerrilla Data Analysis Techniques (GDAT) class being held August 10-14, 2009. The focus will be on using R and the new release of PDQ-R for performance analysis and capacity planning. All Guerrilla classes are hel...

Read more »

Time series data

June 28, 2009
By

    gdp
    attach(gdp)
    as.Date(date)
    plot(gdp~date, data=gdp, pch=16, xlab="", ylab="GDP (2000 dollars)")
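A runnable sketch of the same kind of plot, under the assumption that the data frame has a character `date` column and a numeric `gdp` column. The values below are placeholders, not real GDP figures. One thing worth noting is that `as.Date()` returns a new vector, so its result has to be assigned back for the conversion to take effect.

```r
# Placeholder values for illustration only, not real GDP figures
gdp <- data.frame(date = c("2008-01-01", "2008-04-01", "2008-07-01", "2008-10-01"),
                  gdp  = c(100, 101, 102, 103),
                  stringsAsFactors = FALSE)

# as.Date() returns a new vector; assign it back or the column stays character
gdp$date <- as.Date(gdp$date)

plot(gdp ~ date, data = gdp, pch = 16, xlab = "", ylab = "GDP (2000 dollars)")
```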

Read more »

RSI(2) Evaluation

June 28, 2009
By

Despite my best efforts, it's been a month since the last post of this series. The first post replicated this simple RSI(2) strategy from the MarketSci Blog using R. The second post showed how to replicate the strategy that scales in/out of RSI(2). ...

Read more »

Conservatism of Congressional delegation and %Bush vote

June 27, 2009
By

Busy day today, so I'll just post this:

    plot(bush04 ~ cons_hr, type = "n", xlab="Mean ACU rating", ylab="2004 Bush vote", xlim=c(0,100), ylim=c(0,100), cex.lab=1.25, cex.axis=0.75, col.axis = "#777777", col.lab = "#777777")
    text(y=bush04, x=cons_hr, labels=statei...
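The pattern in that snippet, an empty plot followed by `text()`, can be tried out with a small stand-in data set. Everything in `demo` below is a hypothetical placeholder and does not reflect real ratings or vote shares.

```r
# Hypothetical placeholder values; these are not real ratings or vote shares
demo <- data.frame(label = c("AA", "BB", "CC"),
                   rating = c(20, 50, 80),
                   vote = c(40, 50, 60),
                   stringsAsFactors = FALSE)

# type = "n" sets up axes and limits but draws no points
plot(vote ~ rating, data = demo, type = "n",
     xlim = c(0, 100), ylim = c(0, 100),
     xlab = "Mean rating", ylab = "Vote share (%)")

# text() then places each label at its (x, y) position
text(x = demo$rating, y = demo$vote, labels = demo$label, cex = 0.8)
```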

Read more »

R 2.9.1, CRANberries outage, and missing Java support

June 27, 2009
By

Just a short note that version 2.9.1 of R was released yesterday. And a corresponding Debian release went out as usual on the same day. One sour note: as the Java toolchain is currently broken, I had to disable compile-time support for Java. Just run R CMD javareconf once installed if you need it. Speaking of broken, I had...

Read more »

Multiple Imputation with Diagnostics (mi) in R: Opening Windows into the Black Box

June 26, 2009
By

Our article (by Yu-Sung, Jennifer, Masanao, and myself, and based also on work with Kobi, Grazia, and Peter Messeri) will be appearing in the Journal of Statistical Software, in a special issue on missing-data imputation. Here's the abstract: ...

Read more »

Filtering cases

June 26, 2009
By

Something that's very important to be able to do in data analysis and visualization is to filter out cases. Let's say you want to do identical analyses of two different groups, or of one group and then a subset of it. R can do this a little differently; instead of merely filtering out cases you can create an object...
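As a small sketch of storing a filtered set of cases as its own object, here is an example on the built-in `mtcars` data, which is substituted because the post's own data is not shown.

```r
# Built-in mtcars data; 'am' is 0 (automatic) or 1 (manual)
manual <- subset(mtcars, am == 1)
automatic <- mtcars[mtcars$am == 0, ]   # equivalent bracket form

# The same analysis can now be run on each stored subset
mean_mpg <- c(manual = mean(manual$mpg), automatic = mean(automatic$mpg))
print(round(mean_mpg, 2))
```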

Read more »

Development of tikzDevice is underway

June 26, 2009
By

Development of the R package tikzDevice has been underway for about a month now. This package allows for the output of R graphics as TikZ commands. Charlie Sharpsteen and I have gotten it into an alpha stage. There is no real documentation, but there are plenty of comments in the code. We have an R-forge...

Read more »

Set the significant digits for each column in a xtable for fancy Sweave output

June 26, 2009
By

This tip may be useful in situations where you need to set the number of digits printed for different columns of a matrix/data.frame that will be output as a LaTeX table. For example:

    #install.packages("xtable")
    #library(xtable)
    tmp <- m...
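A hedged sketch of per-column digits with xtable, assuming the package is installed. The data frame here is hypothetical since the original example is truncated. The `digits` vector carries one entry per column plus a leading entry for the row names.

```r
# Hypothetical two-column table
tmp <- data.frame(count = c(10, 20), ratio = c(0.12345, 0.6789))

# One entry per column plus a leading one for the row names
digits_per_col <- c(0, 0, 3)

# Guarded so the block still runs where xtable is not installed
if (requireNamespace("xtable", quietly = TRUE)) {
  tab <- xtable::xtable(tmp, digits = digits_per_col)
  print(tab)   # emits LaTeX
}
```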

Read more »

A bit about linear models

June 26, 2009
By

Before we delve into slightly more advanced plotting commands I want to talk a little about linear models, specifically, linear regression. In R this is very, very simple. For instance, in our 'states' data frame, we might want to look at median household income as a predictor of state education expenditures. The command lm calculates this for us. We'll...
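The 'states' data frame from the post is not available here, so this minimal sketch fits the same kind of one-predictor regression on the built-in `mtcars` data instead.

```r
# Regress fuel economy on weight; the formula reads "mpg as a function of wt"
fit <- lm(mpg ~ wt, data = mtcars)

slope <- coef(fit)[["wt"]]          # estimated change in mpg per unit of wt
print(summary(fit)$coefficients)    # estimates, standard errors, t and p values
```

For the post's data, the call would presumably take the form `lm(expenditure ~ income, data = states)`, with whatever column names that data frame actually uses.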

Read more »

Reading data, and a graph

June 25, 2009
By

Using Microsoft Excel I'm collecting aggregate data, by state, of various social, political, and economic indicators. I export them into a tab-delimited file called 'states.txt' (pretty clever, I know.) I've got data on education expenditures, firearm deaths per capita, median household income, etc. I'd like to do some analysis and graphing of these data to see if there are...
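Reading a tab-delimited export like this can be sketched as follows. The file is written to a temporary path first so the block is self-contained, and the column names and values are placeholders, not the author's data.

```r
# Write a tiny tab-delimited file to a temporary path (placeholder values)
path <- file.path(tempdir(), "states.txt")
writeLines(c("state\tincome\texpend",
             "AA\t50000\t9000",
             "BB\t42000\t7500"), path)

# read.delim() assumes a header row and tab separators by default
states <- read.delim(path, stringsAsFactors = FALSE)
str(states)
```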

Read more »

Delete a List Component in R

June 24, 2009
By

In R, the way to delete a component of a list object is different from the way to delete from matrix and vector objects. For a vector, to delete an element:

    vec <- c(1, 2, 3)
    vec <- vec[-1]

For a matrix, to delete a row or a column:

    mat <- matrix(c(1,2,3,4), 2, 2)
    mat2 <- mat[-1, ] # delete a row
    mat3 <- mat[, -1] # delete a column

For a list,...
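The excerpt is cut off before the list case, so the following is a sketch of the usual idiom and may not be the post's own code: assigning `NULL` to a component removes it. The list contents are hypothetical.

```r
lst <- list(a = 1, b = "two", c = 3:5)

lst$b <- NULL        # assigning NULL removes the component
lst[["c"]] <- NULL   # same, using [[ ]]

# Negative indexing also works on lists, as it does on vectors
lst2 <- list(1, "two", 3:5)[-2]

print(names(lst))
print(length(lst2))
```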

Read more »