Pairs of categorical data
The grades data.frame holds two columns of letter grades, giving pairs of categorical data, like so:
    prev grade
1     B+    B+
2     A-    A-
3     B+    A-
...
122    B     B
This type...
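Paired categorical data like the rows above is usually summarized as a two-way contingency table. Here is a minimal sketch, assuming a small stand-in data frame built from only the four rows visible in the excerpt (the name `grades_sample` is made up for illustration, and the real data set has 122 rows):

```r
# Hypothetical stand-in for grades: only the four rows visible in the excerpt
grades_sample <- data.frame(
  prev  = c("B+", "A-", "B+", "B"),
  grade = c("B+", "A-", "A-", "B")
)

# Two-way contingency table: rows are previous grades, columns are current grades
tab <- table(prev = grades_sample$prev, grade = grades_sample$grade)
print(tab)

# Proportions within each row (each previous grade sums to 1)
prop <- prop.table(tab, margin = 1)
print(prop)
```

With the full data frame, the same `table()` call should work unchanged on its two columns.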

I'm working my way through Using R for Introductory Statistics, by John Verzani, a free version of which is available as SimpleR.
Chapter 1
...covers basics of R such as arithmetic, loading libraries and reading data. We also get an introduction to v...

R is a weird beast. Through its ancestor, the S language, it claims a proud heritage reaching back to Bell Labs in the 1970s, when S was created as an interactive wrapper around a set of statistical and numerical subroutines. As a programming language,...

A common data-munging operation is to compute cross tabulations of measurements by categories. SQL Server and Excel have a nice feature called pivot tables for this purpose. Here we'll figure out how to do pivot operations in R. Let's imagine an experim...
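One way a pivot like this might look in base R is sketched below. The data frame `measurements` and its columns (`condition`, `time`, `value`) are invented for illustration and are not the experiment from the post:

```r
# Hypothetical long-format data: one row per measurement
measurements <- data.frame(
  condition = c("control", "control", "control", "treated", "treated"),
  time      = c("t1", "t1", "t2", "t1", "t2"),
  value     = c(1, 3, 2, 5, 4)
)

# Pivot: one row per condition, one column per time point, cells hold the mean
pivot <- tapply(
  measurements$value,
  list(condition = measurements$condition, time = measurements$time),
  mean
)
print(pivot)

# Count of measurements per cell, the plain cross tabulation
counts <- xtabs(~ condition + time, data = measurements)
print(counts)
```

`tapply` returns a matrix, which is handy for a quick look; swapping `mean` for `sum` or `length` gives the other usual pivot-table summaries.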

The R statistical computing environment is awesome, but weird. How to do database operations in R is a common source of questions. The other day I was looking for an equivalent to SQL group by for R data frames. You need this to compute summary statist...
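As a rough sketch of a group-by in base R, `aggregate` with a formula computes a summary statistic per group. The data frame `samples` and its columns are assumptions made up for this example:

```r
# Hypothetical data: a grouping column and a numeric column
samples <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 10, 20, 30)
)

# Roughly: SELECT "group", AVG(value) FROM samples GROUP BY "group"
group_means <- aggregate(value ~ group, data = samples, FUN = mean)
print(group_means)
```

The result is itself a data frame with one row per group, so it can be fed straight into further steps.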

Want to join two R data frames on a common key? Here's one way to do a SQL database style join operation in R. We start with a data frame describing probes on a microarray. The key is the probe_id and the rest of the information describes the location on t...
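A minimal sketch of such a join, using base R's `merge`. Only the key name `probe_id` comes from the post; the other columns and all values here are hypothetical:

```r
# Hypothetical probe annotations keyed by probe_id
probes <- data.frame(
  probe_id = c("p1", "p2", "p3"),
  start    = c(100, 200, 300)
)

# Hypothetical measurements for some of the same probes
expression <- data.frame(
  probe_id = c("p2", "p3", "p4"),
  value    = c(0.5, 1.5, 2.5)
)

# Inner join: keeps only probe_ids present in both data frames
inner <- merge(probes, expression, by = "probe_id")

# Left outer join: keeps every probe, with NA where no measurement exists
left <- merge(probes, expression, by = "probe_id", all.x = TRUE)
print(left)
```

Setting `all.y = TRUE` or `all = TRUE` instead gives the right and full outer joins.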

Here's another quick R vignette, in case I pick this up later and need to remind myself where I got stuck. I was trying to use R for a bit of basic sequence analysis, with mixed results. First, install the BSgenome package, which is part of Bioconductor...

The R language is weird - particularly for those coming from a typical programmer's background, which likely includes OO languages in the curly-brace family and relational databases using SQL. A key data structure in R, the data.frame, is used somethin...

NCBI's GEO database of gene expression data is a great resource, but its records are very open ended. This lack of rigidity was perhaps necessary to accommodate the variety of measurement technologies, but makes getting data out a little tricky. But, a...

Here's a little vignette of data munging using the regular expression facilities of R (aka the R-project for statistical computing). Let's say I have a vector of strings that looks like this:
> coords
"chromosome+:157470-158370" "chromosome+:1583...
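One way strings of this shape could be pulled apart is with `regexec` and `regmatches`. This is a sketch, not necessarily the approach the post takes: the first string is the one shown above, the second is a made-up example, and the pattern assumes the layout name, strand, colon, start, dash, end:

```r
# First element is from the excerpt; the second is hypothetical
coords <- c("chromosome+:157470-158370", "chromosome-:1000-2000")

# Capture groups: sequence name, strand (+ or -), start, end
pattern <- "^(.+)([+-]):([0-9]+)-([0-9]+)$"
matches <- regmatches(coords, regexec(pattern, coords))

# Drop the full match (first element) and stack the groups into a matrix
fields <- do.call(rbind, lapply(matches, function(m) m[-1]))

parsed <- data.frame(
  seq_name = fields[, 1],
  strand   = fields[, 2],
  start    = as.integer(fields[, 3]),
  end      = as.integer(fields[, 4])
)
print(parsed)
```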