## R Project and Google Summer of Code: Welcome to our students!

April 26, 2010

A few hours ago, I sent the following to both the R development list and the informal R / GSoC list: Date: Mon, 26 Apr 2010 15:27:29 -0500 To: R Development List CC: gsoc-r Subject: R and the Google Summer of Code 2010 -- Please welcome our new st...

## R is going to have a GUI to ggplot2! (by the end of this year's Google Summer of Code)

April 26, 2010

I was delighted to see the following e-mail post from Dirk Eddelbuettel regarding the google-summer-of-code R google group: * * * Earlier today Google finalised student / mentor pairings and allocations for the Google Summer of Code 2010 (GSoC 2010). The R Project is happy to announce that the following students have been accepted: Colin Rundel, “rgeos – an...

## R Project websites down

April 26, 2010

The main R Project website, www.r-project.org, many of the primary CRAN mirrors (including cran.r-project.org), and r-forge.r-project.org are currently unavailable following a power failure at the master host in Austria. The US CRAN mirror, cran.us.r-project.org, and other mirrors not under the r-project.org domain (including cran.revolution-computing.com) are still accessible. In addition, the following services are not affected: bugs.r-project.org, developer.r-project.org, ess.r-project.org, search.r-project.org,...

## A serial Connection for R

April 26, 2010

***UPDATED: June 3, 2010 – revert name from “tty” to “serial”, R version (2.11.1)*** I’m working on a patch for R (currently 2.11.1) that adds a serial connection feature for POSIX systems (i.e. Linux, Mac OS X, …). The serial connection works like the other connections. For example, the following code opens, writes a single

## Example 7.34: Propensity scores and causal inference from observational studies

April 26, 2010

Propensity scores can be used to help make causal interpretation of observational data more plausible, by adjusting for other factors that may be responsible for differences between groups. Heuristically, we estimate the probability of exposure, rather t...
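As a rough illustration of "estimating the probability of exposure", here is a minimal sketch that fits a logistic regression to simulated data. The variable names (`age`, `exposed`) and the data-generating values are assumptions made up for this example; they do not come from the original post.

```r
# Simulated observational data: exposure depends on a confounder (age).
set.seed(1)
n <- 500
age <- rnorm(n, mean = 50, sd = 10)
exposed <- rbinom(n, 1, plogis(-5 + 0.1 * age))

# Propensity score: estimated probability of exposure given the covariate.
ps_model <- glm(exposed ~ age, family = binomial)
ps <- fitted(ps_model)

summary(ps)
```

The fitted values in `ps` could then be used for matching, stratification, or as a covariate in an outcome model.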

## R : NA vs. NULL

April 25, 2010

The R language has two closely related NULL-like values, NA and NULL ... Both are used to represent missing or undefined values. This has led to much confusion.
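A short sketch of how the two behave differently in base R may help; the vectors `x` and `y` are made up for illustration.

```r
# NA is a placeholder for a missing value and still occupies a slot.
x <- c(1, NA, 3)
length(x)              # 3
is.na(x)               # FALSE TRUE FALSE
mean(x)                # NA
mean(x, na.rm = TRUE)  # 2

# NULL is the empty object; it disappears when combined into a vector.
y <- c(1, NULL, 3)
length(y)              # 2
is.null(NULL)          # TRUE
length(NULL)           # 0
```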

## Summarising data using box and whisker plots

April 25, 2010

A box and whisker plot is a type of graphical display that can be used to summarise a set of data based on the five number summary of this data. The summary statistics used to create a box and whisker plot are the median of the data, the lower and upper quartiles (25% and 75%)
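The five number summary can be computed directly in base R. This is a minimal sketch on simulated data (the sample `x` is an assumption for illustration). Note that `fivenum()` returns Tukey's hinges, which can differ slightly from the quartiles reported by `quantile()`.

```r
set.seed(1)
x <- rnorm(100)

# Minimum, lower hinge, median, upper hinge, maximum.
five <- fivenum(x)
names(five) <- c("min", "lower", "median", "upper", "max")
five

# Draw the corresponding box and whisker plot.
boxplot(x, horizontal = TRUE)
```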

## How to upgrade R on windows – another strategy (and the R code to do it)

April 23, 2010

Update: At the end of the post I added simple step-by-step instructions on how to move to the new system. I STRONGLY suggest using the code only after you have read the entire post. Background: If you haven’t heard by now, R 2.11.0 is out with a bunch of new features. After Andrew Gelman recently lamented the lack...

## Some LaTeX Gems – Part 1: TikZ, Loops and more

April 23, 2010

This logo means that the blog post is about something I have found interesting, but does not apply directly to the exact purpose of this blog. Note: These commands have been tested in pdflatex. I am not sure if they work in other distributions. Over the past couple of months, I have been assisting with editing some papers and also doing...

## Because it’s Friday: Four chords, and the truth

April 23, 2010

This one's for the musicians out there. (By the way, in my purely anecdotal experience, musical aptitude appears to have a higher-than-expected representation amongst stats folks. I however am the exception that proves the rule, as anyone who's suffered through my Rock Band vocals can attest. But I digress.) What do the chords C#minor, A, E and B have...

## R/Finance 2010 … and unicorns

April 23, 2010

At the Information Management blogs, Steve Miller has posted a great roundup of last weekend's R/Finance 2010 conference in Chicago. Here's Steve's overall take: This year's conference was even better than the 2009 inaugural, the in-excess-of-200 participants consumed by more than 20 consecutive high-powered presentations over the fast-paced day and a half. And while I'm a quantitative finance welterweight...

## R 2.11.0 just landed…

April 23, 2010

The new version is here. R version 2.11.0 has been released on 2010-04-22. The source code is first available in this directory, and eventually via all of CRAN. Binaries will arrive in due course (see download instructions above).

## Top 10 Algorithms in Data Mining

April 23, 2010

The authors here invited ACM KDD Innovation Award and IEEE ICDM Research Contributions Award winners to each nominate up to 10 best-known algorithms in data mining, including the algorithm name, justification for nomination, and a representative public...

## Trouble with ESS and Sweave

April 23, 2010

Last time I tried to sweave a document from within Emacs+ESS, I was using an earlier version of ESS (the current version is 5.8), and things seemed to be fine. Today when I tried to sweave a simple document and produce PDF output, I got an error message of...

## Simple Linear Regression

April 23, 2010

One of the most frequently used techniques in statistics is linear regression, where we investigate the potential relationship between a variable of interest (often called the response variable, but there are many other names in use) and a set of one or more variables (known as the independent variables or some other term). Unsurprisingly there
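A minimal sketch of a simple linear regression in R, using simulated data. The true intercept (2) and slope (0.5) are values chosen for this illustration, not taken from the original post.

```r
set.seed(42)
x <- runif(50, min = 0, max = 10)     # independent variable
y <- 2 + 0.5 * x + rnorm(50, sd = 1)  # response with noise

# Fit the model y = intercept + slope * x by least squares.
fit <- lm(y ~ x)
coef(fit)
summary(fit)$r.squared
```

The estimated coefficients should land close to the values used in the simulation, with the gap shrinking as the sample size grows.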

## Fun with the Vasicek Interest Rate Model

April 22, 2010

A common model used in the financial industry for modelling the short rate (think overnight rate, but actually an infinitesimally short amount of time) is the Vasicek model. Although it is unlikely to perfectly fit the yield curve, it has some nice properties that make it a good model to work with. The
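For readers who want to experiment, here is a rough sketch that simulates one path of the Vasicek short rate, dr = a(b - r)dt + sigma dW, with a simple Euler discretisation. The function name `simulate_vasicek` and all parameter values are assumptions chosen for illustration; the original post may use a different approach.

```r
# Euler scheme for dr = a * (b - r) * dt + sigma * dW.
# a: speed of mean reversion, b: long-run mean, sigma: volatility.
simulate_vasicek <- function(r0, a, b, sigma, n_steps, dt) {
  r <- numeric(n_steps + 1)
  r[1] <- r0
  for (i in seq_len(n_steps)) {
    shock <- sigma * sqrt(dt) * rnorm(1)
    r[i + 1] <- r[i] + a * (b - r[i]) * dt + shock
  }
  r
}

set.seed(123)
path <- simulate_vasicek(r0 = 0.03, a = 0.5, b = 0.05, sigma = 0.01,
                         n_steps = 250, dt = 1 / 250)
plot(path, type = "l", xlab = "step", ylab = "short rate")
```

Setting `sigma = 0` gives a deterministic path that decays towards `b`, which is a quick sanity check on the mean-reversion term.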

## The Bernoulli factory

April 22, 2010

A few months ago, Latuszyński, Kosmidis, Papaspiliopoulos and Roberts arXived a paper I should have noticed earlier as its topic is very much related to our paper with Randal Douc on the vanilla Rao-Blackwellisation scheme. It is motivated by the Bernoulli factory problem, which aims at (unbiasedly) estimating f(p) from an iid sequence of Bernoulli
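To give a flavour of the problem, here is a sketch of the classic special case f(p) = 1/2, often attributed to von Neumann: turning a coin with unknown bias p into a fair one. This is not the algorithm from the paper discussed above; the helper names `fair_flip` and `biased` are made up for this example.

```r
# Flip the biased coin twice; if the two results differ, return the first.
# Outcomes (1, 0) and (0, 1) both have probability p * (1 - p), so the
# returned value is 1 with probability exactly 1/2.
fair_flip <- function(flip) {
  repeat {
    a <- flip()
    b <- flip()
    if (a != b) return(a)
  }
}

set.seed(1)
biased <- function() rbinom(1, 1, 0.2)  # p = 0.2, unknown to fair_flip
draws <- replicate(5000, fair_flip(biased))
mean(draws)
```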

## New R User Group in San Diego

April 22, 2010

There's a new local R User Group in San Diego (CA, USA), and they're meeting tonight. If you're in the area, why not RSVP and come along? The topic looks great: Our speaker, Scott Wallihan, will be covering how to expand R's functionality through custom packages. This topic will be covered over two meetings. In our April meeting, we...

## R 2.11.0 released

April 22, 2010

The latest version of R from the R Project, R 2.11.0, is now available in source code form. Binaries for Windows, Mac and Linux will appear in your local CRAN mirror in the next few days. Some new features include:

- Support for rendering bitmap images in graphics devices, via a new function rasterImage()
- The new function vapply is like...
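As a quick illustration of `vapply()`, the sketch below applies a function over a list while declaring the expected type and length of each result through `FUN.VALUE`. The list `xs` is made up for this example.

```r
xs <- list(a = 1:3, b = 4:6, c = 7:9)

# Each call must return a single numeric value.
means <- vapply(xs, mean, FUN.VALUE = numeric(1))
means

# A result of the wrong type raises an error instead of being silently coerced.
bad <- tryCatch(
  vapply(xs, function(v) "oops", FUN.VALUE = numeric(1)),
  error = function(e) "error"
)
bad
```

The up-front type check is the main reason to prefer `vapply()` over `sapply()` in code that must behave predictably.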

## R 2.11.0 is released!

April 22, 2010

The new R 2.11.0 is out! Get it from here. Take a look at these posts for some miscellaneous advice to make the upgrade easier. Also this thread on stackoverflow can be of some value. Feel free to contribute with suggestions about how to upgrade your ...

## R 2.11.0 is released!

April 22, 2010
By

The new R 2.11.0 is out! Get it from here. Take a look at these posts for some miscellaneous advices to make the upgrade easier. Also this thread on stackoverflow can be of some value. Feel free to contribute with suggestions about how to upgrade your ...

## The difference between “letters[c(1,NA)]” and “letters[c(NA,NA)]”

April 22, 2010

In David Smith’s latest blog post (which, in a sense, is a continued response to the latest public attack on R), there was a comment by Barry that caught my eye. Barry wrote: Even I get caught out on R quirks after 20 years of using it. Compare letters[c(1,NA)] and letters[c(NA,NA)] for the most recent thing that made me...
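The two expressions from the title can be compared directly. The difference comes from the type of the index vector: `c(1, NA)` is numeric, while `c(NA, NA)` is logical and gets recycled to the length of `letters`.

```r
a <- letters[c(1, NA)]   # numeric index: two elements, "a" and NA
b <- letters[c(NA, NA)]  # logical index: recycled, one NA per letter

length(a)                # 2
length(b)                # 26

class(c(1, NA))          # "numeric"
class(c(NA, NA))         # "logical"

# Forcing an integer NA restores the two-element result.
letters[c(NA_integer_, NA_integer_)]
```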

## Free Video Courses on R, Structural Equation Modelling, Causal Inference, and Regression from Uni Jena

April 22, 2010

The Department of Methodology and Evaluation Research at Universität Jena has made available a set of free online video courses on data analysis. They cover topics that are particularly relevant to psychology and social science researchers, including ...

## R: more plotting fun, this time with the Poisson

April 21, 2010

Here is the code:

```r
par(bg = "black")
par(mar = c(0, 0, 0, 0))
plot(sort(rpois(10000, 100)) / rpois(10000, 100),
     frame.plot = FALSE, pch = 20, col = "blue")
```

## Automated way to check for PGF version

April 21, 2010

This is one way to check, in an automated way, for the version of PGF that is installed. First create a tex file with the following contents:

```latex
\documentclass{article}
\usepackage{tikz}
\batchmode
\makeatletter
\typeout{PGFVersion=\pgfversion}
\@@end
```

Say you named it test-pgf-version.tex. Then:

```sh
pdflatex test-pgf-version.tex
cat test-pgf-version.log | grep PGFVersion | sed 's/PGFVersion=//'
```

should display the version number. I

## Why use R? Because that’s what the pros use

April 21, 2010

I had the great pleasure of sitting down for a beer with Steve O'Grady (from the open-source analyst group RedMonk), at the MySQL conference last week. It was great to get the perspective of someone who knows the tech industry so well, sees predictive analytics as a hot area, and is taking an active interest in statistics and R...

## Doing Maximum Likelihood Estimation by Hand in R

April 21, 2010

Lately I’ve been writing maximum likelihood estimation code by hand for some economic models that I’m working with. It’s actually a fairly simple task, so I thought that I would write up the basic approach in case there are readers who haven’t built a generic estimation system before. First, let’s start with a toy example
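As one possible toy example (not necessarily the one used in the original post), the sketch below estimates the mean and standard deviation of a normal sample by minimising the negative log-likelihood with `optim()`. The function name `neg_loglik` and the simulated data are assumptions for illustration.

```r
set.seed(1)
x <- rnorm(500, mean = 3, sd = 2)

# Negative log-likelihood of a normal model.
# The standard deviation is parameterised on the log scale so that
# the optimiser can search over unconstrained values.
neg_loglik <- function(par, data) {
  mu <- par[1]
  sigma <- exp(par[2])
  -sum(dnorm(data, mean = mu, sd = sigma, log = TRUE))
}

fit <- optim(par = c(0, 0), fn = neg_loglik, data = x)
estimates <- c(mu = fit$par[1], sigma = exp(fit$par[2]))
estimates
```

For this model the maximum likelihood estimates have a closed form (the sample mean and the standard deviation with divisor n), which makes it easy to check the numerical result.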