Blog Archives

Data manipulations

August 23, 2011

At the last Utah R Users group meeting I gave a presentation on data manipulation in R. Today, through the plyr mailing list, I found two commands I was previously unaware of that definitely deserve a mention: arrange and mutate.
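For readers who have not met the two plyr functions: arrange reorders the rows of a data frame by one or more columns, and mutate adds new columns, with later columns allowed to use ones created earlier in the same call. The sketch below only mimics that behavior in plain Python on a list of dicts. The helper names, the sample data, and the approximate conversion factor are illustrative assumptions, not plyr's API.

```python
def arrange(rows, *keys):
    """Return a new list of rows sorted (ascending) by the given column names."""
    return sorted(rows, key=lambda r: tuple(r[k] for k in keys))


def mutate(rows, **new_cols):
    """Add columns one at a time, so later columns can use earlier new ones."""
    out = []
    for row in rows:
        row = dict(row)  # copy, so the input rows are left untouched
        for name, fn in new_cols.items():  # keyword order is preserved
            row[name] = fn(row)
        out.append(row)
    return out


# Hypothetical sample data, made up for this illustration.
cars = [
    {"model": "a", "cyl": 6, "mpg": 21.0},
    {"model": "b", "cyl": 4, "mpg": 30.0},
    {"model": "c", "cyl": 4, "mpg": 24.0},
]

# Sort by cylinders, then by mpg within each cylinder count.
sorted_cars = arrange(cars, "cyl", "mpg")

# kpl_per_cyl refers to kpl, which is created earlier in the same call.
with_cols = mutate(
    cars,
    kpl=lambda r: r["mpg"] * 0.425,  # rough mpg -> km/L conversion
    kpl_per_cyl=lambda r: r["kpl"] / r["cyl"],
)
```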

Read more »

Using a “pure infographic” to explore differences between information visualization and statistical graphics

August 10, 2011

Our discussion on data visualization continues. On one side are three statisticians: Antony Unwin, Kaiser Fung, and myself. We have been writing about the different goals served by information visualization and statistical graphics. On the other side are graphics experts (sorry for the imprecision, I don’t know exactly what these people do in their day jobs...

Read more »

NppToR 2.6.0 beta 2

July 29, 2011

http://sourceforge.net/projects/npptor/files/npptor%20installer/NppToR-2.6.0.beta2.exe/download I’ve released beta 2 of NppToR 2.6.0. Please take a look and report any problems. This beta improves the installer and the uninstaller and fixes a few bugs that popped up from the transition to Unicode.

Read more »

Infovis vs. statgraphics: A clear example of their different goals

July 29, 2011

I recently came across a data visualization that perfectly demonstrates the difference between the “infovis” and “statgraphics” perspectives. Here’s the image (link from Tyler Cowen): That’s the infovis. The statgraphic version would simply be a dotplot, something like this: (I purposely used the default settings in R with only minor modifications here to demonstrate what...

Read more »

Looking for NppToR beta testers.

July 19, 2011

NppToR 2.6 is coming with improved flexibility and speed. Testers needed before setting as default.

Read more »

R on the cloud

July 9, 2011

Just as scientists should never really have to think much about statistics, I feel that, in an ideal world, statisticians would never have to worry about computing. In the real world, though, we have to spend a lot of time building our own tools. It would be great if we could routinely run R with speed and memory limitations...

Read more »

Blog in motion

July 8, 2011

In the next few days we’ll be changing the format of the blog and moving it to a new server. If you have difficulty posting comments, just wait and post them in a few days when all should be working well. (But if you can post a comment, go for it. All the old entries

Read more »

The virtues of incoherence?

July 8, 2011

Kent Osband writes:

Read more »

Descriptive statistics, causal inference, and story time

July 7, 2011

Dave Backus points me to this review by anthropologist Mike McGovern of two books by economist Paul Collier on the politics of economic development in Africa. My first reaction was that this was interesting but non-statistical so I’d have to either post it on the sister blog or wait until the 30 days of statistics

Read more »

Early stopping and penalized likelihood

July 6, 2011

Maximum likelihood gives the best fit to the training data but in general overfits, yielding overly noisy parameter estimates that don't perform so well when predicting new data. A popular solution to this overfitting problem takes advantage of the iterative nature of most maximum likelihood algorithms by stopping early. In general, an iterative optimization algorithm goes from a...
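As a rough illustration of stopping early, here is a minimal sketch: gradient descent on the negative log-likelihood of a logistic regression, halted once the held-out likelihood stops improving. This is not the post's own method. The simulated data, the learning rate, the `patience` rule, and all function names are assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def neg_log_lik(w, X, y):
    """Average negative log-likelihood of a logistic regression."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(y)


def gradient(w, X, y):
    """Gradient of the average negative log-likelihood."""
    g = [0.0] * len(w)
    for xi, yi in zip(X, y):
        err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
        for j, xj in enumerate(xi):
            g[j] += err * xj
    return [gj / len(y) for gj in g]


def fit_early_stopping(X_tr, y_tr, X_val, y_val,
                       lr=0.5, max_iter=500, patience=10):
    """Keep the iterate with the best held-out likelihood; stop once it
    has not improved for `patience` consecutive iterations."""
    w = [0.0] * len(X_tr[0])
    best_w, best_val, best_iter = list(w), neg_log_lik(w, X_val, y_val), 0
    for it in range(1, max_iter + 1):
        g = gradient(w, X_tr, y_tr)
        w = [wj - lr * gj for wj, gj in zip(w, g)]
        val = neg_log_lik(w, X_val, y_val)
        if val < best_val:
            best_w, best_val, best_iter = list(w), val, it
        elif it - best_iter >= patience:
            break
    return best_w, best_iter, best_val


def simulate(n, d, rng):
    """Simulated data: only the first of d features affects the outcome."""
    X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(2.0 * xi[0]) else 0 for xi in X]
    return X, y


rng = random.Random(0)
X_tr, y_tr = simulate(30, 10, rng)     # small training set, easy to overfit
X_val, y_val = simulate(200, 10, rng)  # held-out data used only for stopping

w_hat, stopped_at, val_loss = fit_early_stopping(X_tr, y_tr, X_val, y_val)
print(f"best iteration: {stopped_at}, held-out loss: {val_loss:.3f}")
```

The returned weights are those from the iteration with the best held-out likelihood rather than the last iterate, so a longer run cannot make the held-out fit worse.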

Read more »