New material on the GGobi book web page
Solutions to the exercises in the back of each chapter are available for instructors by emailing Springer.
The movies on the book web site are currently being updated to include sound.
Shading overlapping area of curves in R
Heuristics for statistics
SIMPLE WAYS TO DETECT AND COMMUNICATE STATISTICAL EFFECTS

Decision Science News is fond of heuristics and the Simonian view that for many problems organisms face, optimization is a fiction and satisficing makes us smart. Statistics is an area in which it is easy to see precision that isn’t there and find “optima” in problems that lack them. It can be refreshing to look at a problem in a simplified form to get a feeling for what is going on before obsessing over insignificant digits.
Andrew Gelman is previewing a few working papers on rules of thumb that make it easy to detect and communicate statistical effects. “Recommended reading,” says Decision Science News, quoting itself.
Mini Talk on Simple Statistical Methods
Splitting a predictor at the upper quarter or third and the lower quarter or third
Scaling regression inputs by dividing by two standard deviations
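As a rough illustration of the last idea, the sketch below centers a numeric predictor and divides it by two standard deviations, so that it ends up on roughly the same scale as a balanced 0/1 predictor (which has a standard deviation of 0.5). The helper name `rescale_2sd` and the simulated data are assumptions made for this example; they are not taken from the working papers.

```r
# Hypothetical helper: center a numeric input and divide by two standard deviations.
rescale_2sd <- function(x) {
  (x - mean(x, na.rm = TRUE)) / (2 * sd(x, na.rm = TRUE))
}

set.seed(1)
n       <- 200
age     <- rnorm(n, mean = 45, sd = 12)      # simulated continuous predictor
treated <- rbinom(n, size = 1, prob = 0.5)   # simulated binary predictor, left as 0/1
y       <- 1 + 0.05 * age + 0.8 * treated + rnorm(n)

fit_raw    <- lm(y ~ age + treated)               # slope is per year of age
fit_scaled <- lm(y ~ rescale_2sd(age) + treated)  # slope is per two SDs of age

sd(rescale_2sd(age))   # the rescaled input has standard deviation 0.5
coef(fit_raw)
coef(fit_scaled)
```

The fitted values are identical in both models; only the unit of the age coefficient changes, which makes it easier to compare with the coefficient of the binary input.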
Photo credit: http://www.flickr.com/photo_zoom.gne?id=327299636

Hack-at-it 2007

The GGobi Hack-at-it 2007 was held just before useR! 2007 in Ames, Iowa.
The main focus this year was again on the data pipeline, and on pipelines in other software.
Several projects were discussed: geometric shapes, high-dimensional games, local neighborhood brushing, and metabolomics data preprocessing.
The first business meeting of the GGobi Foundation was also held, with work on the by-laws and other operating procedures.


ERGMs in R
The developers of statnet, a collection of R packages for fitting Exponential Random Graph Models (ERGMs), have issued a major update. First, the main package is now called ergm. Second, a set of additional packages has been made available. Apart from network, which provides the class system for relational data on which statnet relies, there are a couple of new ones, for example rSonia and dynamicnetwork, which facilitate work with the SONIA visualizer of network dynamics, and many more. Third, and last but not least, (almost) all of them are now available on CRAN.
Another interesting piece of news is the forthcoming special issue of the Journal of Statistical Software, which will be devoted to these new developments. Preliminary versions of the articles are already available on the statnet website. I especially welcome the paper by Carter Butts, which thoroughly explains the functionality of the network package. The package has been available for some time, but scarce online documentation made informed use of it very difficult.
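For readers who want to try the renamed package, here is a minimal sketch of fitting the simplest possible ERGM, one with only an edges term. The choice of the `florentine` example data (the `flomarriage` network that ships with ergm) is my own and is not taken from the announcement; the code does nothing if ergm is not installed.

```r
# install.packages("ergm")   # one-time installation from CRAN

if (requireNamespace("ergm", quietly = TRUE)) {
  library(ergm)                      # also loads the network package
  data(florentine)                   # provides flomarriage and flobusiness
  fit <- ergm(flomarriage ~ edges)   # model with a single edge-count term
  print(summary(fit))
} else {
  message("Package 'ergm' is not installed; skipping the example.")
}
```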
Resources for S4 classes and methods
Learning to program with the new S4 system of classes and methods in R can be quite cumbersome, even though the methods package is very well documented. That is why I collected some of the information and materials that I am aware of on a separate page here. I warmly welcome any suggestions for extending this, for now short, collection!
Where to look for information on programming with the S4 system of classes and methods for R? Here is an open-ended collection of links to resources for learning to program in the S4 system. Suggestions more than welcome!
Books
- John M. Chambers. Programming with Data. Springer, New York, 1998. ISBN 0-387-98503-4. Some more info here http://cm.bell-labs.com/cm/ms/departments/sia/Sbook/
- John M. Chambers. Software for Data Analysis: Programming with R. Springer, New York, 2008. ISBN 0387759352, 9780387759357.
Articles and notes
- Short note by John Chambers “Classes and Methods in the S Language” [PDF] www.omegahat.org/RSMethods/Intro.pdf
- Another note by John Chambers “S4 Classes in 15 pages, more or less” [PDF] www.stat.ucla.edu/~cocteau/stat202a/resources/docs/S4Objects.pdf
Presentation slides
- Slides by Friedrich Leisch from UseR! 2004 conference in Vienna [PDF] http://www.ci.tuwien.ac.at/Conferences/useR-2004/Keynotes/Leisch.pdf
Links
- On R Wiki http://wiki.r-project.org/rwiki/doku.php?id=tips:classes-s4
- http://www.omegahat.org/RSMethods/
Packages
Sometimes it is worthwhile to look at the source code of available packages and learn from that. Here are some packages that can be instructive: stats4, lme4.
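To give a flavor of what the resources above cover, here is a minimal sketch of the basic S4 building blocks: a class with typed slots and a validity check, a new generic with a method, and a `show` method for printing. The class name `Interval` and the generic `width` are made up for this illustration and do not come from any of the listed materials.

```r
library(methods)  # attached by default in interactive R; explicit for Rscript

# A class with two typed slots and a validity check.
setClass("Interval",
         representation(lower = "numeric", upper = "numeric"),
         validity = function(object) {
           if (length(object@lower) != 1 || length(object@upper) != 1)
             return("lower and upper must each have length 1")
           if (object@lower > object@upper)
             return("lower must not exceed upper")
           TRUE
         })

# A new generic function and a method dispatched on the class.
setGeneric("width", function(x, ...) standardGeneric("width"))
setMethod("width", "Interval", function(x, ...) x@upper - x@lower)

# A method for the existing 'show' generic controls how objects print.
setMethod("show", "Interval", function(object) {
  cat("<Interval [", object@lower, ", ", object@upper, "]>\n", sep = "")
})

iv <- new("Interval", lower = 2, upper = 5)
iv
width(iv)

# Invalid objects are rejected at creation time; capture the error message.
bad <- tryCatch(new("Interval", lower = 5, upper = 2),
                error = function(e) conditionMessage(e))
bad
```

Slots are accessed with `@`, and `validObject()` can be called explicitly after modifying a slot by hand, since direct slot assignment does not rerun the validity check.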

GillespieSSA 0.5-1 is released
I just uploaded GillespieSSA 0.5-1 to CRAN. Now it’s just a matter of days before it has propagated itself across all CRAN mirrors. This version consists primarily of revisions I made in response to the reviewer comments on the paper where the package is introduced (submitted to the Journal of Statistical Software). There are some minor changes in the functionality of the ssa.plot() function, but otherwise the changes consist entirely of buglet fixes and improvements to the documentation.
One of the more interesting comments I got addressed the use of a character vector to pass the propensity functions to the wrapper function ssa(). For example, normally one would define the propensity functions for a logistic growth model as
    a <- c("c1*Y1", "c2*Y1*Y2", "c3*Y2")
This is the way it would be defined if the simulation is invoked using the higher level wrapper function ssa(). One can, however, also pass the propensity vector as a function by directly invoking the lower-level method function (ssa.d, ssa.btl, ssa.etl, ssa.otl, …). For the logistic growth model this could be done like so,
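    library(GillespieSSA)

    # Propensity function: returns the birth and death propensities
    a <- function(parms, x) {
      b <- parms[1]
      d <- parms[2]
      K <- parms[3]
      N <- x[1]
      c(b * N, d * N + (b - d) * N^2 / K)
    }

    parms <- c(2, 1, 1000, 500)   # b, d, K (the fourth entry is not used by a())
    x     <- 500
    nu    <- matrix(c(+1, -1), ncol = 2)
    t     <- 0

    for (i in seq(100)) {
      out <- ssa.d(a(parms, x), nu)
      x <- x + out$nu_j           # update the state
      t <- t + out$tau            # advance time by the sampled waiting time
      cat("t:", t, ", x:", x, "\n")
    }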
The obvious advantage of this approach is that the propensity vector is simpler to define and maintain throughout the simulation. It is also likely that the simulation would run faster (or at least be simpler to optimize) without the extra overhead imposed by the higher-level wrapper function. The disadvantage is that setting up the simulation is a tad more involved, since one has to “manually” update the state vector and the time variable and collect the output data (which, by the way, the above routine does not do).
Perhaps the most interesting consequence of directly invoking the lower-level method functions is that one now can vary the parameters of the model during the simulation allowing for temporal environmental heterogeneity, e.g. varying vital rates and carrying capacity over time.
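A rough sketch of that idea, under several assumptions: `direct_step()` is a base-R stand-in written for this example so that it runs without the package (it is not the package's code, although it returns the same kind of list with `tau` and `nu_j`), and the sinusoidal carrying capacity `K_at()` is a made-up environmental driver. Unlike the loop above, this version also collects the output.

```r
# Stand-in for a lower-level direct-method step (assumption, not package code).
direct_step <- function(a, nu) {
  j <- sample(length(a), size = 1, prob = a)   # which reaction fires
  list(tau  = rexp(1, rate = sum(a)),           # waiting time until it fires
       nu_j = nu[, j])                          # resulting state change
}

# Logistic growth propensities: births first, then deaths.
a <- function(parms, x) {
  b <- parms[["b"]]; d <- parms[["d"]]; K <- parms[["K"]]
  N <- x[1]
  c(b * N, d * N + (b - d) * N^2 / K)
}

# Hypothetical environmental driver: carrying capacity oscillates over time.
K_at <- function(t) 1000 + 300 * sin(2 * pi * t / 5)

set.seed(42)
nu      <- matrix(c(+1, -1), ncol = 2)
x       <- 500
t       <- 0
n_steps <- 2000
out     <- matrix(NA_real_, nrow = n_steps, ncol = 3,
                  dimnames = list(NULL, c("t", "N", "K")))
done    <- 0

for (i in seq_len(n_steps)) {
  parms <- c(b = 2, d = 1, K = K_at(t))   # parameters refreshed before every step
  rates <- a(parms, x)
  if (sum(rates) <= 0) break              # population extinct, nothing can fire
  ev <- direct_step(rates, nu)
  x  <- x + ev$nu_j
  t  <- t + ev$tau
  out[i, ] <- c(t, x, parms[["K"]])       # time, new state, K used for this step
  done <- i
}
out <- out[seq_len(done), , drop = FALSE]
tail(out, 3)
```

Refreshing the parameters only at event times treats the rates as constant between events, which seems reasonable when the environment changes slowly relative to how often events occur.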
Quantile regression in R
Roger Koenker, a quantile regression crusader, has an R package that implements the procedure. It is called quantreg, and it is documented here. This package has apparently been around for quite some time, but I was only recently turned on to quantile regression, so it was under my radar.
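For a first look, here is a minimal sketch that fits three conditional quantiles with `rq()`. It assumes the package is installed and uses the `engel` food expenditure data that ships with quantreg; the choice of quantiles is arbitrary. The code does nothing if the package is missing.

```r
# install.packages("quantreg")   # one-time installation from CRAN

if (requireNamespace("quantreg", quietly = TRUE)) {
  library(quantreg)
  data(engel)   # household income and food expenditure
  # Fit the 10th, 50th and 90th conditional percentiles of food expenditure.
  fit <- rq(foodexp ~ income, tau = c(0.1, 0.5, 0.9), data = engel)
  print(coef(fit))   # one column of coefficients per quantile
} else {
  message("Package 'quantreg' is not installed; skipping the example.")
}
```

Comparing the income slope across the columns shows whether the relationship differs between the lower and upper parts of the expenditure distribution, which is something a single least-squares line cannot show.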
