The Joint Statistical Meetings in Montreal have proven to be very good. Here are a few highlights from Tuesday’s sessions. The one major problem is that there are too many good sessions to attend. During one time block there were six sessions that I wanted to go to, and it is simply not possible to make it to all of them. The recurring theme, however, is that if you don’t know at least R then you will quickly be left in the dust. Based on the sessions so far, knowing R is a must, and knowing other languages such as Java, Scala or Python will certainly help.
Session on Analytics and Data Visualization in Professional Sports
During the morning I attended a session on statistics in sports. It was mostly sabermetric presentations, with some basketball in there too. One presentation caught my attention because of the granularity of the data the presenter used. Benjamin Baumer presented his R package openWAR, an open source version of WAR (Wins Above Replacement) in baseball. The data that the package accesses identifies every play as well as the spatial location on the field of where the batter hit the ball. If you are interested in sports statistics, or just in playing with a lot of publicly available data, then openWAR is a great resource (currently available on GitHub at https://github.com/beanumber/openWAR). This presentation also discussed the distribution of the players on the field and their ability to field the ball once it was hit. A different presentation, from Sportvision, covered the location and trajectory of the ball as the pitcher throws it. Sportvision also showed where in the strike zone the batter hits the ball the hardest. This is the same company that does the 1st & 10 graphics (i.e. the yellow line marking the distance needed for a 1st down).
Session on Statistical Computing: Software and Graphics
I attended the Statistical Computing session, and 5 of the 6 presentations were on R packages. The first was a presentation on Muste, the R implementation of Survo. I have not used Survo before, but I will certainly do some research into it. The next presentation was by Stephan Ritter, the maintainer of the relaxnet and widenet packages. The third presentation was by David Dahl, the maintainer of jvmr. With this package one can integrate Scala and Java into R without any special compilation. TIBCO Spotfire then presented the TIBCO Enterprise Runtime for R (TERR). This looks to be an interesting solution to some of the data management issues that exist in R. The presenter indicated that it does a very good job of managing system resources. The fifth presentation discussed the Rcpp package, and the final presentation, by Christopher Fonnesbeck, was on PyMC, which allows a user to perform Bayesian statistical analysis in Python.