A Data Scientist’s and R User’s Guide to the JSM

July 31, 2014

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

by Joseph Rickert

The Joint Statistical Meetings (JSM) get underway this weekend in Boston, and Revolution Analytics is again proud to be a sponsor. More than 6,000 statisticians and data scientists from around the world are expected to attend and listen to thousands of presentations. It is true that many talks will be on specialized topics that only statisticians working in a particular field will have the interest and patience to sit through. However, there is evidence that the conference will have something exciting to offer data scientists and statisticians working in industry. Keyword searches yield 79 presentations for Big Data, 29 on Machine Learning, 17 on Data Science, 17 on Data Mining and 19 related to R. There is more than enough here to fill a data scientist’s dance card.

Three must-see presentations under the Big Data keyword are: Michael Franklin's presentation on Analyzing Data at Scale with the Berkeley Data Analytics Stack; Hui Jiang et al. on Implementation of Statistical Algorithms in Big Data Platforms; and Tim Hesterberg's talk on Simulation-Based Methods in Statistics Education, and Google Tools. Under the Data Science label, Bill Ruh’s invited talk Industrial Internet, an Opportunity for Statisticians to Become Data Scientists looks most inviting. There are also quite a few Data Science talks that indicate some soul searching within the academic community as to how the statistics curriculum ought to be changed. See, for example, Michael Rappa’s talk on Data Scientists: How Do We Prepare for the Future? and Johanna Hardin’s talk: Data Science and Statistics: How Should They Fit into Our Curriculum?

Here is the list of R-related presentations:

Saturday, August 2

  1. 8:00 AM - 12:00 PM: Adaptive Tests of Significance Using R and SAS — Professional Development Continuing Education Course ASA Instructor: Tom O'Gorman

Sunday, August 3

  1. 8:30 AM - 5:00 PM: Adaptive Methods in Modern Clinical Trials — Professional Development Continuing Education Course ASA , Biometrics Section Instructors: Frank Bretz, Byron Jones, and Guosheng Yin
  2. 4:20 PM: Glassbox: An R Package for Visualizing Algorithmic Models: Max Ghenis and Ben Ogorek and Estevan Flores
  3. 4:45 PM: Bayesian Enrollment and Event Predictions in Clinical Trials Leveraging Literature Data: Aijun Gao and Fanni Natanegara and Govinda Weerakkody

Monday, August 4

  1. 8:55 AM: Thinking with Data in the Second Course: Nicholas J. Horton and Ben S. Baumer and Hadley Wickham
  2. 8:30 AM to 10:20 AM: Do You See What I See? Formal Usability Testing and Statistical Graphics: Marie C. Vendettuoli and Matthew Williams and Susan Ruth VanderPlas
  3. 8:35 AM: Preparing Students for Big Data Using R and RStudio: Randall Pruim
  4. 8:35 AM: Does R Provide What Customers Need?: Vipin Arora
  5. 8:55 AM: Doing Reproducible Research Unconsciously: Higher Standard, but Less Work: Yihui Xie
  6. 12:30 PM to 1:50 PM: Analyzing Umpire Performance Using PITCHf/x: Andrew Swift
  7. 3:30 PM: The Perfect Bracket: Machine Learning in NCAA Basketball: Sara Stoudt and Loren Santana and Ben S. Baumer

Tuesday, August 5

  1. 10:35 AM: Tools for Teaching R and Statistics Using Games: Brad Luen and Michael Higgins
  2. 2:00 PM: Multiple Treatment Groups: A Case Study with Health Care Practice and Policy Implications: Alexandra Hanlon and Karen Hirschman and Beth Ann Griffin and Mary Naylor
  3. 2:05 PM: glmmplus: An R Package for Messy Longitudinal Data: Ben Ogorek and Caitlin Hogan
  4. 3:30 PM: Give Me an Old Computer, a Blank DVD, and an Internet Connection and I'll Give You World-Class Analytics: Ty Henkaline

Wednesday, August 6

  1. 9:35 AM: Testing Packages for the R Language: Stephen Kaluzny and Lou Bajuk-Yorgan
  2. 9:50 AM: Using R Analytics on Streaming Data: Lou Bajuk-Yorgan and Stephen Kaluzny
  3. 10:35 AM: Shiny: Easy Web Applications in R: Joseph Cheng
  4. 10:30 AM to 12:20 PM: Classroom Demonstrations of Big Data: Eric A. Suess
  5. 11:00 AM: ggvis: Moving Toward a Grammar of Interactive Graphics: Hadley Wickham
  6. 3:05 PM: Accessing Data from the Census Bureau API: Alex Shum and Heike Hofmann

Thursday, August 7

  1. 9:20 AM: Predicting Dangerous E. Coli Levels at Erie, Pennsylvania, Beaches with Random Forests in R: Michael Rutter
  2. 9:25 AM: Beyond the Black Box: Flexible Programming of Hierarchical Modeling Algorithms for BUGS-Compatible Models Using NIMBLE: Perry de Valpine and Daniel Turek and Christopher J. Paciorek and Rastislav Bodik and Duncan Temple Lang

If you are going to JSM, please come by booth #303 to say hello. You may also find the mobile apps (Apple or Android) that Revolution Analytics is sponsoring useful, and don't forget to fill out the survey for a chance to win an Apple TV.

Finally, I will be the program chair for Session 401, Monte Carlo Methods, to be held Tuesday, 8/5/2014, from 2:00 PM to 3:50 PM in room CC-101. If you are interested in simulation, be sure to drop in. I have seen the presentations and think they are well worth attending.

