Vivian Zhang and NYC-Open-Data will host Max Kuhn’s event in NYC on Feb 18th and live stream it from 7pm to 9pm.
The link for the live stream will be available on Twitter @NycDataSci around 6:30pm tomorrow.
Max Kuhn is Director of Nonclinical Statistics at Pfizer and the author of Applied Predictive Modeling.
He will join us and share his experience with Data Mining with R.
Max is a nonclinical statistician who has been applying predictive models in the diagnostic and pharmaceutical industries for over 15 years. He is the author and maintainer of a number of predictive modeling packages, including caret, C50, Cubist and AppliedPredictiveModeling. He blogs about the practice of modeling on his website at http://appliedpredictivemodeling.com/blog
You can RSVP for his Feb 18th course at NYC Data Science Academy.
Predictive Modeling using R
This class will get attendees up to speed in predictive modeling using the R programming language. The goal of the course is to understand the general predictive modeling process and how it can be implemented in R. A selection of important models (e.g. tree-based models, support vector machines) will be described in an intuitive manner to illustrate the process of training and evaluating models.
Attendees should have a working knowledge of basic R data structures (e.g. data frames, factors, etc.) and language fundamentals such as functions and subsetting data. Understanding the content of Appendix B, sections B1 through B8, of Applied Predictive Modeling (free PDF from the publisher) should suffice.
– An introduction to predictive modeling
– R and predictive modeling: the good and bad
– Illustrative example
– Measuring performance
– Data splitting and resampling
– Data pre-processing
– Classification trees
– Boosted trees
– Support vector machines
If time allows, the following topics will also be covered:
– Parallel processing
– Comparing models
– Feature selection
– Common pitfalls
Attendees will be provided with a copy of Applied Predictive Modeling as well as course notes, code and raw data. Participants will be able to reproduce the examples described in the workshop.
Attendees should have a computer with a relatively recent version of R installed.
About the Instructor:
More about Max’s work: