Data Community DC and District Data Labs are hosting a Supervised Machine Learning with R workshop on Saturday, April 30th. Come out and learn about R’s capabilities for regression and classification, how to perform inference with these models, and how to use out-of-sample evaluation methods for your models!
R is a powerful language for statistical computing. A prolific user community backs R with an extensive collection of packages. If you can think of it, chances are somebody has already written a package for it. R also has a superb IDE, RStudio, that facilitates reproducible research.
This course is for people with some R programming experience. It gives an overview of supervised statistical modeling and machine learning in R. We will focus on a small subset of algorithms and emphasize out-of-sample evaluation.
WHAT YOU WILL LEARN
This course introduces R's capabilities for regression and classification. Many machine learning algorithms exist, and a single class can cover only a small subset of them. We will focus on:
- Linear and logistic regression
- Decision tree and SVM classifiers
- Training sets and test sets
- K-fold cross-validation
- Prediction vs. inference
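To give a flavor of these topics, here is a minimal sketch (not the workshop's own material) that fits a logistic regression on a training set, inspects the coefficients for inference, and measures accuracy on a held-out test set. The built-in `mtcars` data, the 70/30 split, the `am ~ wt` formula, and the 0.5 cutoff are all assumptions chosen purely for illustration.

```r
# Illustrative sketch: hold-out evaluation of a logistic regression.
# Dataset, split ratio, formula, and threshold are assumptions for this example.
set.seed(42)  # make the random split reproducible

n <- nrow(mtcars)
train_idx <- sample(n, size = round(0.7 * n))  # 70% of rows for training
train <- mtcars[train_idx, ]
test  <- mtcars[-train_idx, ]

# Model transmission type (am: 0 = automatic, 1 = manual) from weight.
fit <- glm(am ~ wt, data = train, family = binomial)

# Inference: coefficient estimates, standard errors, and p-values.
print(summary(fit)$coefficients)

# Prediction: class predictions on rows the model has never seen.
prob <- predict(fit, newdata = test, type = "response")
pred <- as.integer(prob > 0.5)
accuracy <- mean(pred == test$am)
print(accuracy)
```

The coefficient table speaks to inference (how weight relates to transmission type), while the test-set accuracy speaks to prediction (how well the model generalizes). With a dataset this small, the accuracy will vary noticeably from one random split to the next.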
The workshop will cover the following:
- Setting up an RStudio project and file structure
- Review of R and RStudio
- CRAN task view: machine learning
- Training, testing, and k-fold cross-validation
- Decision trees and random forests
- Support vector machines
- Generalized linear models, focusing on logistic regression
- Linear regression models
After this course you will have used several supervised machine learning methods. You will understand how to use out-of-sample evaluation methods for your models. Where possible, you will learn to perform inference with these models.
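As a rough preview of out-of-sample evaluation, the sketch below runs k-fold cross-validation for a linear regression using only base R. It is an illustration under stated assumptions rather than the workshop's code: the `mtcars` data, the choice of `k = 5`, the `mpg ~ wt + hp` formula, and RMSE as the error metric are all picked for the example.

```r
# Illustrative sketch: k-fold cross-validation of a linear regression.
# Dataset, k, formula, and error metric are assumptions for this example.
set.seed(1)  # make the fold assignment reproducible

k <- 5
n <- nrow(mtcars)

# Randomly assign each row to one of k folds of near-equal size.
folds <- sample(rep(seq_len(k), length.out = n))

rmse <- numeric(k)
for (i in seq_len(k)) {
  train <- mtcars[folds != i, ]  # fit on the other k - 1 folds
  test  <- mtcars[folds == i, ]  # evaluate on the held-out fold
  fit   <- lm(mpg ~ wt + hp, data = train)
  pred  <- predict(fit, newdata = test)
  rmse[i] <- sqrt(mean((test$mpg - pred)^2))
}

cv_rmse <- mean(rmse)  # average held-out error across the folds
print(rmse)
print(cv_rmse)
```

Each row is used for testing exactly once, so the averaged error is a less split-dependent estimate of out-of-sample performance than a single train/test split.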