NYC R Programming Classes – starting this coming Sunday

November 5, 2013

Guest post by Vivian Zhang, original post.

You can sign up for our Sunday Intensive beginner-level R classes on the
NYC Data Science Academy meetup page, or email [email protected] for more info.

Brief: The course (which will meet on five Sundays) will start from the basics,
introducing the building blocks used for programming in R and building
intuition for writing clean and robust code. We will move on to cover
data analysis, applications of statistical techniques, and graphing.

Date: Nov 10th, Nov 17th, Nov 24th, Dec 1st, Dec 8th (Five Sundays)

Time: 12:00pm to 4pm

Scott Kostyshak (Data Scientist @ Supstat Inc, 5th year Econ PhD at Princeton Univ.)
Vivian Zhang (CTO @ Supstat Inc, Master's degrees in Computer Science and Statistics)

Individual: $110/class
For group (5 or more people) and enterprise pricing, please email [email protected]

Course Outline:

(Content may be adjusted as the classes progress.)

Basics 6 hours
Abstract: This unit covers the fundamentals. Students will learn the characteristics of R, how to find resources, and the basics of programming.
Case and Exercise: Solve selected Project Euler problems in R.

* How to learn R
* How to get help
* R language resources and books
* RStudio
* Packages
* Workspace
* Custom Startup Items
* Batch Mode
* Data Objects
* Custom Functions
* Control statements
* Vectorized operations

Data 2 hours

Abstract: This unit explains the various ways to read data into R. Participants will use basic web knowledge to scrape web pages, connect to databases and retrieve data with SQL statements, and read local files such as Excel spreadsheets.
Case studies and exercises: Scrape data from the Douban website and write a custom function.

* Web data capture
* API data source
* Connect to the database
* Local files
* Other data sources
* Data Export

Data collation 3 hours

Abstract: This unit covers how to use R to manipulate data and perform all kinds of data conversions, with particular attention to string processing.
Case studies and exercises: Find a QQ group (QQ is the most used instant messaging tool), then discuss research options using its text features.

* Data sorting
* Merge Data
* Summarizing data
* Reshaping data
* Take a subset of data
* String manipulation
* Date operations

Data Visualization 3 hours

Abstract: This unit covers two advanced plotting packages, lattice and ggplot2, and explores various methods of visualization.
Case and Exercise: Use graphics to describe the movie, text, and other data from the earlier units.

* Histogram
* Point
* Column
* Line
* Pie
* Box Plot
* Scatter
* Matrix related
* Map

Elementary statistical methods 5 hours
Abstract: This unit introduces statistical analysis and regression analysis in R. Students will learn the meaning and role of basic statistical models.
Case and Exercise: Use regression to predict commodity prices; simulate the winner of a casino game.

* Descriptive Statistics
* Statistical Distributions
* Frequency and contingency tables
* Correlation
* t-test
* Non-parametric statistics
* Linear Regression
* Regression Diagnostics
* Robust Regression
* Nonlinear regression
* Principal Component Analysis
* Logistic Regression
* Statistical Simulation

Preliminary data mining ( Selected Topics )

Abstract: This unit explains how to use R packages and functions for data mining. Students will learn two kinds of mining methods: supervised learning and unsupervised learning.
Case and Exercise: Use R to participate in a Kaggle data mining competition.
* General Mining Process
* Rattle package
* Hierarchical clustering
* K-means clustering
* Decision Trees
* Backpropagation (BP) neural network

