1739 search results for "regression"

Natural language processing tutorial

June 25, 2013

Introduction: This will serve as an introduction to natural language processing. I adapted it from slides for a recent talk at Boston Python. We will go from tokenization to feature extraction to creating a model using a machine learning algorithm. The goal is to provide a reasonable baseline on top of which more complex natural language processing can be...

Read more »
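
The tokenization, feature extraction, and model steps named in the excerpt can be sketched in a few lines. This is a minimal illustration, not the tutorial's own code: the toy training sentences, the function names, and the choice of a bag-of-words representation with a multinomial naive Bayes classifier are all assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def extract_features(text):
    """Bag-of-words features: map each token to its count."""
    return Counter(tokenize(text))


def train_naive_bayes(labeled_texts):
    """Collect per-class document counts, per-class word counts, and the vocabulary."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_texts:
        feats = extract_features(text)
        class_docs[label] += 1
        word_counts[label].update(feats)
        vocab.update(feats)
    return class_docs, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-probability (add-one smoothing)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    feats = extract_features(text)
    scores = {}
    for label in class_docs:
        total_words = sum(word_counts[label].values())
        score = math.log(class_docs[label] / total_docs)
        for word, count in feats.items():
            if word not in vocab:
                continue  # ignore words never seen in training
            p = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += count * math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, invented for this sketch.
TRAIN = [
    ("the movie was great and fun", "pos"),
    ("a great fun film", "pos"),
    ("the plot was boring and dull", "neg"),
    ("a dull boring movie", "neg"),
]

model = train_naive_bayes(TRAIN)
print(predict(model, "great fun"))
print(predict(model, "boring plot"))
```

A baseline like this is easy to extend: swapping the tokenizer or the feature function leaves the training and prediction code unchanged.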

GRNN and PNN

June 23, 2013

From the technical perspective, people usually choose GRNN (general regression neural network) for function approximation with a continuous response variable and PNN (probabilistic neural network) for pattern recognition / classification problems with categorical outcomes. However, from the practical standpoint, it is often not necessary to draw a fine line between GRNN

Read more »

Measuring Associations

June 20, 2013

In Chapter 18, we discuss a relatively new method for measuring predictor importance called the maximal information coefficient (MIC). The original paper is by Reshef et al. (2011). Summaries of the initial reactions to the MIC include those by Speed and Tibshirani (others can be found here). My (minor) beef with it is the lack...

Read more »

Bayesian Modeling of Anscombe’s Quartet

June 20, 2013

Anscombe’s quartet is a collection of four datasets that look radically different yet result in the same regression line when using ordinary least squares regression. The graph below shows Anscombe’s quartet with superimposed regression lines (taken from the Wikipedia article). While least squares regression is a good choice for dataset 1 (upper left plot), it...

Read more »
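
The claim that all four datasets yield the same least squares line can be checked directly. The sketch below uses the standard published values of Anscombe's quartet (they are not part of the excerpt) and a hand-written simple linear regression. The helper name `ols_fit` is an assumption made for this example, and the code is plain least squares, not the Bayesian model the post goes on to discuss.

```python
def ols_fit(xs, ys):
    """Return (intercept, slope) of the ordinary least squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Standard values of Anscombe's quartet; datasets 1-3 share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    1: (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

fits = {k: ols_fit(xs, ys) for k, (xs, ys) in quartet.items()}
for k, (intercept, slope) in fits.items():
    print(f"dataset {k}: y = {intercept:.2f} + {slope:.2f} x")
```

To two decimal places every dataset gives an intercept of 3.00 and a slope of 0.50, which is why the fitted line alone cannot distinguish the four very different point clouds.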

Data Science Labs: Predictive Models to Improve Vaccine Quality and Production

June 20, 2013

The age of "blockbuster drugs" is coming to an end, as personalized medicine becomes a reality. Data science will be a major driver of innovation in these and other areas of the pharmaceutical industry. This was demonstrated during a project the Data Science Labs team carried out with a major pharmaceutical company.

Read more »

A Toy Instrumental Variable Application

June 19, 2013

An R package for Smith-Wilson yield curves

June 19, 2013

Yield Curve fitting - the Smith-Wilson method. This article illustrates the R package SmithWilsonYieldCurve, and provides some additional background on yield curve fitting. The method implemented in the package fits a curve to interest rate market...

Read more »

Oracle R Connector for Hadoop 2.1.0 released

June 17, 2013

(This article was first published on Oracle R Enterprise, and kindly contributed to R-bloggers) Oracle R Connector for Hadoop (ORCH), a collection of R packages that enables Big Data analytics using HDFS, Hive, and Oracle Database from a local R environment, continues to make advancements. ORCH 2.1.0 is now available, providing a flexible framework while remarkably improving performance and...

Read more »