Posts Tagged ‘machine learning’

An Intro to Ensemble Learning in R

January 19, 2012

Introduction

This post incorporates parts of yesterday's post about bagging. If you are unfamiliar with bagging, I suggest that you read it before continuing with this article. I would like to give a basic overview of ensemble learning. Ensemble learni...

Decoding a Substitution Cipher using Simulated Annealing

January 1, 2012

My last post discussed a method to decode a substitution cipher using a Metropolis-Hastings algorithm. It was brought to my attention that this code could be improved by using Simulated Annealing methods to jump around the sample space and avoid some of the local maxima. Here is a basic description of the difference: In...

Using neural network for regression

November 17, 2011

Artificial neural networks are commonly thought to be used just for classification because of the relationship to logistic regression: neural networks typically use a logistic activation function and output values from 0 to 1 like logistic regression. ...

Train neural network in R, predict in SAS

November 11, 2011

This R code fits an artificial neural network in R and generates Base SAS code, so new records can be scored entirely in Base SAS. This is intended to be a simple, elegant, fast solution. You don’t need SAS Enterprise Miner, IML, or any other spe...

Model decision tree in R, score in Base SAS

October 11, 2011

This code creates a decision tree model in R using party::ctree() and prepares the model for export from R to Base SAS, so SAS can score new records. SAS Enterprise Miner and PMML are not required, and Base SAS can be on a separate machine from R be...

SIGKDD 2011 Conference — Days 2/3/4 Summary

August 27, 2011

<< My review of Day 1.

I am summarizing all of the days together since each talk was short, and I was too exhausted to write a post after each day. Due to the broken-up schedule of the KDD sessions, I group everything together instead of switching back and forth among a dozen different topics....

SIGKDD 2011 Conference — Day 1 (Graph Mining and David Blei/Topic Models)

August 22, 2011

I have been waiting for the KDD conference to come to California, and I was ecstatic to see it held in San Diego this year. AdMeld did an awesome job displaying KDD ads on the sites that I visit, sometimes multiple times per page. That’s good targeting!

Mining and Learning on Graphs Workshop 2011

I had...

Benchmarking R, Revolution R, and HyperThreading for data mining

June 27, 2011

I recently upgraded my notebook (where I often use R for data mining) and was faced with two questions: to build models as fast as possible, do I use R or Revolution R, and do I enable Hyper-Threading? Revolution Analytics provides Revolution...

Review of 2011 Data Scientist Summit

May 13, 2011

Some time in the past 6 weeks I happened to see a tweet announcing the “Data Scientist Summit,” and just below it I saw that it would be held in Las Vegas at the Venetian. Being a Data Scientist myself is reason enough not to pass up this opportunity, but Vegas definitely sweetens the deal!...

Text Data Mining with Twitter and R

April 8, 2011

Twitter is a favorite source of text data for analysis: it’s popular (there is a huge volume and variety of text on all topics) and easily accessible through Twitter’s free, open APIs, which are easily consumable in JSON and ATOM formats. Some people h...
