Posts Tagged ‘machine learning’

Splitting a Dataset Revisited: Keeping Covariates Balanced Between Splits

March 8, 2011

In my previous post I showed you how to randomly split up a dataset into training and testing datasets. (Thanks to all those who emailed me or left comments letting me know that this could be done using other means. As things go with R, it's sometimes ...

Read more »

Split a Data Frame into Testing and Training Sets in R

February 24, 2011

I recently analyzed some data trying to find a model that would explain body fat distribution as predicted by several blood biomarkers. I had more predictors than samples (p>n), and I didn't have a clue which variables, interactions, or quadratic terms made biological sense to put into a model. I then turned to a few data mining procedures that I...

Read more »

Data Mining with WEKA

January 30, 2011

There are a number of good open source projects for statistics and data mining, for example WEKA, developed at the University of Waikato. The description on their website states: Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or

Read more »

NIPS 2010: Monte Carlo workshop

September 3, 2010

In the wake of the main machine learning NIPS 2010 meeting in Vancouver, Dec. 6-9 2010, there will be a very interesting workshop organised by Ryan Adams, Mark Girolami, and Iain Murray on Monte Carlo Methods for Bayesian Inference in Modern Day Applications, on Dec. 10. (And in Whistler, not Vancouver!) I wish I could

Read more »

Want to join the closed BETA of a new Statistical Analysis Q&A site – NOW is the time!

July 16, 2010

The bottom line of this post is for you to go to the Stack Exchange Q&A site proposal "Statistical Analysis" and commit yourself to using the website for asking and answering questions. (Also consider giving the contender, MetaOptimize, a visit.) * * * * Statistical analysis Q&A website is about to go into BETA. A month ago I invited...

Read more »

Top 10 Algorithms in Data Mining

April 23, 2010

The authors here invited ACM KDD Innovation Award and IEEE ICDM Research Contributions Award winners to each nominate up to 10 best-known algorithms in data mining, including the algorithm name, justification for nomination, and a representative public...

Read more »

Social Media Analytics Research Toolkit (SMART@znmeb) Is Moving Into Private Beta

March 31, 2010

Download "Getting Started with the Social Media Analytics Research Toolkit" (PDF, 1.25 megabytes). Download the Social Media Analytics Research Toolkit. My Social Media Analytics Research Toolkit is about to move into private beta. What's in the release?...

Read more »

Weighting model fit with ctree in party

March 15, 2010

Conditional inference trees (ctree) in the party package allow weighting, which is useful when one classification outcome is more important than another. Useful examples are not difficult to imagine: in a marketing direct mailing, a false positive (non-res...

Read more »

Compare performance of machine learning classifiers in R

December 23, 2009

This tutorial shows the R novice how to create five machine learning models for classification and compare their performance graphically with ROC curves in a single chart. For a simpler introduction, start with Plot ROC curve and lift chart in R. # ...

Read more »