Posts Tagged ‘Kaggle’

Dark matter benchmarks: All over the map

October 14, 2012

The three benchmark algorithms for predicting the location of dark matter halos are, for the most part, all over the map. Most of the test skies look something like this: [figure] There are, however, some skies with rather strong halo signals that get a decent amount of agreement: [figure] The Lenstool MLE algorithm is the current state…

Read more »

Observing Dark Worlds – Visualizing dark matter’s distorting effect on galaxies

October 13, 2012

Some people like to do crossword puzzles. I like to do machine learning puzzles. Lucky for me, a new contest was posted on Kaggle yesterday. So naturally, my lazy Saturday was spent getting elbow-deep in the data. The training set consists of a series of ‘skies’, each containing a bunch of galaxies. Normally,…

Read more »

The essence of a handwritten digit

August 13, 2012

If you haven’t yet discovered the competitive machine learning site kaggle.com, please do so now. I’ll wait. Great – so, you checked it out, fell in love and have made it back. I recently downloaded the data for the getting-started competition. It consists of 42,000 labelled images (28×28) of handwritten digits 0–9. The…

Read more »

Error metrics for multi-class problems in R: beyond Accuracy and Kappa

July 6, 2012

The caret package for R provides a variety of error metrics for regression models and two-class classification models, but only calculates Accuracy and Kappa for multi-class models. Therefore, I wrote the following function to allow caret:::train t...

Read more »

Kaggle on TV

September 22, 2011

It is good to see forecasting algorithms getting some mainstream exposure on ABC Catalyst.

Read more »

Wikipedia for Kaggle Participants

July 1, 2011

Kaggle has released a new data-mining challenge: use data from 10 years of Wikipedia edits to predict future edit rates. The dataset has been anonymized to obscure editor identity and article identity, simultaneously adding focus to the challenge and robbing the dataset of considerable richness. I have some experience with Wikipedia…

Read more »