Posts Tagged ‘data mining’

Experience on using R to build prediction models in business applications

March 8, 2012

By Yanchang Zhao, RDataMining.com. Building prediction and classification models is one of the most common data mining tasks in business applications. To share experience on building prediction models with R, I have started a discussion in the RDataMining group on LinkedIn with the …

People voice about Lynas Malaysia through Twitter Analysis with R CloudStat

February 28, 2012

This is a Twitter analysis report for “Lynas” covering 21 to 28 February 2012, generated by the CloudStat Twitter Application. Lynas was a hot topic, espec...

A prize of US$3,000,000 for a data mining competition to improve healthcare

December 13, 2011

There is a data mining competition with a prize of $3,000,000. The goal is to improve healthcare in the US by identifying patients who will be admitted to a hospital within the next year, using historical claims data. The algorithm to …

Your Data is Never the Right Shape

July 31, 2011

One of the recurring frustrations in data analytics is that your data is never in the right shape. Worst case: you are not aware of this, and every step you attempt is more expensive, less reliable and less informative than you would want. Best case: you notice this and have the tools to reshape your …

Wikipedia for Kaggle Participants

July 1, 2011

Kaggle has released a new data-mining challenge: use data from 10 years of Wikipedia edits to predict future edit rates. The dataset has been anonymized to obscure editor identity and article identity, simultaneously adding focus to the challenge and robbing the dataset of considerable richness. I have some experience with Wikipedia…

What $480M of Gross Revenue Looks Like to Groupon

February 28, 2011

On Saturday, the Wall St. Journal posted details of an internal Groupon memo that reported $760 million in revenue last year. The WSJ article came just as I was finishing up a visualization of some data I had collected on …

Visualizing Facebook Friends: Eye Candy in R

December 18, 2010

Earlier this week I published a data visualization on the Facebook Engineering blog which, to my surprise, has received a lot of media coverage. I’ve received a lot of comments about the image, many asking for more details on how I …

Transactions, and Pondering their Use in Casinos

October 20, 2010

A couple of weeks ago, Bradford Cross of FlightCaster posted in Measuring Measures that transactions are the next big data category. I argue that they already are. His blog post seems to suggest this as well, though I admit I may have missed his point. There are some clear examples of transactions and...

How to build a world-beating predictive model using R

Many modern data analysis problems in both industry and academia involve building a model that can predict the future based on historical variables. The 2009 KDD Cup was an international data mining competition devoted to this type of problem, where …

Lists of English Words

October 12, 2010

When I was a kid, I went through an 80s music phase… well, some things never change. “People just love to play with words…” Know that song? Anyway… One of the biggest pains of text mining and NLP is colloquialism: language that is appropriate only in casual conversation and not in formal speech or writing. Words such as informal contractions...
