data mining

Data Frames and Transactions

September 24, 2012 | Wesley

Transactions are a very useful tool in data mining. They provide a way to mine itemsets or rules on datasets. In R, the data must be in transactions form. If the data is only available in a data.frame, then to create (or coerce) the data frame to ... [Read more...]

2nd CFP: the 10th Australasian Data Mining Conference (AusDM 2012)

July 10, 2012 | Yanchang Zhao

The Tenth Australasian Data Mining Conference (AusDM 2012), Sydney, Australia, 5-7 December 2012. http://ausdm12.togaware.com/ The Australasian Data Mining Conference has established itself as the premier Australasian meeting for both practitioners and researchers in data mining. This year’s conference, AusDM’12, co-hosted … [Read more...]

Data Mining In Excel: Lecture Notes and Cases

July 10, 2012 | Yanchang Zhao

by Yanchang Zhao, RDataMining.com It is a 270-page book on data mining with Excel. It can be downloaded as a PDF file at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.83.1393&rep=rep1&type=pdf. Below is its table of contents. - Overview of the Data Mining Process - ... [Read more...]

A tutorial on outlier detection techniques

July 4, 2012 | Yanchang Zhao

by Yanchang Zhao, RDataMining.com There is an excellent tutorial on outlier detection techniques, presented by Hans-Peter Kriegel et al. at ACM SIGKDD 2010. It presents many popular outlier detection algorithms, most of which were published between the mid-1990s and 2010, … [Read more...]

An example on sentiment analysis with R

June 21, 2012 | Yanchang Zhao

by Yanchang Zhao, RDataMining.com There is a nice example on sentiment analysis with R at . In the example, the Wikileaks cable corpus is analyzed to track US sentiments of other countries and their presidents over time. The example describes … [Read more...]

Your Data is Never the Right Shape

July 31, 2011 | John Mount

One of the recurring frustrations in data analytics is that your data is never in the right shape. Worst case: you are not aware of this and every step you attempt is more expensive, less reliable and less informative than you would want. Best case: you notice this and have ... [Read more...]

Wikipedia for Kaggle Participants

July 1, 2011 | Adam.Hyland

Kaggle has released a new data-mining challenge: use data from 10 years of Wikipedia edits in order to predict future edit rates. The dataset has been anonymized in order to obscure editor identity and article identity, simultaneously adding focus to the challenge and robbing the dataset of considerable richness. I have ... [Read more...]

Visualizing Facebook Friends: Eye Candy in R

December 18, 2010 | Paul Butler

Earlier this week I published a data visualization on the Facebook Engineering blog which, to my surprise, has received a lot of media coverage. I’ve received a lot of comments about the image, many asking for more details on how I … [Read more...]

Transactions, and Pondering their Use in Casinos

October 20, 2010 | Ryan

A couple of weeks ago, Bradford Cross of FlightCaster posted in Measuring Measures that transactions are the next big data category. I argue that they already are, and from reading his blog post, he seems to suggest this as well but I will admit that I think I missed his ...
[Read more...]

Lists of English Words

October 12, 2010 | Ryan

When I was a kid, I went through an 80s music phase…well, some things never change. “People just love to play with words…” Know that song? Anyway… One of the biggest pains of text mining and NLP is colloquialism — language that is only appropriate in casual language and not ...
[Read more...]