331 search results for "evaluation"

Classifying the UCI mushrooms

In my last post, I considered the shifts in two interestingness measures as possible tools for selecting variables in classification problems.  Specifically, I considered the Gini and Shannon interestingness measures applied to the 22 categorical mushroom characteristics from the UCI mushroom dataset.  The proposed variable selection strategy was to compare these values when computed from only edible mushrooms...
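As a rough illustration of the idea in this excerpt, the sketch below computes two normalized interestingness measures for a categorical variable on two subsets of records and reports the shift between them. The formulas are assumptions, not necessarily the definitions used in the post: the Shannon measure is taken as one minus the normalized entropy, and the Gini measure as a normalized sum of squared proportions. Both are 0 for a uniform distribution and 1 when a single level holds all records. The toy `odor` values, the choice of a poisonous subset for comparison, and all function names are hypothetical.

```python
import math
from collections import Counter


def proportions(values, levels):
    """Fraction of records at each level, in the order given by `levels`."""
    counts = Counter(values)
    n = len(values)
    return [counts.get(level, 0) / n for level in levels]


def shannon_measure(p):
    """Assumed definition: 1 - H(p) / log(M), where M is the number of levels."""
    m = len(p)
    entropy = -sum(q * math.log(q) for q in p if q > 0)
    return 1.0 - entropy / math.log(m)


def gini_measure(p):
    """Assumed definition: (M * sum(p_i^2) - 1) / (M - 1)."""
    m = len(p)
    return (m * sum(q * q for q in p) - 1.0) / (m - 1.0)


# Hypothetical toy data for one categorical characteristic.
levels = ["none", "almond", "anise", "foul"]
edible = ["none", "none", "almond", "anise"]
poisonous = ["foul", "foul", "foul", "none"]

for name, measure in [("Shannon", shannon_measure), ("Gini", gini_measure)]:
    value_edible = measure(proportions(edible, levels))
    value_poisonous = measure(proportions(poisonous, levels))
    # A large shift suggests the variable is distributed differently
    # in the two subsets, which is the intuition behind using it for selection.
    print(name, value_edible, value_poisonous, value_poisonous - value_edible)
```

Passing the full list of levels explicitly keeps the normalization constant the same across subsets, even when a subset does not contain every level.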


PDF slides and R code examples on Data Mining and Exploration

June 4, 2012

by Yanchang Zhao, RDataMining.com

There are some nice slides and R code examples on Data Mining and Exploration at http://www.inf.ed.ac.uk/teaching/courses/dme/, which are listed below.

PDF Slides:
- Overview of Data Mining http://www.inf.ed.ac.uk/teaching/courses/dme/2012/slides/datamining_intro4up.pdf
- Visualizing Data http://www.inf.ed.ac.uk/teaching/courses/dme/2012/slides/visualisation4up.pdf
- Decision trees http://www.inf.ed.ac.uk/teaching/courses/dme/2012/slides/classification4up.pdf
…


Selection in R

June 1, 2012

The design of the statistical programming language R sits in a slightly uncomfortable place between the functional programming and object-oriented paradigms. The upside is that you get a lot of the expressive power of both paradigms. A downside is the not-always-useful variability of the language’s list and object extraction operators.


R Tops Data Mining Software Poll

May 31, 2012

For the past 12 years, KDNuggets has conducted an annual poll asking "What analytics/data mining software you used in the past 12 months for a real project (not just evaluation)". In this year's poll, R was the top-ranked data mining solution, selected by 30.7% of poll respondents. Microsoft Excel was second, at 29.8%. RapidMiner, which took the #1 spot...


CFP: the 10th Australasian Data Mining Conference (AusDM 2012)

May 20, 2012

The Tenth Australasian Data Mining Conference (AusDM 2012)
Sydney, Australia
5-7 December 2012
http://ausdm12.togaware.com/

Data mining, the art and science of intelligent analysis of (usually large) data sets for meaningful (and previously unknown) insights, is now being actively applied in …


Bivariate linear mixed models using ASReml-R with multiple cores

May 7, 2012

A while ago I wanted to run a quantitative genetic analysis where the performance of genotypes in each site was considered as a different trait. If you think about it, with 70 sites and thousands of genotypes one is trying …


2nd round of call for chapter proposals for book Data Mining Applications with R: due by 31 May

May 2, 2012

2nd CALL FOR CHAPTERS: proposals due by 31 May 2012

Data Mining Applications with R
A book to be published by Elsevier
http://www.RDataMining.com/books/book2

Introduction

R is one of the most widely used data mining tools in scientific and business …


Late-April flotsam

April 25, 2012

It has been a month and a half since I compiled a list of statistical/programming internet flotsam and jetsam. Via Lambda The Ultimate: Evaluating the Design of the R Language: Objects and Functions For Data Analysis (PDF). A very detailed evaluation …


Introduction to Oracle R Connector for Hadoop

April 23, 2012

MapReduce, the heart of Hadoop, is a programming framework that enables massive scalability across servers using data stored in the Hadoop Distributed File System (HDFS). The Oracle R Connector for Hadoop (ORCH) provides access to a Hadoop cluster from R, enabling manipulation of HDFS-resident data and the execution of MapReduce jobs. Conceptually, MapReduce is similar...
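To make the MapReduce programming model mentioned here more concrete, the following is a minimal single-process sketch of its three conceptual steps (map, shuffle, reduce) applied to a word count. It is only an illustration in plain Python: it does not use Hadoop, HDFS, or the ORCH API, and every function name and the sample input are hypothetical.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and yield its (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def count_reducer(key, values):
    """Sum the counts emitted for one word."""
    return sum(values)


# Hypothetical input standing in for lines of a file stored on a cluster.
lines = ["R talks to Hadoop", "Hadoop runs MapReduce", "R runs jobs"]
counts = reduce_phase(shuffle(map_phase(lines, word_mapper)), count_reducer)
print(counts)
```

In a real cluster the map and reduce functions run in parallel on different servers and the shuffle moves data between them; here everything runs in one process so the data flow is easy to follow.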


Case Study: Network visualization with data from a 360° feedback – often wasted potential!

April 13, 2012

I assume that the reader of this paper knows the 360-degree method (also known as multi-source feedback or management feedback). This is a report of an authentic case. A total of 128 people participated as feedback receivers. Several thousand questionnaires were filled from …
