Posts Tagged ‘Clustering’

A New Dimension to Principal Components Analysis

October 27, 2011

In general, the standard practice for correcting for population stratification in genetic studies is to use principal components analysis (PCA) to categorize samples along different ethnic axes.  Price et al. published on this in 20...


Example 9.8: New stuff in SAS 9.3– Bayesian random effects models in Proc MCMC

October 4, 2011

Rounding off our reports on major new developments in SAS 9.3, today we'll talk about proc mcmc and the random statement. Stand-alone packages for fitting very general Bayesian models using Markov chain Monte Carlo (MCMC) methods have been available for...


How to do a quantitative literature review in R

May 17, 2011

In the early stages of a literature review, you may have hundreds of papers and not know how to even begin sorting through them. In this post, I show you how to perform a two-stage clustering analysis with R so that you can identify the main groups within your data, based on key attributes of each paper.


Counting Clusters

March 13, 2011

Given a set of numerical datapoints, we often want to know how many clusters the datapoints form. Two practical algorithms for determining the number of clusters are the gap statistic and the prediction strength. Gap Statistic: The gap statistic algorithm …
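As a rough illustration of the gap statistic idea mentioned in this teaser, here is a minimal Python sketch. It is not the code from the linked post. It assumes 1-D data, a simple Lloyd's k-means with quantile-based starting centers, and uniform reference data drawn over the observed range. The function names (kmeans_1d, within_dispersion, gap_statistic, choose_k) and the toy dataset are all made up for this example.

```python
import math
import random


def kmeans_1d(xs, k, iters=100):
    """Lloyd's k-means on 1-D data, started from evenly spaced quantiles."""
    s = sorted(xs)
    n = len(s)
    centers = [s[int((j + 0.5) * n / k)] for j in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda j: (x - centers[j]) ** 2) for x in xs]
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(xs, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def within_dispersion(xs, labels):
    """W_k: sum over clusters of squared distances to the cluster mean."""
    total = 0.0
    for lab in set(labels):
        members = [x for x, other in zip(xs, labels) if other == lab]
        mean = sum(members) / len(members)
        total += sum((x - mean) ** 2 for x in members)
    return total


def gap_statistic(xs, k_max, n_refs=20, seed=0):
    """Return (gaps, s) for k = 1..k_max, using uniform reference samples."""
    rng = random.Random(seed)
    lo, hi = min(xs), max(xs)
    gaps, sks = [], []
    for k in range(1, k_max + 1):
        log_w = math.log(within_dispersion(xs, kmeans_1d(xs, k)))
        ref_logs = []
        for _ in range(n_refs):
            ref = [rng.uniform(lo, hi) for _ in xs]
            ref_logs.append(math.log(within_dispersion(ref, kmeans_1d(ref, k))))
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean_ref - log_w)  # Gap(k) = E*[log W_k] - log W_k
        sks.append(sd * math.sqrt(1 + 1 / n_refs))
    return gaps, sks


def choose_k(gaps, sks):
    """Smallest k with Gap(k) >= Gap(k+1) - s_{k+1}; falls back to k_max."""
    for i in range(len(gaps) - 1):
        if gaps[i] >= gaps[i + 1] - sks[i + 1]:
            return i + 1
    return len(gaps)


# Toy data (an assumption for this sketch): two well-separated 1-D groups.
data_rng = random.Random(42)
data = [data_rng.gauss(0, 0.5) for _ in range(30)]
data += [data_rng.gauss(10, 0.5) for _ in range(30)]

gaps, sks = gap_statistic(data, k_max=4)
best_k = choose_k(gaps, sks)
print("Gap values:", [round(g, 2) for g in gaps])
print("Estimated number of clusters:", best_k)
```

On data this well separated, the selection rule should favor two clusters. On messier data it is worth inspecting the whole gap curve rather than trusting the single chosen k.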


web content analyzer

January 6, 2011

Just developed a small crawler to check my online content at binfalse.de in terms of W3C validity and the availability of external links. Here is the code and some statistics...


Welcome Rumpel!

October 18, 2010

Ladies and gentlemen, waiting is finally over. I'm proud to introduce Rumpel!


Clustergram: visualization and diagnostics for cluster analysis (R code)

June 15, 2010

About Clustergrams: In 2002, Matthias Schonlau published an article in “The Stata Journal” named “The Clustergram: A graph for visualizing hierarchical and nonhierarchical cluster analyses”. As explained in the abstract: In hierarchical cluster analysis dendrogram graphs are used to visualize how clusters are formed. I propose an alternative graph named “clustergram” to examine how cluster members are assigned to clusters as...


Top 10 Algorithms in Data Mining

April 23, 2010

The authors here invited ACM KDD Innovation Award and IEEE ICDM Research Contributions Award winners to each nominate up to 10 best-known algorithms in data mining, including the algorithm name, justification for nomination, and a representative public...


Social Media Analytics Research Toolkit (SMART@znmeb) Is Moving Into Private Beta

March 31, 2010

Download "Getting Started with the Social Media Analytics Research Toolkit" (pdf, 1.25 megabytes). Download the Social Media Analytics Research Toolkit. My Social Media Analytics Research Toolkit is about to move into private beta. What's in the release?...


Augmented support for complex survey designs in R

March 3, 2010

We'll get back to code examples later this week, but we wanted to let you know about an R package with updated functionality in the meantime. The appropriate analysis of sample surveys requires incorporation of complex design features, including stratification, clustering, weights, and finite population correction. These can be addressed in SAS and R for many common models. Section...
