In general, the standard practice for correcting for population stratification in genetic studies is to use principal components analysis (PCA) to categorize samples along different ethnic axes. Price et al. published on this in 20...
In the early stages of a literature review, you may have hundreds of papers and not know how to even begin sorting through them. In this post, I show you how to perform a two-stage clustering analysis with R so that you can identify the main groups within your data, based on key attributes of each paper.
Given a set of numerical datapoints, we often want to know how many clusters the datapoints form. Two practical algorithms for determining the number of clusters are the gap statistic and the prediction strength. Gap Statistic The gap statistic algorithm … Continue reading →![]()
Just developed a small crawler to check my online content at binfalse.de in terms of W3C validity and the availability of external links. Here is the code and some statistics...
Ladies and gentlemen, waiting is finally over. I'm proud to introduce Rumpel!
About Clustergrams In 2002, Matthias Schonlau published in “The Stata Journal” an article named “The Clustergram: A graph for visualizing hierarchical and . As explained in the abstract: In hierarchical cluster analysis dendrogram graphs are used to visualize how clusters are formed. I propose an alternative graph named “clustergram” to examine how cluster members are assigned to clusters as...
The authors here invited ACM KDD Innovation Award and IEEE ICDM Research Contributions Award winners to each nominate up to 10 best-known algorithms in data mining, including the algorithm name, justification for nomination, and a representative public...
Download "Getting Started with the Social Media Analytics Research Toolkit" (pdf, 1.25 megabytes) Download the Social Media Analytics Research Toolkit My Social Media Analytics Research Toolkit is about to move into private beta. What's in the release?...
We'll get back to code examples later this week, but wanted to let you know about an R package with updated functionality in the meantime.The appropriate analysis of sample surveys requires incorporation of complex design features, including stratification, clustering, weights, and finite population correction. These can be address in SAS and R for many common models. Section...