**Data Perspective**, and kindly contributed to R-bloggers)

For example, from a ticket booking engine database identifying clients with similar booking activities and group them together (called Clusters). Later these identified clusters can be targeted for business improvement by issuing special offers, etc.

Cluster Analysis falls into Unsupervised Learning algorithms, where in Data to be analyzed will be provided to a Cluster analysis algorithm to identify hidden patterns within as shown in the figure below.

In the image above, the cluster algorithm has grouped the input data into two groups. There are 3 Popular Clustering algorithms, Hierarchical Cluster Analysis, K-Means Cluster Analysis, Two-step Cluster Analysis, of which today I will be dealing with K-Means Clustering.**Explaining k-Means Cluster Algorithm: **

In K-means algorithm, k stands for the number of clusters (groups) to be formed, hence this algorithm can be used to group known number of groups within the Analyzed data.

K Means is an iterative algorithm and it has two steps. First is a Cluster Assignment Step, and second is a Move Centroid Step.

CLUSTER ASSIGNMENT STEP: In this step, we randomly chose two cluster points (red dot & green dot) and we assign each data point to one of the two cluster points whichever is closer to it. (Top part of the below image)

MOVE CENTROID STEP: In this step, we take the average of the points of all the examples in each group and move the Centroid to the new position i.e. mean position calculated. (Bottom part of the below image)

The above steps are repeated until all the data points are grouped into 2 groups and the mean of the data points at the end of Move Centroid Step doesn’t change.

By repeating the above steps the final output grouping of the input data will be obtained.

**Cluster Analysis on Accidental Deaths by Natural Causes in India using R**

**leave a comment**for the author, please follow the link and comment on their blog:

**Data Perspective**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...