# R: k-Means Clustering on an Image

September 12, 2014

Enough with the theory we recently published; let's take a break and have fun with an application of statistics used in data mining and machine learning: k-means clustering.

k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. (Wikipedia, Ref 1.)
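In symbols, the partition described above is usually stated as minimizing the within-cluster sum of squares. This is the standard textbook formulation, with $x$ an observation, $S_1, \dots, S_k$ the clusters, and $\mu_i$ the mean of cluster $S_i$:

$$
\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2,
\qquad
\mu_i = \frac{1}{|S_i|} \sum_{x \in S_i} x
$$

For an image, each observation $x$ is the (R, G, B) triple of one pixel.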

We will apply this method to an image, grouping its pixels into k different clusters. Below is the image that we are going to use:

[Image: Colorful Bird, from Wall321]

We will utilize the following packages for input and output:

1. jpeg – Read and write JPEG images; and,
2. ggplot2 – An implementation of the Grammar of Graphics.

Let's get started by downloading the image to our workspace and reading it into R as a JPEG file.
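A minimal sketch of the reading step, assuming the `jpeg` package is installed. The image's download location is not given here, so the sketch writes a small synthetic image to a temporary file as a stand-in; `img_path` is a hypothetical name and should point at your own downloaded file.

```r
library(jpeg)

# Stand-in for the downloaded bird image: a small synthetic RGB array
# (values in [0, 1]) written to a temporary JPEG file.
set.seed(1)
synthetic <- array(runif(40 * 60 * 3), dim = c(40, 60, 3))
img_path <- tempfile(fileext = ".jpg")  # hypothetical path; use your own file
writeJPEG(synthetic, target = img_path)

# Read the JPEG; the result is a numeric array: rows x columns x channels
img <- readJPEG(img_path)
dim(img)
```

With a real download, only the `readJPEG()` call is needed once the file is on disk.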

### Cleaning the Data

Extract the necessary information from the image and organize it for our computation. The image is represented by a large array of pixels with dimensions rows by columns by channels (red, green, and blue, or RGB).
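One way to flatten such an array into a table with one row per pixel is sketched below. The names `imgDm` and `imgRGB` are illustrative, and a random array stands in for the image so the block runs on its own.

```r
# Stand-in for the array returned by readJPEG(): rows x columns x channels
set.seed(1)
img <- array(runif(40 * 60 * 3), dim = c(40, 60, 3))

imgDm <- dim(img)  # c(rows, columns, channels)

# One row per pixel. as.vector() walks each channel column by column,
# so x repeats each column index and y counts rows from the top down.
imgRGB <- data.frame(
  x = rep(1:imgDm[2], each = imgDm[1]),
  y = rep(imgDm[1]:1, imgDm[2]),
  R = as.vector(img[, , 1]),
  G = as.vector(img[, , 2]),
  B = as.vector(img[, , 3])
)
head(imgRGB)
```

Reversing `y` keeps the image upright when plotted, since array row 1 is the top of the picture while a plot's y-axis grows upward.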

### Plotting

Plot the original image using the following code:
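A sketch of the plot, assuming a pixel table with columns `x`, `y`, `R`, `G`, and `B` (illustrative names). A small colour gradient stands in for the bird image, and each pixel is drawn as a point in its own colour.

```r
library(ggplot2)

# Stand-in pixel table: a 30 x 60 colour gradient, one row per pixel
imgRGB <- expand.grid(y = 30:1, x = 1:60)
imgRGB$R <- imgRGB$x / 60
imgRGB$G <- imgRGB$y / 30
imgRGB$B <- 0.5

# Draw every pixel as a point coloured by its own RGB value
p <- ggplot(data = imgRGB, aes(x = x, y = y)) +
  geom_point(colour = rgb(imgRGB$R, imgRGB$G, imgRGB$B)) +
  labs(title = "Original Image") +
  xlab("x") +
  ylab("y")
print(p)
```

For large images, `geom_raster()` may render faster than points; points are used here only to keep the sketch simple.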

### Clustering

Apply k-Means clustering on the image:
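A sketch of the clustering step using base R's `kmeans()` on the three colour columns. The variable names, the choice of `k = 3`, and the gradient used as a stand-in image are all assumptions made for illustration.

```r
set.seed(2014)  # k-means starts from random centres, so fix the seed

# Stand-in pixel table: a 30 x 60 colour gradient, one row per pixel
imgRGB <- expand.grid(y = 30:1, x = 1:60)
imgRGB$R <- imgRGB$x / 60
imgRGB$G <- imgRGB$y / 30
imgRGB$B <- 0.5

kClusters <- 3
kMeans <- kmeans(imgRGB[, c("R", "G", "B")], centers = kClusters,
                 iter.max = 50, nstart = 3)

# Replace every pixel's colour with the centre of its cluster
centres <- kMeans$centers[kMeans$cluster, ]
kColours <- rgb(centres[, "R"], centres[, "G"], centres[, "B"])
table(kColours)
```

Only the colour values are clustered; the pixel positions `x` and `y` are left out, so pixels are grouped by colour similarity rather than by location.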

Plot the clustered colours:
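A sketch of this plot, wrapped in a hypothetical helper, `plot_clustered()`, that takes the pixel positions and one colour per pixel (the cluster-centre colour). The four-pixel input is made up purely to show the call.

```r
library(ggplot2)

# Hypothetical helper: draw each pixel in the colour assigned to its cluster
plot_clustered <- function(pixels, colours, k) {
  ggplot(data = pixels, aes(x = x, y = y)) +
    geom_point(colour = colours) +
    labs(title = paste("k-Means Clustering of", k, "Colours")) +
    xlab("x") +
    ylab("y")
}

# Usage on a tiny stand-in: four pixels, two cluster colours
pixels <- data.frame(x = c(1, 2, 1, 2), y = c(2, 2, 1, 1))
p <- plot_clustered(pixels,
                    colours = c("#CC3333", "#CC3333", "#3333CC", "#3333CC"),
                    k = 2)
print(p)
```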

Clusterings of the pixels for different values of k:

Table 1: Different k-Means Clustering.

| Original | k = 6 |
| --- | --- |
| k = 5 | k = 4 |
| k = 3 | k = 2 |
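To produce panels like those in Table 1, the clustering can be repeated over several values of k. The sketch below does so on a stand-in gradient image; `pixels`, `ks`, and `quantised` are illustrative names.

```r
set.seed(2014)

# Stand-in pixel table: a 30 x 60 colour gradient, one row per pixel
pixels <- expand.grid(y = 30:1, x = 1:60)
pixels$R <- pixels$x / 60
pixels$G <- pixels$y / 30
pixels$B <- 0.5

ks <- 2:6

# For each k, cluster the colours and map every pixel to its cluster centre
quantised <- lapply(ks, function(k) {
  fit <- kmeans(pixels[, c("R", "G", "B")], centers = k,
                iter.max = 50, nstart = 3)
  centres <- fit$centers[fit$cluster, , drop = FALSE]
  rgb(centres[, "R"], centres[, "G"], centres[, "B"])
})
names(quantised) <- paste("k =", ks)

# Number of distinct colours left in the image for each k
sapply(quantised, function(cols) length(unique(cols)))
```

Each element of `quantised` is a colour vector with one entry per pixel, ready to be passed to a plotting call.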

I suggest you try it!

### Reference

1. K-means clustering. Wikipedia. Retrieved September 11, 2014.
