R: k-Means Clustering on an Image

Posted on September 12, 2014 by Al-Ahmadgaid Asaad in R bloggers

[This article was first published on Analysis with Programming, and kindly contributed to R-bloggers].

Enough with the theory we recently published; let’s take a break and have some fun with an application of statistics used in data mining and machine learning: k-means clustering.

k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. (Wikipedia, Ref 1.)
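To make the definition concrete, here is a minimal sketch using base R's `kmeans()` on toy data. The two simulated point clouds, the seed, and the variable names (`pts`, `fit`) are assumptions for illustration only and are not part of the original analysis.

```r
# Toy data (assumed for illustration): two well-separated 2-D point clouds
set.seed(42)
pts <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),  # 20 points near (0, 0)
  matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2)   # 20 points near (3, 3)
)

# Partition the n = 40 observations into k = 2 clusters;
# nstart = 5 tries several random initializations and keeps the best one
fit <- kmeans(pts, centers = 2, nstart = 5)

fit$centers         # the cluster means, i.e. the "prototypes"
table(fit$cluster)  # number of observations assigned to each cluster
```

Each row of `fit$centers` is the mean of one cluster, and `fit$cluster` records which of those means each observation was assigned to.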

We will apply this method to an image, grouping its pixels into k different clusters. Below is the image that we are going to use:

Colorful Bird From Wall321

We will utilize the following packages for input and output:

jpeg – Read and write JPEG images; and,
ggplot2 – An implementation of the Grammar of Graphics.

Download and Read the Image

Let’s get started by downloading the image to our workspace and telling R that our data is a JPEG file.
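A minimal sketch of this step might look like the following. The helper name `read_jpeg_from_url` and the URL in the usage comment are hypothetical placeholders, since the image's location is not given here; `download.file()` comes from base R and `readJPEG()` from the jpeg package, which needs to be installed before the helper is called.

```r
# Hypothetical helper (name assumed): download a JPEG and read it into R
read_jpeg_from_url <- function(url, dest = tempfile(fileext = ".jpg")) {
  # mode = "wb" writes the file in binary mode so it is not corrupted on Windows
  download.file(url, destfile = dest, mode = "wb")
  # readJPEG() returns a numeric array with values in [0, 1]
  jpeg::readJPEG(dest)
}

# Usage (placeholder URL, replace with the actual image location):
# img <- read_jpeg_from_url("https://example.com/colorful-bird.jpg")
# dim(img)
```

Writing to a temporary file by default keeps the working directory clean; pass `dest` explicitly if you want to keep the downloaded file.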

Cleaning the Data

Extract the necessary information from the image and organize it for our computation:

The image is represented as a large array of pixels with dimensions rows by columns by channels, where the channels are red, green, and blue (RGB).
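One way to organize such an array is to flatten it into a data frame with one row per pixel. The sketch below uses a small random array as a stand-in for the real image, so it runs without the JPEG file; the names `img`, `imgDm`, and `imgRGB` are assumptions for illustration.

```r
# Stand-in for the real image (assumed): 4 rows x 3 columns x 3 channels
set.seed(1)
img <- array(runif(4 * 3 * 3), dim = c(4, 3, 3))

imgDm <- dim(img)  # c(rows, columns, channels)

# One row per pixel. as.vector() walks each channel column by column,
# so x repeats each column index once per row, and y counts the rows
# downward so that the top row of the image gets the largest y value.
imgRGB <- data.frame(
  x = rep(1:imgDm[2], each = imgDm[1]),
  y = rep(imgDm[1]:1, times = imgDm[2]),
  R = as.vector(img[, , 1]),
  G = as.vector(img[, , 2]),
  B = as.vector(img[, , 3])
)

head(imgRGB)
```

Reversing `y` matters only for plotting: array rows are indexed from the top, while plot axes grow upward, so this keeps the image from appearing upside down.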