R: k-Means Clustering on an Image

September 12, 2014

(This article was first published on Analysis with Programming, and kindly contributed to R-bloggers)

Enough with the theory we recently published; let’s take a break and have some fun with an application of statistics used in data mining and machine learning: k-means clustering.
k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. (Wikipedia, Ref 1.)
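
To make the definition concrete, here is a minimal sketch on made-up data (the two groups of points are an assumption for illustration, not part of the image example). It runs base R's kmeans() and then checks that every observation carries the label of its nearest cluster mean.

```r
# Toy data: two well-separated groups of 2-D points (made up for illustration).
set.seed(1)
pts <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2)
)

# Partition the n = 40 observations into k = 2 clusters.
fit <- kmeans(pts, centers = 2, nstart = 5)

# Squared distance from every point to every cluster mean (n x k matrix).
d2 <- sapply(seq_len(nrow(fit$centers)), function(j) {
  rowSums(sweep(pts, 2, fit$centers[j, ])^2)
})

# Each observation should carry the label of its nearest mean.
nearest <- max.col(-d2)
all(nearest == fit$cluster)
```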

We will apply this method to an image, grouping its pixels into k different clusters. Below is the image that we are going to use:

Colorful Bird From Wall321

We will utilize the following packages for input and output:

  1. jpeg – Read and write JPEG images; and,
  2. ggplot2 – An implementation of the Grammar of Graphics.

Download and Read the Image

Let’s get started by downloading the image to our workspace and telling R that our data is a JPEG file.
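
A minimal sketch of the reading step, assuming the jpeg package is installed. Since the image URL is not reproduced here, the sketch writes a small random image to a temporary file as a stand-in; the variable names path and img are our own choices.

```r
library(jpeg)

# Stand-in for the downloaded file: a small random RGB image written to a
# temporary JPEG. In practice, point `path` at the image you downloaded,
# e.g. with download.file(<image URL>, path, mode = "wb").
set.seed(1)
path <- tempfile(fileext = ".jpg")
writeJPEG(array(runif(20 * 30 * 3), dim = c(20, 30, 3)), target = path)

# readJPEG() returns a numeric array with values in [0, 1].
img <- readJPEG(path)
dim(img)
```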

Cleaning the Data

Extract the necessary information from the image and organize it for our computation:

The image is represented by a large array of pixels with dimensions rows by columns by channels (red, green, and blue, or RGB).
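
One way to organize the array is one row per pixel, holding its coordinates and its three channel values. The sketch below uses a small random array as a hypothetical stand-in for the image; the names imgDm and imgRGB are assumptions.

```r
# Hypothetical stand-in for the array returned by readJPEG().
set.seed(1)
img <- array(runif(4 * 6 * 3), dim = c(4, 6, 3))
imgDm <- dim(img)

# One row per pixel: x is the column index, y is the row index flipped so
# that the first image row ends up at the top of a plot.
imgRGB <- data.frame(
  x = rep(1:imgDm[2], each = imgDm[1]),
  y = rep(imgDm[1]:1, imgDm[2]),
  R = as.vector(img[, , 1]),
  G = as.vector(img[, , 2]),
  B = as.vector(img[, , 3])
)
head(imgRGB)
```

The y values are reversed because array rows count downward from the top of the image, while a plot's y axis counts upward.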


Plot the original image with ggplot2:
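
A sketch of the plotting step, assuming ggplot2 is installed and the pixels sit in a data frame with columns x, y, R, G and B. The random data is a stand-in for the real image, and theme_void() is our own choice of theme.

```r
library(ggplot2)

# Hypothetical stand-in for the pixel data frame built from the image.
set.seed(1)
imgRGB <- expand.grid(y = 20:1, x = 1:30)
imgRGB$R <- runif(nrow(imgRGB))
imgRGB$G <- runif(nrow(imgRGB))
imgRGB$B <- runif(nrow(imgRGB))

# Draw one point per pixel, coloured by its own RGB value.
p <- ggplot(imgRGB, aes(x = x, y = y)) +
  geom_point(colour = rgb(imgRGB$R, imgRGB$G, imgRGB$B)) +
  labs(title = "Original Image") +
  coord_fixed() +
  theme_void()
print(p)
```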


Apply k-Means clustering on the image:
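
A sketch of the clustering step using kmeans() from base R's stats package. The random pixel values are a stand-in for the image, and kClusters = 3 is an arbitrary choice for illustration.

```r
# Hypothetical stand-in for the R, G, B columns of the pixel data frame.
set.seed(1)
imgRGB <- data.frame(R = runif(600), G = runif(600), B = runif(600))

# Cluster the pixels in RGB space.
kClusters <- 3
kMeans <- kmeans(imgRGB[, c("R", "G", "B")], centers = kClusters)

# Replace every pixel's colour with the mean colour of its cluster.
centres <- kMeans$centers[kMeans$cluster, ]
kColours <- rgb(centres[, "R"], centres[, "G"], centres[, "B"])
length(unique(kColours))
```

Only the colour channels go into kmeans(); the x and y coordinates are left out so that pixels are grouped by colour rather than by position.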

Plot the clustered colours:
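
A sketch of the final plot, again on stand-in data and assuming ggplot2 is installed. It is the same plot as for the original image, except that each pixel is drawn in the mean colour of its cluster.

```r
library(ggplot2)

# Hypothetical stand-in for the pixel data frame built from the image.
set.seed(1)
imgRGB <- expand.grid(y = 20:1, x = 1:30)
imgRGB$R <- runif(nrow(imgRGB))
imgRGB$G <- runif(nrow(imgRGB))
imgRGB$B <- runif(nrow(imgRGB))

# Cluster in RGB space and look up each pixel's cluster mean colour.
kMeans <- kmeans(imgRGB[, c("R", "G", "B")], centers = 3)
centres <- kMeans$centers[kMeans$cluster, ]
kColours <- rgb(centres[, "R"], centres[, "G"], centres[, "B"])

p <- ggplot(imgRGB, aes(x = x, y = y)) +
  geom_point(colour = kColours) +
  labs(title = "k-Means Clustering of 3 Colours") +
  coord_fixed() +
  theme_void()
print(p)
```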

Possible clusterings of the pixels for different values of k:

Table 1: Different k-Means Clustering.

  Original    k = 6
  k = 5       k = 4
  k = 3       k = 2
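
To try several values of k in one go, the clustering can be wrapped in a loop. This is a sketch on stand-in data; the names rgbVals and fits, and the use of nstart = 3, are our own assumptions.

```r
# Hypothetical stand-in for the R, G, B values of the image pixels.
set.seed(1)
rgbVals <- cbind(R = runif(600), G = runif(600), B = runif(600))

# Fit one k-means model for each k from 2 to 6.
fits <- lapply(2:6, function(k) kmeans(rgbVals, centers = k, nstart = 3))

# Each fit has k cluster means, i.e. k colours available for the image.
sapply(fits, function(f) nrow(f$centers))
```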

I suggest you try it!


Reference

  1. K-means clustering. Wikipedia. Retrieved September 11, 2014.
