# K-means in images and R

(This article was first published on jkunst.com: Entries for category R, and kindly contributed to R-bloggers)

In an image, each pixel has a color, and that color has an RGB representation: a triplet of red, green, and blue intensities. For example, red in this representation is (255, 0, 0), black is (0, 0, 0), and white is (255, 255, 255). More generally, each color is a point in the 3D cube spanning (0, 0, 0) to (255, 255, 255). (The script below scales intensities to the range 0 to 250 rather than 0 to 255; the idea is the same.) We'll transform an image into a data.frame with 3 columns, where each observation is a pixel. To do this we need the ReadImages and rgl libraries. Let's take a look at the first part of the script:

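rm(list=ls())
library(ReadImages)  # provides read.jpeg()
library(rgl)

# Read the image
# (the file name below is a placeholder; the original path is not given)
image <- read.jpeg("image.jpg")
str(image)
plot(image)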

# Obtaining the size of the image
H <- dim(image)[1]
W <- dim(image)[2]

# Creating the data frame (one row per pixel, one column per channel)
rgb_image <- data.frame(r = as.vector(image[1:H, 1:W, 1]),
                        g = as.vector(image[1:H, 1:W, 2]),
                        b = as.vector(image[1:H, 1:W, 3]))

# I prefer to work with integer RGB values (scaled here from [0, 1] to 0..250)
rgb_image <- round(rgb_image*250)


And we obtain the plot of the original image.
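
The flattening step can be tried without an image file. The following is a minimal sketch on a made-up 2 x 2 array (the array `img` and its four colors are assumptions for illustration, not the image used in this post). It shows that `as.vector()` reads each channel column by column, so every row of the resulting data.frame is one pixel.

```r
# Hypothetical 2 x 2 RGB "image" with intensities in [0, 1], standing in
# for the array that an image reader would return.
img <- array(0, dim = c(2, 2, 3))
img[1, 1, ] <- c(1, 0, 0)  # top-left: red
img[2, 1, ] <- c(0, 0, 0)  # bottom-left: black
img[1, 2, ] <- c(1, 1, 1)  # top-right: white
img[2, 2, ] <- c(0, 0, 1)  # bottom-right: blue

H <- dim(img)[1]
W <- dim(img)[2]

# as.vector() walks each channel column by column, so the pixel order is
# (1,1), (2,1), (1,2), (2,2): red, black, white, blue.
pixels <- data.frame(r = as.vector(img[1:H, 1:W, 1]),
                     g = as.vector(img[1:H, 1:W, 2]),
                     b = as.vector(img[1:H, 1:W, 3]))
print(pixels)
```

Because the order is column by column, the same order has to be used later when the pixels are written back into an array.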

Now, for each observation, we'll obtain its color code with rgb(). Then we take a sample of pixels and plot those points in their respective colors; this is the result.

rgb_image$colors_hex <- rgb(rgb_image, max = 255)
rgb_image_sample <- rgb_image[sample(1:nrow(rgb_image), size = 6000),]

with(rgb_image_sample,{
plot3d(r, g, b, col = colors_hex, type='s', size=1, main = "Original Colors",
xlab = "Red", ylab = "Green", zlab = "Blue")
})
movie3d(spin3d(axis=c(0,0,1)), duration=7, fps=10, movie = "colors", type = "gif")

There are black, blue, and red points, and there are some yellow (red + green) points. Now we apply the k-means algorithm to these points and plot the result.

kms <- kmeans(rgb_image_sample[,1:3], centers=3)
rgb_image_sample$color_kmeans <- rgb(kms$centers[kms$cluster,], max = 255)

with(rgb_image_sample,{
plot3d(r, g, b, col = color_kmeans, type='s', size=1, main = "Color Clusters",
xlab = "Red", ylab = "Green", zlab = "Blue", scale = 0.2)
})
movie3d(spin3d(axis=c(0,0,1)), duration=5, fps=40, movie = "colors_kmeans", type = "gif")
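
The expression `kms$centers[kms$cluster, ]` does the real work here: it replaces every pixel with the center of the cluster it belongs to. Below is a minimal sketch of that indexing on made-up data (the two color groups, the seed, and the `nstart` value are assumptions for illustration). It uses intensities in [0, 1], so `rgb()` is called with its default maximum of 1.

```r
set.seed(1)  # k-means starts from random centers, so fix the seed

# Hypothetical pixel sample: 50 reddish and 50 bluish colors in [0, 1].
n <- 50
reddish <- cbind(r = runif(n, 0.8, 1.0), g = runif(n, 0.0, 0.2), b = runif(n, 0.0, 0.2))
bluish  <- cbind(r = runif(n, 0.0, 0.2), g = runif(n, 0.0, 0.2), b = runif(n, 0.8, 1.0))
pixels  <- rbind(reddish, bluish)

fit <- kmeans(pixels, centers = 2, nstart = 5)

# fit$centers has one row per cluster; fit$cluster has one label per pixel.
# Indexing the centers by the labels gives each pixel its cluster's color.
quantized   <- fit$centers[fit$cluster, ]
cluster_hex <- rgb(quantized)  # three-column matrix with values in [0, 1]

print(fit$centers)
print(table(cluster_hex))
```

With two well-separated groups, every pixel ends up with one of two color codes: the average red and the average blue.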


This illustrates the problem on a sample; now we apply it to the entire image and see the result.

kms <- kmeans(rgb_image[,1:3], centers=3)
rgb_new <- kms$centers[kms$cluster, ]
# Back to the original representation (intensities in [0, 1])
rgb_new <- rgb_new/250
image_new <- image

image_new[1:H, 1:W, 1] <- rgb_new[,1]
image_new[1:H, 1:W, 2] <- rgb_new[,2]
image_new[1:H, 1:W, 3] <- rgb_new[,3]

plot(image_new)

Finally we can loop over the k parameter and explore the evolution of the image.
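
One way to write such a loop is sketched below. It is not the code behind the original result: the random array `img`, the helper `quantize()`, and the range of k values are all hypothetical, chosen so that the block runs without an image file. Each iteration clusters the pixels and rebuilds an array of the same shape, which could then be plotted like `image_new` above.

```r
set.seed(42)

# Hypothetical 20 x 30 RGB array with random intensities in [0, 1],
# standing in for the array read from the image file.
H <- 20
W <- 30
img <- array(runif(H * W * 3), dim = c(H, W, 3))

# One row per pixel, one column per channel (column-by-column order).
pixels <- cbind(r = as.vector(img[, , 1]),
                g = as.vector(img[, , 2]),
                b = as.vector(img[, , 3]))

# Hypothetical helper: cluster the pixels into k colors and rebuild the array.
quantize <- function(pixels, k, dims) {
  fit <- kmeans(pixels, centers = k, iter.max = 50, nstart = 3)
  new_pixels <- fit$centers[fit$cluster, , drop = FALSE]
  # The matrix is stored column by column (all r, then g, then b), which is
  # exactly the layout of an H x W x 3 array.
  array(new_pixels, dim = dims)
}

ks <- 2:6
images_by_k <- lapply(ks, function(k) quantize(pixels, k, dim(img)))
names(images_by_k) <- paste0("k", ks)

# Count the distinct colors in each quantized image (at most k each).
n_colors <- sapply(images_by_k, function(a) {
  nrow(unique(cbind(as.vector(a[, , 1]), as.vector(a[, , 2]), as.vector(a[, , 3]))))
})
print(n_colors)
```

As k grows, each quantized image uses more colors and moves closer to the original.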

Maybe this is not a very useful application but it's a very good way to understand the idea of the k-means algorithm.
