Utilizing K-means to extract colours from your favourite images

November 11, 2018

(This article was first published on R on Chi's Impe[r]fect Blog, and kindly contributed to R-bloggers)




I have been playing with a package called imager, and its documentation has been extremely helpful! I have read through “getting started” as well as a few other tutorials and examples.

I love colours… Tools like colourlovers, Adobe Color CC and Canva Color Palette Generator are great for extracting colours from a photo (or just for getting colour palette inspiration), but I wanted to figure out a way to extract colours from an image using R.

Preps

Load the packages, load the image into R, and extract the RGB info from the image. First I need to load up an image. I’m using the abstract image below, with a bunch of different colours, which I created with Photoshop just for fun.

library(tidyverse) ## dplyr for data wrangling, ggplot2 for plotting
library(imager) ## such a fun package.  I want to learn more.
library(treemap) ## for the treemap of clustered colours
library(ggvoronoi) ## because I'm currently addicted to Voronoi diagrams.

## Load up the image using load.image function! 
im <- load.image("https://farm4.staticflickr.com/3316/3333507738_9d36d39f6d_b.jpg") ## colourful abstract image. 
#im <- load.image("https://farm9.staticflickr.com/8125/8659010017_54a885f12a_z.jpg") ## mainly blue
#im <- load.image("https://farm2.staticflickr.com/1939/30915465767_2d9a733510_z.jpg") ## animoji! 

## View the image I just loaded. 
plot(im, main="Original Image I Want To Get Some Colours Out Of!")

df_size <- dim(im)[1] * dim(im)[2]  ## number of rows I'd get if I used the image as it is (width x height)
max_row_num <- 150000 ## maximum number of rows I want (just to limit the size!)

## If the df is too big, it's too slow to process on my computer, so shrink the image.
## Note: imresize() scales both width and height by this ratio, so the pixel count
## shrinks by roughly the square of it and ends up below max_row_num.
shrink_ratio <- if (df_size > max_row_num) max_row_num / df_size else 1
im <- im %>% imresize(shrink_ratio)
# plot(im) if you want to check how the image has been resized.

## get RGB value at each pixel
im_rgb <- im %>% 
  as.data.frame(wide="c") %>%
  rename(red=c.1,green=c.2,blue=c.3) %>%
  mutate(hexvalue = rgb(red,green,blue)) ## rgb() turns red, green and blue values (0 to 1) into a hex string

## turn the image into grayscale and get the luminance "value" too. 
im_gray <- im %>%
  grayscale() %>%
  as.data.frame() 

## combine the RGB info and the luminance values, matching on pixel position.
im_df <- im_rgb %>% 
  inner_join(im_gray, by=c("x","y")) 

im_df %>% head()
##   x y         red     green      blue hexvalue     value
## 1 1 1 0.001528153 0.6012714 0.7346048  #0099BB 0.4360151
## 2 2 1 0.003336439 0.5943815 0.7305651  #0198BA 0.4320482
## 3 3 1 0.005112951 0.6265604 0.7657791  #01A0C3 0.4554402
## 4 4 1 0.005520480 0.6726386 0.8097380  #01ACCE 0.4875841
## 5 5 1 0.011926270 0.7021898 0.8367487  #03B3D5 0.5099122
## 6 6 1 0.011128983 0.6695389 0.8149510  #03ABD0 0.4880113
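
As a quick sanity check on the columns above, here is a minimal sketch that uses only base R and the first row of the printed output. It shows how rgb() maps channel values between 0 and 1 to a hex string. The luma weights (0.30, 0.59, 0.11) are my assumption, inferred from the printed numbers rather than taken from the imager documentation, but they reproduce the value column closely.

```r
# Channel values copied from the first row of the output above
red   <- 0.001528153
green <- 0.6012714
blue  <- 0.7346048

# rgb() expects values between 0 and 1 by default and returns a hex string
hexvalue <- rgb(red, green, blue)
print(hexvalue)  # "#0099BB", matching the hexvalue column

# Assumed luma weights (inferred from the printed data, not from the docs)
luma <- 0.30 * red + 0.59 * green + 0.11 * blue
print(round(luma, 4))  # 0.436, close to the value column (0.4360151)
```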

Applying K-Means Algorithm on RGB value from Image

Using the kmeans function is pretty simple. I’m selecting 12 as k in the example below, simply because I wanted to get 12 distinct colours from the picture.

## Pick a k value to run the k-means algorithm.
## To extract colours, I pick k as the number of colours I want back! 
my_k <- 12

## Run kmeans on the red, green and blue values to gather similar colours together.
## kmeans starts from random centres, so the palette can differ slightly between runs.
kmean_rgb <- kmeans(im_df %>% select(red,green,blue), centers=my_k)

## append the cluster id to im_df.
im_df$cluster_num <- kmean_rgb$cluster

## centre values can be used as the cluster colours! 
kmean_center <- kmean_rgb$centers %>% as.data.frame() %>% 
  mutate(group_hex = rgb(red,green,blue), cluster_num = row_number()) %>%
  inner_join(im_df %>% count(cluster_num), by="cluster_num")

## I can also save the colour palette for future use.
my_colour <- kmean_center$group_hex
my_colour
##  [1] "#12AAD8" "#F178C3" "#E9E0CE" "#DF3658" "#933DB6" "#659634" "#E62AB1"
##  [8] "#73D2B0" "#F5E86F" "#EAC547" "#4363C2" "#93D55F"
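
To see why the cluster centres work as palette colours, here is a small sketch on a made-up "image" with only two colour families. The toy data, the seed and the nstart value are my own choices for illustration and are not part of the original analysis. Each centre is the mean red, green and blue of the pixels in its cluster, so it can be passed straight to rgb().

```r
set.seed(42)  # kmeans() starts from random centres, so fix the seed

# Hypothetical toy "image": 50 reddish pixels and 50 bluish pixels
toy <- data.frame(
  red   = c(seq(0.9, 1, length.out = 50), rep(0, 50)),
  green = 0,
  blue  = c(rep(0, 50), seq(0.9, 1, length.out = 50))
)

# Ask for 2 clusters; nstart = 5 tries several random starts and keeps the best
fit <- kmeans(toy, centers = 2, nstart = 5)

# Each centre is the average colour of its cluster
centres     <- as.data.frame(fit$centers)
toy_palette <- rgb(centres$red, centres$green, centres$blue)

print(sort(toy_palette))   # one blue-ish and one red-ish hex value
print(table(fit$cluster))  # how many pixels fell into each cluster
```

With a real photo the clusters are nowhere near this clean, which is why the palette can shift a little from run to run unless a seed is set.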

Viewing Colour Palettes

I wanted to see the colour palette visually, instead of as a list of hex values…

num_row = floor(my_k/4)
kmean_center %>% 
  ggplot(aes(x=(cluster_num-1)%%num_row,y=floor((cluster_num-1)/num_row))) + 
  geom_tile(aes(fill=group_hex)) +
  geom_label(aes(label=paste(cluster_num,":",group_hex,"\n",n,"pixels")), 
             family="Roboto Condensed", lineheight=0.8) +
  scale_fill_manual(values=sort(kmean_center$group_hex), guide="none") +
  theme_void(base_family="Roboto Condensed") +
  labs(subtitle=paste0("k-means cluster centre colours with ", my_k, " clusters" )) +
  scale_y_reverse()
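
The tile positions in the plot above come from simple modulo and integer-division arithmetic on the cluster number. The sketch below reproduces just that arithmetic in base R for k = 12. Note that num_row, despite its name, ends up acting as the number of tiles per row, so the palette is laid out 3 wide and 4 tall.

```r
my_k    <- 12
num_row <- floor(my_k / 4)   # 3: used as the number of tiles per row

cluster_num <- seq_len(my_k)
grid <- data.frame(
  cluster_num = cluster_num,
  x = (cluster_num - 1) %% num_row,        # column index: 0, 1, 2, 0, 1, 2, ...
  y = floor((cluster_num - 1) / num_row)   # row index: 0, 0, 0, 1, 1, 1, ...
)
print(grid)
```

One caveat, assuming the same formula is reused: with fewer than 4 clusters num_row becomes 0 and the modulo no longer gives usable positions.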

Which colours get clustered together?

I wanted to see which colours were bundled together in the same cluster. I took a sample because plotting every colour simply took way too long on my machine…

im_df %>% 
  sample_n(size=10000) %>% ## I'm just going to take sample to make the drawing bit faster...
  count(cluster_num,hexvalue) %>%
  treemap(index=c("cluster_num","hexvalue"),
          type="color",
          vSize="n", 
          vColor="hexvalue",
          algorithm = "squarified",
          fontfamily.labels="Roboto Condensed", 
          fontfamily.title="Roboto Condensed",
          border.col=c("#ffffff","#ffffff50"),
          fontsize.labels=c(24,0),
          aspRatio=16/9,
          title="Clustering with RGB")

Bonus: Voronoi Abstract Art

This is more fun to do on other images, but I just wanted to use the ggvoronoi package because I’m currently in love with Voronoi diagrams.

Looks like stained glass! 🙂

## Create a mini set; weight = 1 - value means darker pixels (smaller grayscale value) are picked more often.
im_df_mini <-im_df %>%
  sample_n(size=1000, weight=(1-value)) 

im_df_mini %>%
  ggplot(aes(x=x,y=y)) +
  geom_voronoi(aes(fill=hexvalue), color="#000000", size=0.1) +
  scale_fill_manual(values=sort(unique(im_df_mini$hexvalue)), guide="none") +
  theme_void() +
  scale_y_reverse() +
  coord_fixed()
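
The mini set above relies on weighted sampling. As a rough base R illustration of the same idea, the sketch below uses sample() with prob instead of sample_n() with weight. The grayscale values and the seed are made up for the example. Because the weight is 1 - value, dark pixels are drawn more often than light ones.

```r
set.seed(1)

# Hypothetical grayscale values: 0 is black, 1 is white
value <- seq(0, 1, length.out = 1001)

# Same idea as sample_n(weight = 1 - value): darker pixels get larger weights
picked <- sample(seq_along(value), size = 200, prob = 1 - value)

print(mean(value))          # average brightness of all pixels
print(mean(value[picked]))  # lower, since dark pixels are favoured
```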

Using Plotly package

I’ve always been curious about plotly, but I hadn’t had a chance to use it yet. Since RGB is three separate values, I wanted to plot them on x, y and z axes (and I didn’t know how to do that in ggplot2), so here’s yet another way to see how the colours got clustered.

library(plotly)
im_df %>% sample_n(size=2000) %>% 
  plot_ly(x = ~red, y=~blue, z=~green, color=~cluster_num, colors=my_colour, size=~value) %>%
  add_markers()   
