# Image Classification in R using trained TensorFlow models

[This article was first published on

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

**Random Thoughts on R**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

#
Recently RStudio has released a package that allows to use TensorFlow in R. Also recently several trained models for image classification have been released. In this post I describe how to use the VGG16 model in R to produce an image classification like this:
(image taken from: https://www.flickr.com/photos/jonner/487461265)
The code is available on github. In that directory there is also a python file `load_vgg16.py`

for checking the validity of the R-code against the python implementation in which the models are published.
As a first step we download the VGG16 weights `vgg_16.tar.gz`

from here and extract it. You should get a file named `vgg_16.ckpt`

which we will need later.

`load_vgg16.py`

for checking the validity of the R-code against the python implementation in which the models are published.`vgg_16.tar.gz`

from here and extract it. You should get a file named `vgg_16.ckpt`

which we will need later.## Building the model

#
We now define the model. Note that since the network is written in the default graph, we do a clean start by resetting the default graph.
library(tensorflow)
slim = tf$contrib$slim #Poor mans import tensorflow.contrib.slim as slim
tf$reset_default_graph() # Better to start from scratch

We start with a placeholder tensor in which we later feed the images. The model works on a batch of images and thus needs a tensor of order 4 (an array having 4 indices). The first index of the tensor counts the image number and the second to 4th index is for the width, height, color. Since we want to allow for an arbitrary number of images of arbitrary size, we leave these dimensions open. We only specify that there should be 3 color channels (rgb). Then these images are rescaled with TensorFlow to the size (224, 224) as needed by the network.
# Resizing the images
images = tf$placeholder(tf$float32, shape(NULL, NULL, NULL, 3))
imgs_scaled = tf$image$resize_images(images, shape(224,224))

We are now defining the VGG16 model. Luckily there is a package TensorFlow-Slim included in the TensorFlow installation, which allows to easily build networks.
# Definition of the network
library(magrittr)
# The last layer is the fc8 Tensor holding the logits of the 1000 classes
fc8 = slim$conv2d(imgs_scaled, 64, shape(3,3), scope='vgg_16/conv1/conv1_1') %>%
slim$conv2d(64, shape(3,3), scope='vgg_16/conv1/conv1_2') %>%
slim$max_pool2d( shape(2, 2), scope='vgg_16/pool1') %>%
slim$conv2d(128, shape(3,3), scope='vgg_16/conv2/conv2_1') %>%
slim$conv2d(128, shape(3,3), scope='vgg_16/conv2/conv2_2') %>%
slim$max_pool2d( shape(2, 2), scope='vgg_16/pool2') %>%
slim$conv2d(256, shape(3,3), scope='vgg_16/conv3/conv3_1') %>%
slim$conv2d(256, shape(3,3), scope='vgg_16/conv3/conv3_2') %>%
slim$conv2d(256, shape(3,3), scope='vgg_16/conv3/conv3_3') %>%
slim$max_pool2d(shape(2, 2), scope='vgg_16/pool3') %>%
slim$conv2d(512, shape(3,3), scope='vgg_16/conv4/conv4_1') %>%
slim$conv2d(512, shape(3,3), scope='vgg_16/conv4/conv4_2') %>%
slim$conv2d(512, shape(3,3), scope='vgg_16/conv4/conv4_3') %>%
slim$max_pool2d(shape(2, 2), scope='vgg_16/pool4') %>%
slim$conv2d(512, shape(3,3), scope='vgg_16/conv5/conv5_1') %>%
slim$conv2d(512, shape(3,3), scope='vgg_16/conv5/conv5_2') %>%
slim$conv2d(512, shape(3,3), scope='vgg_16/conv5/conv5_3') %>%
slim$max_pool2d(shape(2, 2), scope='vgg_16/pool5') %>%
slim$conv2d(4096, shape(7, 7), padding='VALID', scope='vgg_16/fc6') %>%
slim$conv2d(4096, shape(1, 1), scope='vgg_16/fc7') %>%
# Setting the activation_fn=NULL does not work, so we get a ReLU
slim$conv2d(1000, shape(1, 1), scope='vgg_16/fc8') %>%
tf$squeeze(shape(1, 2), name='vgg_16/fc8/squeezed')

We can visualize the model in tensorboard, by saving the default graph via:
tf$train$SummaryWriter('/tmp/dumm/vgg16', tf$get_default_graph())$close()

You can now open a shell and start tensorboard
tensorboard --logdir /tmp/dumm/

You should get a result like:

We now define the model. Note that since the network is written in the default graph, we do a clean start by resetting the default graph.

library(tensorflow) slim = tf$contrib$slim #Poor mans import tensorflow.contrib.slim as slim tf$reset_default_graph() # Better to start from scratch

We start with a placeholder tensor in which we later feed the images. The model works on a batch of images and thus needs a tensor of order 4 (an array having 4 indices). The first index of the tensor counts the image number and the second to 4th index is for the width, height, color. Since we want to allow for an arbitrary number of images of arbitrary size, we leave these dimensions open. We only specify that there should be 3 color channels (rgb). Then these images are rescaled with TensorFlow to the size (224, 224) as needed by the network.

# Resizing the images images = tf$placeholder(tf$float32, shape(NULL, NULL, NULL, 3)) imgs_scaled = tf$image$resize_images(images, shape(224,224))

We are now defining the VGG16 model. Luckily there is a package TensorFlow-Slim included in the TensorFlow installation, which allows to easily build networks.

# Definition of the network library(magrittr) # The last layer is the fc8 Tensor holding the logits of the 1000 classes fc8 = slim$conv2d(imgs_scaled, 64, shape(3,3), scope='vgg_16/conv1/conv1_1') %>% slim$conv2d(64, shape(3,3), scope='vgg_16/conv1/conv1_2') %>% slim$max_pool2d( shape(2, 2), scope='vgg_16/pool1') %>% slim$conv2d(128, shape(3,3), scope='vgg_16/conv2/conv2_1') %>% slim$conv2d(128, shape(3,3), scope='vgg_16/conv2/conv2_2') %>% slim$max_pool2d( shape(2, 2), scope='vgg_16/pool2') %>% slim$conv2d(256, shape(3,3), scope='vgg_16/conv3/conv3_1') %>% slim$conv2d(256, shape(3,3), scope='vgg_16/conv3/conv3_2') %>% slim$conv2d(256, shape(3,3), scope='vgg_16/conv3/conv3_3') %>% slim$max_pool2d(shape(2, 2), scope='vgg_16/pool3') %>% slim$conv2d(512, shape(3,3), scope='vgg_16/conv4/conv4_1') %>% slim$conv2d(512, shape(3,3), scope='vgg_16/conv4/conv4_2') %>% slim$conv2d(512, shape(3,3), scope='vgg_16/conv4/conv4_3') %>% slim$max_pool2d(shape(2, 2), scope='vgg_16/pool4') %>% slim$conv2d(512, shape(3,3), scope='vgg_16/conv5/conv5_1') %>% slim$conv2d(512, shape(3,3), scope='vgg_16/conv5/conv5_2') %>% slim$conv2d(512, shape(3,3), scope='vgg_16/conv5/conv5_3') %>% slim$max_pool2d(shape(2, 2), scope='vgg_16/pool5') %>% slim$conv2d(4096, shape(7, 7), padding='VALID', scope='vgg_16/fc6') %>% slim$conv2d(4096, shape(1, 1), scope='vgg_16/fc7') %>% # Setting the activation_fn=NULL does not work, so we get a ReLU slim$conv2d(1000, shape(1, 1), scope='vgg_16/fc8') %>% tf$squeeze(shape(1, 2), name='vgg_16/fc8/squeezed')

We can visualize the model in tensorboard, by saving the default graph via:

tf$train$SummaryWriter('/tmp/dumm/vgg16', tf$get_default_graph())$close()

You can now open a shell and start tensorboard

tensorboard --logdir /tmp/dumm/

You should get a result like:

## Loading the weights

#
We start a Session and restore the model weights from the downloaded weight file.
restorer = tf$train$Saver()
sess = tf$Session()
restorer$restore(sess, '/Users/oli/Dropbox/server_sync/tf_slim_models/vgg_16.ckpt')

We start a Session and restore the model weights from the downloaded weight file.

restorer = tf$train$Saver() sess = tf$Session() restorer$restore(sess, '/Users/oli/Dropbox/server_sync/tf_slim_models/vgg_16.ckpt')

## Loading the images

#
Now it’s time to load the image. The values have to be in the range of 0 to 255. Therefore I multiply the values by 255. Further, we need to feed the placeholder Tensor with an array of order 4.
library(jpeg)
img1 <- readJPEG('apple.jpg')
d = dim(img1)
imgs = array(255*img1, dim = c(1, d[1], d[2], d[3])) #We need array of order 4

Now it’s time to load the image. The values have to be in the range of 0 to 255. Therefore I multiply the values by 255. Further, we need to feed the placeholder Tensor with an array of order 4.

library(jpeg) img1 <- readJPEG('apple.jpg') d = dim(img1) imgs = array(255*img1, dim = c(1, d[1], d[2], d[3])) #We need array of order 4

## Feeding and fetching the graph

#
Now we have a graph in the session with the correct weights. We can do the predictions by feeding the placeholder tensor `images`

with the value of the images stored in the array `imgs`

. We fetch the `fc8`

tensor from the graph and store it in `fc8_vals`

.
fc8_vals = sess$run(fc8, dict(images = imgs))
fc8_vals[1:5] #In python [-2.86833096 0.7060132 -1.32027602 -0.61107934 -1.67312801]
## [1] 0.0000000 0.7053483 0.0000000 0.0000000 0.0000000

When comparing it with the python result, we see that negative values are clamped to zero. This is due to the fact that in this R implementation I could not deactivate the final ReLu operation. Nevertheless, we are only interested in the positive values which we transfer to probabilities for the certain classes via
probs = exp(fc8_vals)/sum(exp(fc8_vals))

We sort for the highest probabilities and also load the descriptions of the image net classes and produce the final plot.
idx = sort.int(fc8_vals, index.return = TRUE, decreasing = TRUE)$ix[1:5]
# Reading the class names
library(readr)
names = read_delim("imagenet_classes.txt", "\t", escape_double = FALSE, trim_ws = TRUE,col_names = FALSE)
### Graph
library(grid)
g = rasterGrob(img1, interpolate=TRUE)
text = ""
for (id in idx) {
text = paste0(text, names[id,][[1]], " ", round(probs[id],5), "\n")
}
library(ggplot2)
ggplot(data.frame(d=1:3)) + annotation_custom(g) +
annotate('text',x=0.05,y=0.05,label=text, size=7, hjust = 0, vjust=0, color='blue') + xlim(0,1) + ylim(0,1)

Now since we can load trained models, we can do many cool things like transfer learning etc. More maybe another time.

Now we have a graph in the session with the correct weights. We can do the predictions by feeding the placeholder tensor

`images`

with the value of the images stored in the array `imgs`

. We fetch the `fc8`

tensor from the graph and store it in `fc8_vals`

.fc8_vals = sess$run(fc8, dict(images = imgs)) fc8_vals[1:5] #In python [-2.86833096 0.7060132 -1.32027602 -0.61107934 -1.67312801] ## [1] 0.0000000 0.7053483 0.0000000 0.0000000 0.0000000

When comparing it with the python result, we see that negative values are clamped to zero. This is due to the fact that in this R implementation I could not deactivate the final ReLu operation. Nevertheless, we are only interested in the positive values which we transfer to probabilities for the certain classes via

probs = exp(fc8_vals)/sum(exp(fc8_vals))

We sort for the highest probabilities and also load the descriptions of the image net classes and produce the final plot.

idx = sort.int(fc8_vals, index.return = TRUE, decreasing = TRUE)$ix[1:5] # Reading the class names library(readr) names = read_delim("imagenet_classes.txt", "\t", escape_double = FALSE, trim_ws = TRUE,col_names = FALSE) ### Graph library(grid) g = rasterGrob(img1, interpolate=TRUE) text = "" for (id in idx) { text = paste0(text, names[id,][[1]], " ", round(probs[id],5), "\n") } library(ggplot2) ggplot(data.frame(d=1:3)) + annotation_custom(g) + annotate('text',x=0.05,y=0.05,label=text, size=7, hjust = 0, vjust=0, color='blue') + xlim(0,1) + ylim(0,1)

Now since we can load trained models, we can do many cool things like transfer learning etc. More maybe another time.

To

**leave a comment**for the author, please follow the link and comment on their blog:**Random Thoughts on R**.R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.