Convolutional Neural Network under the Hood

[This article was first published on R – Hi! I am Nagdev, and kindly contributed to R-bloggers.]

Neural networks have really taken over image recognition and high-sample-rate data problems in the last couple of years. In all honesty, I promise I won’t be teaching you what neural networks or CNNs are. There are hundreds of resources published every day explaining them. I’ll post a few links below.

I am a serious R user and very new to the deep learning domain. As I started coming across new image classification projects, I started to lean towards CNNs. I went through a few tutorials on image classification using CNNs and read a few books. After a few, I started to see the same old pattern in every blog post:

  1. Download data set
  2. Split them three ways (train/test/validation)
  3. Create a model (in most cases pre-trained models)
  4. Set up generators
  5. Compile the model
  6. Predict
  7. End

I understood the concepts of filters, filter size and activation functions. But I was curious about what the network was actually seeing through the filters. I did a lot of digging and found a Stack Overflow post linking to RStudio’s Keras FAQ. It took literally 3 lines of code to visualize what was happening at each layer. Meanwhile, in Python it was over two dozen lines of code. (Irony!) I thought there might be quite a few people out there who, just like me, would be interested in knowing how to do this in R. So, I decided to write this blog post. It should be very useful when you are explaining a model to your boss or a work colleague.

Let’s get started!

Initial Setup

Downloading Data Set

For this example, I will be using the cats and dogs data set from Kaggle. You can follow the link and download the data. You might have to create an account to download it.

If you have your own data then don’t worry about this step. Skip it.

Load Keras library

library(keras)

Split the data into train and test

The code below is courtesy of the RStudio blog.

original_dataset_dir = "/home/rstudio/train"
base_dir = "/home/rstudio/data"

train_dir = file.path(base_dir, "train")
test_dir = file.path(base_dir, "test")
train_cats_dir = file.path(train_dir, "cats")
train_dogs_dir = file.path(train_dir, "dogs")
test_cats_dir = file.path(test_dir, "cats")
test_dogs_dir = file.path(test_dir, "dogs")

# create the class directories first; file.copy() does not create them
for (d in c(train_cats_dir, train_dogs_dir, test_cats_dir, test_dogs_dir)) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# cats 1-2000 go to train, cats 2001-3000 go to test
fnames = paste0("cat.", 1:2000, ".jpg")
file.copy(file.path(original_dataset_dir, fnames),
          file.path(train_cats_dir))

fnames = paste0("cat.", 2001:3000, ".jpg")
file.copy(file.path(original_dataset_dir, fnames),
          file.path(test_cats_dir))

# dogs 1-2000 go to train, dogs 2001-3000 go to test
fnames = paste0("dog.", 1:2000, ".jpg")
file.copy(file.path(original_dataset_dir, fnames),
          file.path(train_dogs_dir))

fnames = paste0("dog.", 2001:3000, ".jpg")
file.copy(file.path(original_dataset_dir, fnames),
          file.path(test_dogs_dir))
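
Before moving on, it can be worth checking that every class folder holds the number of images you expect. The snippet below is only a sketch: count_images is a hypothetical helper (not part of Keras), and the demo runs on a temporary folder with empty stand-in files rather than the real Kaggle images.

```r
# Hypothetical helper: count the files in each class sub-directory of a split
count_images = function(split_dir) {
  class_dirs = list.dirs(split_dir, full.names = TRUE, recursive = FALSE)
  counts = vapply(class_dirs, function(d) length(list.files(d)), integer(1))
  names(counts) = basename(class_dirs)
  counts
}

# Self-contained demo on a temporary directory with empty stand-in files
demo_dir = file.path(tempdir(), "split_demo")
for (cls in c("cats", "dogs")) {
  dir.create(file.path(demo_dir, cls), recursive = TRUE, showWarnings = FALSE)
}
file.create(file.path(demo_dir, "cats", paste0("cat.", 1:3, ".jpg")))
file.create(file.path(demo_dir, "dogs", paste0("dog.", 1:2, ".jpg")))

demo_counts = count_images(demo_dir)
print(demo_counts)
```

With the paths used in this post, count_images("/home/rstudio/data/train") should report 2000 images per class and count_images("/home/rstudio/data/test") should report 1000, provided the copy step succeeded.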

Set initial parameters

I am creating a few variables and assigning them values here. The main reason is that it makes it easy to tweak them and retrain the models.

# set path 
path = "/home/rstudio/data/" 

# set initial parameters
img_width = 150 
img_height = 150 
channels = 3 
output_n = 2 
train_samples = length(list.files(paste0(path,"train/cats"))) + length(list.files(paste0(path,"train/dogs"))) 
test_samples = length(list.files(paste0(path,"test/cats"))) + length(list.files(paste0(path,"test/dogs"))) 
batch_size = 50 

# set dataset directory 
train_dir = paste0(path,"train") 
test_dir = paste0(path,"test")

Create a custom model

I could use a pre-trained model such as VGG16 or VGG19. But what’s the fun in that? Let me build my own. Don’t judge me for bad layering. I am still learning.

# CNN model
model = keras_model_sequential() %>%
  layer_conv_2d(filters = 8, kernel_size = c(3,3), activation = "relu", input_shape = c(img_width,img_height,channels)) %>%
  layer_conv_2d(filters = 16, kernel_size = c(3,3), activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2,2)) %>%
  layer_conv_2d(filters = 32, kernel_size = c(3,3), activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2,2)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2,2)) %>%
  layer_conv_2d(filters = 16, kernel_size = c(3,3), activation = "relu") %>%
  layer_flatten() %>%
  layer_dense(units = 64, activation = "relu") %>%
  layer_dense(units = 64, activation = "relu") %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 256, activation = "relu") %>%
  layer_dense(units = 256, activation = "relu") %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 64, activation = "relu") %>%
  layer_dense(units = 64, activation = "relu") %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 32, activation = "relu") %>%
  layer_dense(units = output_n, activation = "softmax")

# summary of the overall model
summary(model)
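
The parameter counts of the convolutional layers can also be worked out by hand, which is a nice way to see what a filter really is: a small block of weights per input channel plus one bias. The sketch below assumes the five convolutional layers defined above (3 x 3 kernels, input depth of 3 channels); conv_params is a made-up helper name, not a Keras function.

```r
# Hypothetical helper: trainable parameters of a 2D convolution layer
# = (kernel height * kernel width * input channels + 1 bias) * number of filters
conv_params = function(kernel, in_channels, filters) {
  (kernel[1] * kernel[2] * in_channels + 1) * filters
}

# input depth and number of filters for the five conv layers of the model
in_ch  = c(3, 8, 16, 32, 64)
out_ch = c(8, 16, 32, 64, 16)

params = mapply(function(i, o) conv_params(c(3, 3), i, o), in_ch, out_ch)
print(params)
```

These numbers should line up with the "Param #" column that summary(model) prints for the conv layers. The pooling layers have no trainable parameters.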

Image processing

Setup image augmentation

When your data set is small, augmentation helps by effectively enlarging it. Here we set a few parameters such as rotation, shift and zoom; they are used to generate randomly transformed copies of the images in your current train set.

# Train data image preprocessing
datagen = image_data_generator(
                               rotation_range = 40,
                               width_shift_range = 0.2,
                               height_shift_range = 0.2,
                               shear_range = 0.2,
                               zoom_range = 0.2,
                               horizontal_flip = TRUE,
                               fill_mode = "nearest",
                               samplewise_std_normalization = TRUE
)

Setup image generators

Flowing images from a directory really eases the pre-processing. In the previous step we put our images into separate directories based on their classes. Now this function reads the images and infers each image’s class from the directory it sits in. No need to create any metadata.

# get all the train set (augmented through datagen)
train_generator = flow_images_from_directory( train_dir,
                                              datagen,
                                              color_mode = "rgb",
                                              target_size = c(img_width, img_height),
                                              batch_size = batch_size,
                                              class_mode = "categorical",
                                              shuffle = TRUE )

# Get test data set
test_generator = flow_images_from_directory( test_dir,
                                             datagen,
                                             color_mode = "rgb",
                                             target_size = c(img_width, img_height),
                                             batch_size = batch_size,
                                             class_mode = "categorical",
                                             shuffle = TRUE )

Compile and fit the model

Now that we have a model and generators, we can compile the model and fit it on the train generator. I ran the model for 100 epochs a couple of times and achieved an average accuracy of about 80%. Not too bad for this test model!

# compile the model
# (with a 2-unit softmax output, "categorical_crossentropy" is the more conventional loss)
model %>% compile(
  loss = "binary_crossentropy",
  optimizer = optimizer_adamax(lr = 0.001, decay = 0),
  metrics = c("accuracy")
)

# fit the model on the train generator
history = model %>% fit_generator(
  train_generator,
  steps_per_epoch = as.integer(train_samples/batch_size),
  epochs = 100,
  validation_data = test_generator,
  validation_steps = 10,
  initial_epoch = 1
)
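
A quick bit of arithmetic shows what the step settings mean in terms of images. This is just a sketch that assumes the file counts match the ranges copied earlier (2000 + 2000 train images, 1000 + 1000 test images); the variable names full_validation_steps and images_seen_with_10_steps are mine.

```r
# assumed counts, taken from the ranges copied into train and test earlier
train_samples = 4000   # 2000 cats + 2000 dogs
test_samples  = 2000   # 1000 cats + 1000 dogs
batch_size    = 50

# batches needed to see every train image once per epoch
steps_per_epoch = as.integer(train_samples / batch_size)

# batches that would be needed to see every test image once
full_validation_steps = as.integer(test_samples / batch_size)

# images actually evaluated per epoch with validation_steps = 10
images_seen_with_10_steps = 10 * batch_size

print(c(steps_per_epoch, full_validation_steps, images_seen_with_10_steps))
```

So with validation_steps = 10, each epoch is validated on 500 of the 2000 test images. Using as.integer(test_samples/batch_size) instead would cover the whole test set at the cost of a slower epoch.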

# load image
x = image_load(paste0(path,"test/cats/cat.2001.jpg"), target_size = c(img_width, img_height))
# convert the image to an array and add the batch dimension
data = x %>% image_to_array() %>% array_reshape(c(-1, img_width, img_height, channels))

image = jpeg::readJPEG(paste0(path,"test/cats/cat.2511.jpg"))

Next, we will pick an intermediate layer, wrap everything up to that layer in a new model, and run our image through it. The output is a multidimensional array. In the results below the output is 33 x 33 pixels across 64 filters. You can change the layer to plot other results. Note: index is the number of the layer that we want to look at.

# what layer do we want to look at?
index = 6

# choose that layer as model
intermediate_layer_model = keras_model(inputs = model$input,
                                        outputs = get_layer(model, index = index)$output)

# predict on that layer
intermediate_output = predict(intermediate_layer_model, data)

# dimensions of prediction
dim(intermediate_output)
[1]  1 33 33 64
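
The 33 x 33 x 64 result can be reproduced by hand, which also helps when picking a different layer index. The sketch below is a rough check that assumes Keras defaults (no padding and stride 1 for the convolutions, pooling stride equal to the pool size); conv_out and pool_out are helper names made up for illustration.

```r
# output side length of an unpadded 3x3 convolution with stride 1
conv_out = function(size, kernel = 3) size - kernel + 1
# output side length of 2x2 max pooling with stride 2
pool_out = function(size, pool = 2) size %/% pool

# the first six layers of the model; NA marks layers without filters
layers  = c("conv", "conv", "pool", "conv", "pool", "conv")
filters = c(8, 16, NA, 32, NA, 64)

side  = 150   # input images are 150 x 150
depth = 3     # RGB channels
for (i in seq_along(layers)) {
  if (layers[i] == "conv") {
    side  = conv_out(side)
    depth = filters[i]
  } else {
    side = pool_out(side)
  }
  cat(sprintf("layer %d (%s): %d x %d x %d\n", i, layers[i], side, side, depth))
}
```

The last line printed is for layer 6 and agrees with the dimensions returned by the intermediate model.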

Finally, we can plot the matrix produced by each of our filters in a grid using the image() function, as shown below.

Note: the images below are rotated, because image() draws matrix rows along the x axis. You can turn them upright by rotating each matrix before plotting.
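
If you want the plots upright, one common base R idiom is to reverse the rows and transpose before calling image(). The snippet below is a sketch under that assumption: rotate and plot_filters are hypothetical helper names, and fake_output is random stand-in data with the same shape as the layer 6 output, so it runs without a trained model.

```r
# rotate a matrix 90 degrees clockwise so that image() shows it upright
rotate = function(m) t(apply(m, 2, rev))

# Hypothetical helper: plot the first n filters of a (1, height, width, filters) array
plot_filters = function(activations, n = 9) {
  old_par = par(mfrow = c(3, 3), mar = c(1, 1, 1, 1))
  on.exit(par(old_par))
  for (i in seq_len(min(n, dim(activations)[4]))) {
    image(rotate(activations[1, , , i]), axes = FALSE, col = gray.colors(64))
  }
}

# stand-in data shaped like the layer 6 output (random values, not real activations)
set.seed(1)
fake_output = array(runif(1 * 33 * 33 * 64), dim = c(1, 33, 33, 64))
plot_filters(fake_output)
```

With the objects from this post, the call would be plot_filters(intermediate_output).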

par(mfrow = c(3,3))
for(i in 1:9){
  # plot the output of the i-th filter
  image(intermediate_output[1, , , i])
}

[Figures: filter outputs at Layer 2, Layer 3 and Layer 6]

From the above you can see how the CNN filters narrow the point of interest down to the cat. This not only helps explain how your model works but is also a way to confirm that your model is working as intended. It was quite a journey for me to go through the interwebs to find a way to visualize my layers. I hope you all can use this in your projects.

Links to tutorials

  1. Shirin’s playgRound
  2. Rstudio Blog
  3. Rstudio Tensorflow
  4. R-Bloggers
