Deep Learning for Brand Logo Detection – part II


A month ago, I started playing with the deep learning framework Keras for R. As a use case, I picked logo detection in images. While the training of a net worked out fine, the results were mediocre. (Check out the full post for details on the model and the setup.) Here is a recap of the outcome: training the model on the Flickr27 dataset, with only 270 images across 27 classes, the validation accuracy came out at 15%.

[Plot: validation accuracy of the model from the previous post]

However, there are at least two things to keep in mind: the dataset is really small, especially for training a deep neural network, and some brands are hard to recognize, even for the human eye.

In order to improve on the use case (logo detection in advertisements), I downloaded 20 additional images for each brand logo from Google image search, using a rough script along the lines of the sketch below.
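The actual script is not reproduced here; what follows is a minimal sketch of the download step, where `urls_by_brand` is a hypothetical named list of image URLs already scraped from the search results (the list name and the file layout are assumptions):

```r
# Sketch: save up to 20 extra images per brand.
# urls_by_brand is a hypothetical named list, e.g.
# list(adidas = c("http://example.com/1.jpg", ...), ...)
for (brand in names(urls_by_brand)) {
  dir <- file.path("extra_images", brand)
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  urls <- head(urls_by_brand[[brand]], 20)
  for (i in seq_along(urls)) {
    dest <- file.path(dir, sprintf("%s_%02d.jpg", brand, i))
    # some downloads inevitably fail; skip those rather than abort
    try(download.file(urls[i], dest, mode = "wb", quiet = TRUE),
        silent = TRUE)
  }
}
```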

Equipped with the enhanced dataset, I trained the model again, using the same model setup as in the last post but with slightly different parameters.

[Plot: validation accuracy after training on the enhanced dataset]

The effect is clearly visible: the validation accuracy jumps up to 44%. Another thing I improved on is a more systematic, experimental approach to parameter variation.

In the grid-search approach, I varied the learning rate (lr), the image size and the input filter parameters. Plotting the validation accuracy over the 200 epochs shows a high variation in the achieved accuracy. Some parameter combinations do not let the model learn anything about the images, while other variations converge quite quickly to good results.
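A minimal sketch of such a grid search, assuming a hypothetical `load_images()` helper that returns the logo images rescaled to the requested size; the grid values and the small network inside the loop are assumptions, not the exact setup from the post:

```r
library(keras)

# hypothetical parameter grid; the exact values tried are assumptions
grid <- expand.grid(lr       = c(1e-3, 1e-4),
                    img_size = c(32, 64),
                    filters  = c(16, 32))

histories <- lapply(seq_len(nrow(grid)), function(i) {
  p <- grid[i, ]

  # load_images() is a hypothetical helper returning images rescaled
  # to img_size x img_size plus one-hot encoded labels
  data <- load_images(p$img_size)

  model <- keras_model_sequential() %>%
    layer_conv_2d(filters = p$filters, kernel_size = c(3, 3),
                  activation = "relu",
                  input_shape = c(p$img_size, p$img_size, 3)) %>%
    layer_max_pooling_2d(pool_size = c(2, 2)) %>%
    layer_flatten() %>%
    layer_dense(units = 128, activation = "relu") %>%
    layer_dense(units = 27, activation = "softmax")

  model %>% compile(loss = "categorical_crossentropy",
                    optimizer = optimizer_adam(lr = p$lr),
                    metrics = "accuracy")

  # one fit() per grid row; the returned history holds the
  # per-epoch validation accuracy plotted below
  model %>% fit(data$x, data$y,
                epochs = 200, batch_size = 32,
                validation_split = 0.2, verbose = 0)
})
```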

[Plot: validation accuracy over 200 epochs for each parameter combination]

What I still find confusing is that the parameters seem to interact in their effect on model convergence and performance. While a learning rate of 0.001 works well with 32-pixel images, it completely fails with 64-pixel images. The even lower learning rate of 1e-04 at least improves steadily in all cases, but much more so with 64-pixel images.

In order to improve on the existing model, I plan to further augment the training dataset, and finally use a pre-trained image model.
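Both steps are straightforward in Keras for R. Here is a sketch of what they could look like; the augmentation parameters and the choice of VGG16 as the pre-trained base are assumptions on my part:

```r
library(keras)

# augmentation: generate slightly transformed copies of the training
# images on the fly (parameter values are assumptions)
datagen <- image_data_generator(
  rotation_range     = 15,
  width_shift_range  = 0.1,
  height_shift_range = 0.1,
  horizontal_flip    = TRUE
)

# pre-trained model: a frozen VGG16 convolutional base trained on
# ImageNet as feature extractor, with a fresh classifier on top
base <- application_vgg16(weights = "imagenet", include_top = FALSE,
                          input_shape = c(64, 64, 3))
freeze_weights(base)

predictions <- base$output %>%
  layer_flatten() %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dense(units = 27, activation = "softmax")

model <- keras_model(inputs = base$input, outputs = predictions)
```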

For reference, this is the setup from the previous post:
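(The listing below is a minimal sketch of such a setup, assuming 32×32 RGB inputs and the 27 Flickr27 classes; the layer sizes, dropout rates and optimizer settings are assumptions, and `x_train`/`y_train` stand in for the prepared image arrays.)

```r
library(keras)

model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu",
                input_shape = c(32, 32, 3)) %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_dropout(rate = 0.25) %>%
  layer_flatten() %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 27, activation = "softmax")

model %>% compile(
  loss = "categorical_crossentropy",
  optimizer = optimizer_adam(lr = 1e-4),
  metrics = "accuracy"
)

# x_train / y_train: assumed placeholders for the prepared data
model %>% fit(x_train, y_train,
              epochs = 200, batch_size = 32,
              validation_split = 0.2)
```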
