How to build an image recognizer in R using just a few images

August 16, 2017

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

Microsoft Cognitive Services provides several APIs for image recognition, but if you want to build your own recognizer (or create one that works offline), you can use the new Image Featurizer capabilities of Microsoft R Server. 

Training an image recognition system from scratch requires a lot of images: millions and millions of them. Those images are fed into a deep neural network, and as it trains, the network generates "features" from each image. These features might be versions of the image including just the outlines, or maybe the image with only the green parts. You could further boil those features down into a single number, say the length of the outline or the percentage of the image that is green. With enough of these features, you could use them in a traditional machine learning model to classify the images, or perform other recognition tasks.
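To make the idea of boiling an image down to a single number concrete, here is a minimal sketch in base R. It assumes an image stored as a height x width x channel array with values between 0 and 1; the tiny synthetic image and the green_fraction helper are made up for illustration, and a neural network learns far richer features than this.

```r
# Build a tiny synthetic 4 x 4 RGB image as a height x width x channel array
# with values in [0, 1]. (Hypothetical data; a real image would be read from disk.)
img <- array(0, dim = c(4, 4, 3))
img[, , 1] <- 0.2     # red channel
img[, , 2] <- 0.1     # green channel
img[, , 3] <- 0.2     # blue channel
img[1:2, , 2] <- 0.9  # make the top two rows strongly green

# Hypothetical helper: the fraction of pixels whose green channel
# is larger than both the red and the blue channel.
green_fraction <- function(img) {
  r <- img[, , 1]
  g <- img[, , 2]
  b <- img[, , 3]
  mean(g > r & g > b)
}

green_fraction(img)  # 0.5: 8 of the 16 pixels are green-dominant
```

A handful of numbers like this one, computed for every image, is the kind of input a traditional classifier can work with.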

But if you don't have millions of images, it's still possible to generate these features from a model that has already been trained on millions of images. ResNet is a very deep neural network model trained for the task of image recognition, and it has been used to win major computer-vision competitions. With the rxFeaturize function in Microsoft R Client and Microsoft R Server, you can generate a features vector from a pretrained model like this for any image you provide (4096 features with the AlexNet model used in the example below). The features themselves are meaningful only to a computer, but that vector of 4096 numbers between zero and one is (ideally) a distillation of the unique characteristics of that image as a human would recognize it. You can then use that features vector to create your own image-recognition system without the burden of training your own neural network on a large corpus of images.
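As a rough illustration of building your own recognizer on top of such vectors, the sketch below fits a nearest-centroid classifier in base R. Everything in it is an assumption made for the example: the feature vectors are simulated with rnorm rather than produced by rxFeaturize, and the class names, simulate_features and classify are hypothetical.

```r
set.seed(42)
n_features <- 4096  # vector length quoted in the article

# Hypothetical helper: simulate feature vectors for n images of one class.
# Real vectors would come from rxFeaturize; these are random stand-ins.
simulate_features <- function(n, centre) {
  matrix(rnorm(n * n_features, mean = centre, sd = 0.1), nrow = n)
}

# Five simulated training images per class, one row per image
train  <- rbind(simulate_features(5, 0.3), simulate_features(5, 0.7))
labels <- rep(c("deck chair", "office chair"), each = 5)

# One centroid (mean feature vector) per class, one row per class
classes   <- unique(labels)
centroids <- t(sapply(classes, function(cl) {
  colMeans(train[labels == cl, , drop = FALSE])
}))

# Assign a new feature vector to the class with the nearest centroid
classify <- function(x) {
  d <- apply(centroids, 1, function(ctr) sqrt(sum((x - ctr)^2)))
  names(which.min(d))
}

new_image <- simulate_features(1, 0.7)[1, ]
classify(new_image)  # "office chair"
```

Nearest-centroid is only one simple choice; any classifier that accepts a numeric matrix could be swapped in.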

On the Cortana Intelligence and ML blog, Remko de Lange provides a simple example: given a collection of 60-or-so images of chairs like those below, how can you find the image that most looks like a deck chair?

[Image: Chairs]

First, you need a representative image of a deck chair:

[Image: Chair56]

Then, you calculate the features vector for that image using rxFeaturize:

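library(MicrosoftML)  # provides rxFeaturize and the image transforms

imageToMatchDF <- rxFeaturize(
  # keep the file path as a character string rather than a factor
  data = data.frame(Image = "deck-chair.jpg", stringsAsFactors = FALSE),
  mlTransforms = list(
    loadImage(vars = list(Features = "Image")),                 # read the image file
    resizeImage(vars = "Features", width = 227, height = 227),  # shrink to the model's input size
    extractPixels(vars = "Features"),                           # convert the image to pixel values
    featurizeImage(var = "Features", dnnModel = "alexnet")      # run the pretrained network
  ))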

Note that when featurizing an image, you need to shrink it down to a specified size (the built-in function resizeImage handles that). There are also several pretrained models to choose from: three variants of ResNet and also AlexNet, which we use here. This gives us a features vector of 4096 numbers to represent our deck chair image. Then, we just need to use the same process to calculate the features vector for each of our 60 other images, and find the one that's closest to our deck chair. We can simply use the dist function in R to do that, and that's exactly what Remko's script on GitHub does. The image with the closest features vector to our representative image is this one:

[Image: Chair43]
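The matching step can be sketched in base R as follows. This is not Remko's script: the feature vectors are simulated with runif instead of coming from rxFeaturize, the file names are made up, and one candidate is deliberately built as a near-copy of the target so that the example has a clear winner.

```r
set.seed(1)
n_features <- 4096

# Hypothetical stand-in for the features vector of the deck chair image
target <- runif(n_features)

# Simulated features for 60 candidate images, one row per image
candidates <- matrix(
  runif(60 * n_features), nrow = 60,
  dimnames = list(sprintf("chair%02d.jpg", 1:60), NULL)
)

# Make one candidate a slightly perturbed copy of the target
candidates["chair43.jpg", ] <- target + rnorm(n_features, sd = 0.01)

# dist() returns the pairwise Euclidean distances between all rows
d <- as.matrix(dist(rbind(target = target, candidates)))

# Keep only the distances from the target to each candidate
d_to_target <- d["target", -1]

closest <- names(which.min(d_to_target))
closest  # "chair43.jpg"
```

Here dist computes every pairwise distance, which is fine for 60 images; for a much larger collection it may be cheaper to compute only the distances to the target.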

So, even with a relatively small collection of images, it's possible to build a capable image recognition system using image featurization and the pretrained neural networks provided with Microsoft R. The complete code and sample images used in this example are available on GitHub. (Note: to use the image featurization functions in the script, you'll need either a license for Microsoft R Server or the free Microsoft R Client installed with the pretrained models option.) For more details on creating this recognizer, check out the blog post below.

Cortana Intelligence and Machine Learning Blog: Find Images With Images
