Some simple facial analytics on actors (and my manager)

[This article was first published on Longhow Lam's Blog » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Some time ago I was at a party and, inevitably, a question came up: “Longhow, what kind of work are you doing?” I answered: “I am a data scientist, I have the sexiest job. Do you want me to show you how to use deep learning for facial analytics?” Oops, it became very quiet. Warning: don’t google the words sexy, deep and facial at work!

But anyway, I was triggered by the Face++ website and thought: can we do some simple facial analytics? The Face++ site offers an API where you can upload images (or use links to images). For each image that you upload, the Face++ engine can detect whether it contains one or more faces. If a face is detected, you can also let the engine return 83 facial landmarks. Example landmarks are nose_tip, left_eye_left_corner and contour_chin; see the picture below. So an image is now reduced to 83 landmarks.

face_WP

What are some nice faces to analyse? Well, I just googled “top 200 actors” and got a nice list of actor names, and with the permission of my manager Rein Mertens, I added his picture as well. So what steps did I take next?

Data preparation

1. I downloaded import.io, a very handy tool to turn web pages into data. In the tool, enter the link to the list of 200 actors. It then extracts all the links to the 200 pictures (together with more data) and conveniently exports them to a csv file.

import.io from web page to data


2. I wrote a small script in R to import the csv file from step 1, so for each of the 200 actors I have the link to his image. Now I can call the Face++ API for every link and let Face++ return the 83 facial landmarks. The result is a data set with 200 rows and 166 columns (the x and y position of each of the 83 landmarks) plus an ID column containing the name of the actor.

3. I added the 83 landmarks of my manager to the data set. So my set now looks like this:

Analytical base table of actor faces


There were some images where Face++ could not detect a face; I removed those images.

nofaces

faces not recognized by the Face++ engine
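To make the shape of the base table concrete, here is a minimal Python sketch (not the R script I actually used). It assumes, for illustration only, that each detected face comes back as a mapping from landmark name to a pair of x and y values, and that an image without a detected face yields None. The function names and the sample coordinates are hypothetical.

```python
def landmarks_to_row(name, landmarks):
    """Flatten {landmark_name: {"x": ..., "y": ...}} into one table row."""
    row = {"id": name}
    for key in sorted(landmarks):
        row[key + "_x"] = landmarks[key]["x"]
        row[key + "_y"] = landmarks[key]["y"]
    return row


def build_base_table(responses):
    """One row per image; images without a detected face are dropped."""
    rows = []
    for name, landmarks in responses.items():
        if not landmarks:  # assumed marker for "no face detected"
            continue
        rows.append(landmarks_to_row(name, landmarks))
    return rows


# Hypothetical responses with 3 landmarks instead of the real 83.
responses = {
    "Actor A": {
        "nose_tip": {"x": 50.1, "y": 60.2},
        "left_eye_left_corner": {"x": 30.4, "y": 40.0},
        "contour_chin": {"x": 51.0, "y": 95.5},
    },
    "Actor B": None,  # no face detected, so no row
}

table = build_base_table(responses)
print(len(table), sorted(table[0]))
```

With the full 83 landmarks every row would hold the ID plus 166 coordinate columns, which is the layout of the table shown above.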

Now that I have the so-called facial analytical base table of actors, I can apply some nice analytics.

Deep learning, autoencoders

What are some strange faces? To answer that from an objective point of view, I used deep learning autoencoders. The idea of an autoencoder is to learn a compressed representation for a set of data, typically for the purpose of dimensionality reduction (Wikipedia). The trick is to use a neural network where the output target is the same as the input variables. My actor faces are observations in a 166-dimensional space; using proc neural in SAS, I reduced the faces to points in a two-dimensional space. I used 5 hidden layers, where the middle layer consists of two neurons, as shown in the SAS code snippet below. More information on autoencoders in SAS can be found here.

procneural
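For readers without SAS, the same idea can be sketched in plain Python. This is not a translation of the proc neural code above, just a toy illustration: only the 5 hidden layers with a 2-neuron middle layer come from the text, while the other layer sizes, the tanh activations, the learning rate, the training loop and the made-up 6-coordinate "faces" are all assumptions.

```python
import math
import random


def init_network(n_inputs, hidden=(4, 3, 2, 3, 4), seed=0):
    """Random weights for n_inputs -> 5 hidden layers -> n_inputs."""
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden, n_inputs]
    weights = [
        [[rng.uniform(-0.5, 0.5) for _ in range(sizes[k])] for _ in range(sizes[k + 1])]
        for k in range(len(sizes) - 1)
    ]
    biases = [[0.0] * sizes[k + 1] for k in range(len(sizes) - 1)]
    return weights, biases


def forward(x, weights, biases):
    """Return the activations of every layer, input first, output last."""
    acts = [list(x)]
    for k, (W, b) in enumerate(zip(weights, biases)):
        z = [sum(w * a for w, a in zip(row, acts[-1])) + bj for row, bj in zip(W, b)]
        is_output = k == len(weights) - 1
        acts.append(z if is_output else [math.tanh(v) for v in z])  # linear output
    return acts


def sgd_step(x, weights, biases, lr):
    """One gradient step on 0.5 * squared reconstruction error of one face."""
    acts = forward(x, weights, biases)
    delta = [o - t for o, t in zip(acts[-1], x)]  # the target is the input itself
    for k in reversed(range(len(weights))):
        W, b, a_in = weights[k], biases[k], acts[k]
        # Gradient with respect to this layer's inputs, taken before W changes.
        back = [sum(W[j][i] * delta[j] for j in range(len(W))) for i in range(len(a_in))]
        for j, row in enumerate(W):
            for i in range(len(row)):
                row[i] -= lr * delta[j] * a_in[i]
            b[j] -= lr * delta[j]
        if k > 0:
            delta = [g * (1.0 - a * a) for g, a in zip(back, a_in)]  # tanh derivative


def mean_error(rows, weights, biases):
    """Average squared reconstruction error over all faces."""
    total = 0.0
    for x in rows:
        out = forward(x, weights, biases)[-1]
        total += sum((o - t) ** 2 for o, t in zip(out, x))
    return total / len(rows)


def encode(x, weights, biases):
    """2-D code of one face: the activations of the middle hidden layer."""
    return forward(x, weights, biases)[len(weights) // 2]


# Made-up "faces": 3 landmarks (6 coordinates) driven by two hidden factors.
rng = random.Random(1)
faces = []
for _ in range(30):
    s, t = rng.random(), rng.random()
    faces.append([0.3 + 0.2 * s, 0.4 + 0.1 * t, 0.7 - 0.2 * s, 0.4 + 0.1 * t, 0.5, 0.8 + 0.1 * s * t])

weights, biases = init_network(n_inputs=6)
error_before = mean_error(faces, weights, biases)
for epoch in range(200):
    for x in faces:
        sgd_step(x, weights, biases, lr=0.05)
error_after = mean_error(faces, weights, biases)

codes = [encode(x, weights, biases) for x in faces]  # one (x, y) point per face
print(error_before, error_after, codes[0])
```

Each entry of codes is a point in two dimensions, which is what gets plotted per actor. On the real data the input would have 166 coordinates per face, and the columns would probably need rescaling first.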

The faces can now be plotted in a scatter plot (as shown below), with the two dimensions corresponding to the two neurons of the middle layer.

scatter plot of actor faces


A more interactive version of this plot can be found here. It is a simple D3.js script where you can hover over the dots and click to see the underlying picture.

Hierarchical Cluster analysis

Can we group together faces that are similar (that is, have similar facial landmarks)? Yes, we can apply a clustering technique to do that. There are many different techniques; I used an agglomerative hierarchical clustering technique because the way the algorithm clusters the faces can be nicely visualized in a so-called dendrogram. The algorithm starts with each observation (face) as its own cluster. In each following iteration, the two closest clusters are merged, until all observations form one cluster. In SAS you can use proc cluster to perform hierarchical clustering; see the code and dendrogram below.

proccluster
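The merge loop itself is short enough to sketch in plain Python. This is an illustration of the agglomerative idea, not the proc cluster code: the average-linkage distance between clusters is an assumption (other linkage methods exist), and the four tiny "faces" are made up.

```python
import math


def agglomerative(points, names):
    """Merge the two closest clusters until one is left; return the merge history."""
    clusters = {i: [i] for i in range(len(points))}  # every face starts alone
    next_id = len(points)
    merges = []

    def linkage(a, b):
        # Average linkage (assumed): mean Euclidean distance over all cross pairs.
        pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
        return sum(math.dist(points[i], points[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        ids = sorted(clusters)
        candidates = [(p, q) for n, p in enumerate(ids) for q in ids[n + 1:]]
        a, b = min(candidates, key=lambda pq: linkage(*pq))
        height = linkage(a, b)  # the height of this merge in the dendrogram
        left = sorted(names[i] for i in clusters[a])
        right = sorted(names[i] for i in clusters[b])
        merges.append((left, right, height))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


# Made-up faces with 2 landmarks each (x1, y1, x2, y2).
names = ["A", "B", "C", "D"]
points = [
    [0.0, 0.0, 1.0, 1.0],
    [0.1, 0.0, 1.0, 1.0],
    [5.0, 5.0, 6.0, 6.0],
    [5.0, 5.2, 6.0, 6.0],
]

merges = agglomerative(points, names)
for left, right, height in merges:
    print(left, "+", right, "at height", round(height, 3))
```

The list of merges with their heights is the information a dendrogram draws: A and B join first, then C and D, and the last merge joins the two pairs.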

dendrogram faces


Using D3.js I have also created a more interactive dendrogram.

— L. —
