# Eigensheep

March 13, 2011
(This article was first published on Edwin Chen's Blog » r, and kindly contributed to R-bloggers)

Aaron Koblin’s Sheep Market visualization is an awesome use of Mechanical Turk. But it’d be even more awesome if the grid were ordered, so inspired by the use of eigenfaces in facial recognition, I decided to try projecting the sheep onto two dimensions.

# Principal Sheep Components

After screenshotting the first 50 sheep from the market and normalizing their size and color, here’s what a PCA projection looks like:

Notice how the stroke widths get thicker as we move to the right (i.e., the first principal component seems to measure the blackness of the sheep), and the amount of wool on the sheep’s body increases as we move up (i.e., the second principal component seems to measure the wooliness of the sheep).
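For anyone who wants to see the mechanics behind a projection like this, here's a minimal sketch in Python with NumPy. It is not the code used for the plot above: it assumes the sheep are already stored as the rows of a matrix (one flattened 150x150 grayscale image per row), and the function name `pca_project` and the random stand-in data are purely illustrative.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    # Center each pixel column so the components describe variation around the "mean sheep".
    X_centered = X - X.mean(axis=0)
    # Thin SVD: the rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Coordinates of each sheep along those directions.
    scores = X_centered @ components.T
    return scores, components

# Stand-in data (an assumption): 50 random "images" instead of real sheep screenshots.
rng = np.random.default_rng(0)
sheep = rng.random((50, 150 * 150))
scores, eigensheep = pca_project(sheep)
print(scores.shape)  # (50, 2)
```

Each row of `scores` gives the (x, y) position of one sheep in a plot like the one above, and each row of `eigensheep` can be reshaped back to 150x150 to view the component as an image.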

It’s also pretty neat how all the sheep with black heads and black legs (sheep 35, 16, 32, 31, and 19) get clumped together:

And I think the sheep on the left (next to and inside the dense cluster) seem much more poorly drawn — they look more like camels, dogs, unicorns, or bugs than actual sheep.

# Code

In a bit more detail, I used the poor man’s Mechanical Turk (myself) to screenshot the first 50 sheep from the market, trying to hug the sheep as closely as possible to ensure proper alignment. Next, I used the Python Imaging Library to resize the images to 150x150px, convert them to grayscale, and flatten them into the rows of a matrix.
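The preprocessing step described above could look roughly like the sketch below, written against Pillow (the maintained fork of the Python Imaging Library). The helper names `image_to_row` and `build_sheep_matrix` are hypothetical, and the blank demo images are stand-ins for real screenshots, so treat this as an illustration rather than the original script.

```python
import numpy as np
from PIL import Image

def image_to_row(img, size=(150, 150)):
    """Convert one image to grayscale, resize it, and flatten it to a 1-D row (hypothetical helper)."""
    gray = img.convert("L").resize(size)  # "L" is Pillow's 8-bit grayscale mode
    return np.asarray(gray, dtype=np.float64).ravel()

def build_sheep_matrix(images, size=(150, 150)):
    """Stack one flattened image per row."""
    return np.vstack([image_to_row(img, size) for img in images])

# Stand-in images (an assumption): one all-white and one all-black picture of different sizes.
demo_images = [
    Image.new("RGB", (200, 120), color=(255, 255, 255)),
    Image.new("RGB", (90, 140), color=(0, 0, 0)),
]
matrix = build_sheep_matrix(demo_images)
print(matrix.shape)  # (2, 22500)
```

With real screenshots, each image would be opened with `Image.open(path)` before being passed in. Forcing every image to the same size is what makes the rows comparable pixel by pixel, which is why cropping tightly around each sheep matters for alignment.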

In case anyone else wants to play with the sheep images, I put the code on my GitHub account.
