Handwriting recognition using R

December 17, 2011
By

(This article was first published on Yixuan's Blog - R, and kindly contributed to R-bloggers)

This title is a bit exaggerating since handwriting recognition is an advanced topic
in machine learning involving complex techniques and algorithms. In this blog I’ll
show you a simple demo illustrating how to recognize a single number (0 ~ 9) using R.
The overall process is that, you draw a number in a graphics device in R using your mouse,
and then the program will “guess” what you have input. It is just for FUN.

There are two major problems in this number recognition problem, that
is, how to describe the trace of your handwriting, and how to classify
this trace to the give classes (0 ~ 9).

For the first question, we could first detect the motion of your mouse
in the graphics device, and then record the coordinates of you mouse
cursor at a sequence of time points. This could be done via the
getGraphicsEvent() function in grDevices package. For example, after I
drew a number 2 in the graphics window like below, the coordinates of
each point in the trace were assigned to a pair of variables px and py.

Record trace

The scatterplot of px and py versus their orders in the trace is
shown below.

Record points

To be comparable among different traces, we normalize the Order to be
within (0, 1] (that is, transform 1, 2, …, n to 1/n, 2/n, …, 1).
Also, since this recording is discrete but the real trace should be
continuous, we use the spline() function to interpolate at unknown
points, resulting in the following figure.

Record splines

The dots in the figure have normalized orders of 0.02, 0.04,
0.06, …, 1, at which the x and y coordinates are obtained by
interpolation. Therefore, we could use $r = (x, y)$ where
$x = (x_1, x_2, …, x_{50})’$ and $y = (y_1, y_2, …, y_{50})’$ to
represent the information of the number 2 I have drawn. Somewhat
confused by the operations above? Well, the idea behind this
normalization and interpolation is simple: use 50 “uniformly
ordered” points (I call them “recording points”) to represent the trace.

So it comes to the second question – given a trace, how to classify
it? Obviously we first need a training set, the recording points of
number 0 to number 9 generated as above. Then we’ll compare the
given trace with each one in the training set and find out which
number resembles it most.

Several criteria could be used to measure the similarity, but some
important rules should be considered. We still use $r = (x, y)$ to
represent the recording points of a trace, and use $Sim(r_1, r_2)$ to
stand for the similarity between two traces. Notice that this
similarity should not be sensitive to the scale and location of
traces. That is, if I draw a number in another location in the
window, or in a larger or smaller size, the recognition should not be
influenced. In mathematics, this could be expressed by

where $k_1 > 0$, $k_2 > 0$, $b_1$, $b_2$ are real numbers.

In my code, I simply define the similarity as the sum of Pearson
correlation coefficients of x and y, that is,

The whole source code is (note that I use 500 recording points
instead of 50):

library(grid);
getData = function()
{
    if(.Platform$OS.type == 'windows') x11() else x11(type = 'Xlib');
    pushViewport(viewport());
    grid.rect();
    px = NULL;
    py = NULL;
    mousedown = function(buttons, x, y)
    {
        if(length(buttons) > 1 || identical(buttons, 2L))
            return(invisible(1));
        eventEnv$onMouseMove = mousemove;
        NULL
    }
    mousemove = function(buttons, x, y)
    {
        px <<- c(px, x);
        py <<- c(py, y);
        grid.points(x, y);
        NULL
    }
    mouseup = function(buttons, x, y) {
        eventEnv$onMouseMove = NULL;
        NULL
    }
    setGraphicsEventHandlers(onMouseDown = mousedown,
                             onMouseUp = mouseup);
    eventEnv = getGraphicsEventEnv();
    cat("Click down left mouse button and drag to draw the number,
        right click to finish.n");
    getGraphicsEvent();
    dev.off();
    s = seq(0, 1, length.out = length(px));
    spx = spline(s, px, n = 500)$y;
    spy = spline(s, py, n = 500)$y;
    return(cbind(spx, spy));
}
traceCorr = function(dat1, dat2)
{
    cor(dat1[, 1], dat2[, 1]) + cor(dat1[, 2], dat2[, 2]);
}

# Please set the proper path of this file.
load("train.RData");
guess = function(verbose = FALSE)
{
    test = getData();
    coefs = sapply(recogTrain, traceCorr, dat2 = test);
    num = which.max(coefs);
    if(num == 10) num = 0;
    if(verbose) print(coefs);
    cat("I guess what you have input is ", num, ".n", sep = "");
}
guess();

To run the code, you must load the “training set”, the file
train.RData, into R using the load() function, and then call
guess() to play with it.

Have fun!

Download: Source code and training dataset

To leave a comment for the author, please follow the link and comment on their blog: Yixuan's Blog - R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)