Handwriting recognition using R

[This article was first published on Yixuan's Blog - R, and kindly contributed to R-bloggers.]

The title is a bit of an exaggeration, since handwriting recognition is an advanced topic in machine learning that involves complex techniques and algorithms. In this post I’ll show a simple demo that recognizes a single digit (0 ~ 9) using R. The overall process is this: you draw a digit in an R graphics device with your mouse, and the program then “guesses” what you drew. It is just for FUN.

There are two major problems in this digit recognition task: how to describe the trace of your handwriting, and how to classify this trace into one of the given classes (0 ~ 9).

For the first question, we can detect the motion of the mouse in the graphics device and record the coordinates of the mouse cursor at a sequence of time points. This can be done with the getGraphicsEvent() function in the grDevices package. For example, after I drew a number 2 in the graphics window as shown below, the coordinates of each point in the trace were stored in a pair of variables px and py.

[Figure: Record trace]

The scatterplot of px and py versus their orders in the trace is shown below.

[Figure: Record points]

To be comparable among different traces, we normalize the Order to be within (0, 1] (that is, transform 1, 2, …, n to 1/n, 2/n, …, 1). Also, since this recording is discrete but the real trace should be continuous, we use the spline() function to interpolate at unknown points, resulting in the following figure.

[Figure: Record splines]

The dots in the figure have normalized orders of 0.02, 0.04, 0.06, …, 1, at which the x and y coordinates are obtained by interpolation. Therefore, we could use $r = (x, y)$ where $x = (x_1, x_2, …, x_{50})’$ and $y = (y_1, y_2, …, y_{50})’$ to represent the information of the number 2 I have drawn. Somewhat confused by the operations above? Well, the idea behind this normalization and interpolation is simple: use 50 “uniformly ordered” points (I call them “recording points”) to represent the trace.
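The normalization and interpolation steps can be sketched on their own, without the mouse-capture part. This is a minimal illustration, not the post's code: the helper name resampleTrace and the synthetic arc standing in for a recorded trace are assumptions, and it presumes the trace has at least as many recorded points as recording points, so that no target order falls before the first recorded one.

```r
# Resample a recorded trace to n "recording points".
# resampleTrace is a hypothetical helper name, not from the original post.
resampleTrace = function(px, py, n = 50)
{
    stopifnot(length(px) == length(py), length(px) >= n)
    # Normalized order of the m recorded points: 1/m, 2/m, ..., 1
    ord = seq_along(px) / length(px)
    # Normalized orders of the recording points: 1/n, 2/n, ..., 1
    out = seq(1 / n, 1, length.out = n)
    # Cubic spline interpolation of each coordinate against the order
    x = spline(ord, px, xout = out)$y
    y = spline(ord, py, xout = out)$y
    cbind(x = x, y = y)
}

# Synthetic stand-in for a mouse trace: 120 points on a half circle
t = seq(0, pi, length.out = 120)
px = 0.5 + 0.3 * cos(t)
py = 0.5 + 0.3 * sin(t)

r = resampleTrace(px, py, n = 50)
dim(r)    # number of recording points and number of coordinates
head(r)   # first few interpolated (x, y) pairs
```

Whatever the number of points the mouse happened to produce, the result always has n rows, which is what makes two traces directly comparable.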

So it comes to the second question – given a trace, how to classify it? Obviously we first need a training set, the recording points of number 0 to number 9 generated as above. Then we’ll compare the given trace with each one in the training set and find out which number resembles it most.

Several criteria could be used to measure the similarity, but some important rules should be considered. We still use $r = (x, y)$ to represent the recording points of a trace, and use $Sim(r_1, r_2)$ to stand for the similarity between two traces $r_1 = (x^{(1)}, y^{(1)})$ and $r_2 = (x^{(2)}, y^{(2)})$. Notice that this similarity should not be sensitive to the scale and location of traces. That is, if I draw a number in another location in the window, or in a larger or smaller size, the recognition should not be influenced. In mathematics, this could be expressed by

$$Sim(r_1, r_2) = Sim\left(r_1, \left(k_1 x^{(2)} + b_1,\; k_2 y^{(2)} + b_2\right)\right)$$

where $k_1 > 0$, $k_2 > 0$, and $b_1$, $b_2$ are real numbers.
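The scale and location rule is easy to check numerically for a candidate similarity. The sketch below uses the sum of the Pearson correlations of the x and y coordinates, which is the definition the post adopts next. The function name simCorr, the two synthetic traces, and the particular values of $k_1$, $k_2$, $b_1$, $b_2$ are made up for illustration.

```r
# Sum of Pearson correlations of the x and y coordinates of two traces,
# each given as a two-column matrix (column 1 = x, column 2 = y).
# simCorr is a hypothetical name used only in this sketch.
simCorr = function(r1, r2)
{
    cor(r1[, 1], r2[, 1]) + cor(r1[, 2], r2[, 2])
}

# Two synthetic traces with 50 recording points each (made-up data)
s = seq(0.02, 1, length.out = 50)
r1 = cbind(cos(2 * pi * s), sin(2 * pi * s))
r2 = cbind(cos(2 * pi * s) + 0.1 * s, sin(2 * pi * s) - 0.2 * s^2)

# Rescale and shift r2: k1 = 3, b1 = -5 for x; k2 = 0.5, b2 = 10 for y
r2moved = cbind(3 * r2[, 1] - 5, 0.5 * r2[, 2] + 10)

# Compare the similarity before and after moving and resizing r2
all.equal(simCorr(r1, r2), simCorr(r1, r2moved))
```

The two similarities agree up to floating-point error, because Pearson correlation does not change when an argument is multiplied by a positive constant or shifted by a constant.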

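In my code, I simply define the similarity as the sum of the Pearson correlation coefficients of x and y, that is,

$$Sim(r_1, r_2) = Cor\left(x^{(1)}, x^{(2)}\right) + Cor\left(y^{(1)}, y^{(2)}\right)$$

where $r_i = (x^{(i)}, y^{(i)})$. Pearson correlation is unchanged when either argument is multiplied by a positive constant or shifted by a constant, so this definition is insensitive to the scale and location of the traces.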

The whole source code is (note that I use 500 recording points instead of 50):

library(grid);
getData = function()
{
    if(.Platform$OS.type == 'windows') x11() else x11(type = 'Xlib');
    pushViewport(viewport());
    grid.rect();
    px = NULL;
    py = NULL;
    mousedown = function(buttons, x, y)
    {
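        # A right click (or several buttons at once) returns a non-NULL
        # value, which makes getGraphicsEvent() stop waiting for events.
        if(length(buttons) > 1 || identical(buttons, 2L))
            return(invisible(1));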
        eventEnv$onMouseMove = mousemove;
        NULL
    }
    mousemove = function(buttons, x, y)
    {
        px <<- c(px, x);
        py <<- c(py, y);
        grid.points(x, y);
        NULL
    }
    mouseup = function(buttons, x, y) {
        eventEnv$onMouseMove = NULL;
        NULL
    }
    setGraphicsEventHandlers(onMouseDown = mousedown,
                             onMouseUp = mouseup);
    eventEnv = getGraphicsEventEnv();
    cat("Click down left mouse button and drag to draw the number,",
        "right click to finish.\n");
    getGraphicsEvent();
    dev.off();
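    # Normalize the order of the recorded points to [0, 1], then
    # interpolate 500 recording points for each coordinate.
    s = seq(0, 1, length.out = length(px));
    spx = spline(s, px, n = 500)$y;
    spy = spline(s, py, n = 500)$y;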
    return(cbind(spx, spy));
}
traceCorr = function(dat1, dat2)
{
    cor(dat1[, 1], dat2[, 1]) + cor(dat1[, 2], dat2[, 2]);
}

# Please set the proper path of this file.
load("train.RData");
guess = function(verbose = FALSE)
{
    test = getData();
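    # Similarity between the drawn trace and each training trace
    coefs = sapply(recogTrain, traceCorr, dat2 = test);
    num = which.max(coefs);
    # The 10th training entry is reported as the digit 0
    if(num == 10) num = 0;
    if(verbose) print(coefs);
    cat("I guess what you have input is ", num, ".\n", sep = "");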
}
guess();

To run the code, you must load the “training set”, the file train.RData, into R using the load() function, and then call guess() to play with it.
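If you want to try the classification step without the training file or a mouse, the following sketch may help. It rests on assumptions: judging only from how the code uses it, recogTrain appears to be a list of ten two-column matrices of recording points, with entries 1 to 9 for the digits 1 to 9 and entry 10 for the digit 0. The names fakeTrain and guessTrace are hypothetical, and the stand-in curves are arbitrary shapes that do not look like digits.

```r
# Same similarity as traceCorr() in the post
traceCorr = function(dat1, dat2)
{
    cor(dat1[, 1], dat2[, 1]) + cor(dat1[, 2], dat2[, 2])
}

# Hypothetical stand-in for recogTrain: ten 500 x 2 matrices, entries
# 1..9 for the digits 1..9 and entry 10 for the digit 0 (assumed layout).
# The curves are arbitrary and do NOT resemble handwritten digits.
s = seq(0, 1, length.out = 500)
fakeTrain = lapply(1:10, function(k)
    cbind(cos(k * pi * s), sin(k * pi * s / 2)))

# Non-interactive variant of guess(): the trace is passed as an argument
guessTrace = function(test, train)
{
    coefs = sapply(train, traceCorr, dat2 = test)
    num = which.max(coefs)
    if(num == 10) num = 0
    num
}

# A shifted, enlarged, slightly noisy copy of the 3rd template
set.seed(1)
test = cbind(2 * fakeTrain[[3]][, 1] + 1, 2 * fakeTrain[[3]][, 2] - 4) +
    matrix(rnorm(1000, sd = 0.01), ncol = 2)

guessTrace(test, fakeTrain)             # should pick template 3
guessTrace(fakeTrain[[10]], fakeTrain)  # 10th entry is reported as 0
```

Because the similarity ignores scale and location, the moved and enlarged copy is still matched to its own template.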

Have fun!

Download: Source code and training dataset
