Rounding in R

June 15, 2012

(This article was first published on Statistically Significant, and kindly contributed to R-bloggers)

Forgive me if you are already aware of this, but I found it quite alarming. I know that computers store numbers in binary while we enter them in decimal, so problems can arise in conversion and with floating point. But the example I have below is so simple that it really surprised me.

I was converting a function from R into MATLAB so that a colleague could use it. I tested it out on the same data and got slightly different results. Digging into the problem, the difference was due to the fact that R was rounding 4.5 to 4 and MATLAB was rounding it to 5. I thought the "4.5" must have really been "4.49999...". But that was not so.

For example, here are the results of the round function for a few numbers:
> round(0.5,0)
[1] 0
> round(1.5,0)
[1] 2
> round(2.5,0)
[1] 2
> round(3.5,0)
[1] 4
> round(4.5,0)
[1] 4
> round(5.5,0)
[1] 6
> round(6.5,0)
[1] 6

Do you see a pattern?

I tried this on R versions 2.13.1 and 2.14.0. I ran the same calls in MATLAB and it gave the results I expected, rounding each .5 up. I am no expert in computer science, so I was not sure why this was happening. Converting any decimal number that ends in .5 into binary results in a finite-length binary number. For example, 4.5 is 100.1 in binary. Because of this, I wouldn't think the discrepancy is due to floating-point error, but I really don't know.
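The claim that 4.5 is stored exactly can be checked directly in R. The following is a minimal sketch, not part of the original investigation: it prints 4.5 to 20 decimal places and contrasts it with 0.1, which has no finite binary expansion. The variable names are just illustrative.

```r
# 4.5 = 100.1 in binary, so it should be stored exactly as a double.
x <- 4.5
exact <- sprintf("%.20f", x)    # format with 20 decimal places
print(exact)

# Doubling an exact x.5 value gives an exact odd integer,
# so there is no hidden 4.4999... behind the 4.5.
is_exact_half <- (x * 2 == 9)
print(is_exact_half)

# For contrast: 0.1 cannot be represented exactly in binary,
# so extra digits show up when enough decimal places are printed.
print(sprintf("%.20f", 0.1))
```

If 4.5 prints with nothing but zeros after the 5, the rounding to 4 cannot be blamed on representation error.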

Looking at the documentation for round, I found the reason. It states in the notes, "Note that for rounding off a 5, the IEC 60559 standard is expected to be used, ‘go to the even digit’." It is a little comforting to know that there is logic behind it and that R is abiding by some standard. But why isn't MATLAB abiding by the same standard? Also, I think most people expect numbers ending in .5 to round up, not to the nearest even digit.
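If the goal is to reproduce the MATLAB results from R, one option is a small helper that rounds ties away from zero. This is only a sketch under stated assumptions: `round_half_up` is a hypothetical name, not a base R function, and the approach is only reliable for values that are exact ties in binary (such as 0.5, 1.5, 2.5); with `digits > 0`, ordinary floating-point error can still produce surprises.

```r
# Hypothetical helper (not part of base R): round ties away from zero.
round_half_up <- function(x, digits = 0) {
  scale <- 10^digits
  # Shift by 0.5 and truncate toward -Inf on the magnitude,
  # then restore the sign so that -2.5 goes to -3.
  sign(x) * floor(abs(x) * scale + 0.5) / scale
}

halves <- c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5)

r_default <- round(halves, 0)      # base R: ties go to the even digit
r_half_up <- round_half_up(halves) # sketch: ties go away from zero

print(r_default)
print(r_half_up)
```

Only the tie-breaking differs between the two; for any value that is not exactly halfway, `round` and this helper should agree.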
