(This article was first published on **Statistically Significant**, and kindly contributed to R-bloggers)

Forgive me if you are already aware of this, but I found it quite alarming. I know that computers store numbers in binary while we enter them in decimal, so problems can arise in the conversion and in floating-point arithmetic. But the example below is so simple that it really surprised me.

I was converting a function from R into MATLAB so that a colleague could use it. I tested it on the same data and got slightly different results. Digging into the problem, I found that the difference arose because R was rounding 4.5 to 4 while MATLAB was rounding it to 5. I thought the “4.5” must really have been “4.49999…”, but that was not so.

For example, this is the result of the round function for a few numbers.

> round(0.5,0)

[1] 0

> round(1.5,0)

[1] 2

> round(2.5,0)

[1] 2

> round(3.5,0)

[1] 4

> round(4.5,0)

[1] 4

> round(5.5,0)

[1] 6

> round(6.5,0)

[1] 6

Do you see a pattern?

I tried this on R versions 2.13.1 and 2.14.0. I ran the same calls in MATLAB and got the results I expected, with each .5 rounded up. I am no expert in computer science, so I was not sure why this was happening. Any decimal number that ends in .5 converts to a finite-length binary number; for example, 4.5 is 100.1 in binary. Because of this, I didn’t think the difference could be due to floating point, but I really didn’t know.
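One quick way to check that these values really are stored exactly is to print them with far more decimal places than usual, next to a value such as 0.1 that has no finite binary expansion. This is only a rough sketch; the vector name `halves` is made up for illustration.

```r
# Values ending in .5, as in the round() examples
halves <- c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5)

# Print with 20 decimal places: exactly stored values show only trailing zeros
sprintf("%.20f", halves)

# For comparison, 0.1 cannot be stored exactly, so stray digits appear
sprintf("%.20f", 0.1)

# Doubling is exact in binary floating point, so if each doubled value is
# exactly an odd whole number, the original was exactly "something and a half"
all(halves * 2 == c(1, 3, 5, 7, 9, 11, 13))
```

If the last line returns `TRUE`, the inputs are exact halves, so the surprising results are not explained by a hidden 4.49999….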

Looking at the documentation for round, I found the reason. A note there states: “Note that for rounding off a 5, the IEC 60559 standard is expected to be used, ‘*go to the even digit*’.” It is a little comforting to know that there is logic behind it and that R is abiding by a standard. But why isn’t MATLAB abiding by the same standard? Also, I think most people expect numbers ending in .5 to round up, not to the nearest even digit.
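If you need R to reproduce the MATLAB results, one option is a small helper that sends ties away from zero instead of to the even digit. The sketch below is an illustration only, not something taken from the R documentation: the name `round_half_away` is hypothetical, and the add-0.5-then-floor trick can still misbehave for values that are not stored exactly (for instance when rounding to decimal places), so treat it as a starting point rather than a drop-in replacement.

```r
# Hypothetical helper: round with ties going away from zero
# (assumes x is numeric and digits is a single whole number)
round_half_away <- function(x, digits = 0) {
  scale <- 10^digits
  sign(x) * floor(abs(x) * scale + 0.5) / scale
}

x <- c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5)

round(x)               # R's default: ties go to the even digit
round_half_away(x)     # ties go up for positive numbers
round_half_away(-2.5)  # and down (away from zero) for negative numbers
```

Comparing the first two results side by side makes the difference between the two tie-breaking rules easy to see on the same inputs used above.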
