
Rounding

[This article was first published on pharmaverse blog, and kindly contributed to R-bloggers].

Rounding methods

Both SAS and base R have the function round(), which rounds the input to the specified number of decimal places. However, they use different approaches when rounding off a 5: SAS round() rounds half up, whereas base R round() rounds half to even.

Although base R does not have the option for “round half up”, there are functions available in other R packages (e.g., janitor, tidytlg).
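A quick check in base R illustrates the round-to-even behaviour. The inputs below are chosen for illustration only; they are halves that binary floating point can represent exactly, so the result reflects the rounding rule rather than representation error.

```r
# Halves that are exactly representable in binary floating point
x <- c(0.5, 1.5, 2.5, 3.5)

# base R resolves each tie towards the nearest even integer
round(x)
# [1] 0 2 2 4
```

Under round half up, the same inputs would give 1, 2, 3 and 4.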

Several rounding methods are in common use. The table below shows each of them applied to the number 1.45.

| | round half up | round to even | round up | round down | round towards zero |
|---|---|---|---|---|---|
| Example: 1.45 | 1.5 (round to 1 decimal place) | 1.4 (round to 1 decimal place) | 2 | 1 | 1 |
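For a positive input such as 1.45, round down and round towards zero give the same answer. A negative input separates them. The following is a minimal base R sketch using ceiling(), floor() and trunc(); the value -1.45 is an illustrative choice and does not come from the table.

```r
x <- -1.45

ceiling(x)  # round up (towards +Inf)
# [1] -1
floor(x)    # round down (towards -Inf)
# [1] -2
trunc(x)    # round towards zero
# [1] -1
```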

Here are the corresponding ways to implement these methods in SAS and R.

| | round half up | round to even | round up | round down | round towards zero |
|---|---|---|---|---|---|
| SAS | round() | rounde() | ceil() | floor() | int() |
| R | janitor::round_half_up(), tidytlg::roundSAS() | base::round() | base::ceiling() | base::floor() | base::trunc() |

This table is summarized from links below, where more detailed discussions can be found –


Round half up in R

The motivation for having a ‘round half up’ function is clear: it’s a widely used rounding method, but there are no such options available in base R.

Multiple forums have discussed this topic, and quite a few functions are already available. But which ones should you choose? Are they safe options?

The first time I needed to round half up in R, I chose the function from a PHUSE paper and applied it to my study. It worked fine for a while, until I encountered the following precision issue when double programming in R for TLGs made in SAS.


Numerical precision issue

Example of rounding half up for 2436.845, with 2 decimal places:

# a function that rounds half up
# exact copy from: https://www.lexjansen.com/phuse-us/2020/ct/CT05.pdf
ut_round <- function(x, n = 0) {
  # x is the value to be rounded
  # n is the precision of the rounding
  scale <- 10^n
  y <- trunc(x * scale + sign(x) * 0.5) / scale
  # Return the rounded number
  return(y)
}
# round half up for 2436.845, with 2 decimal places
ut_round(2436.845, 2)
[1] 2436.84

The expected result is 2436.85, but the output rounds it down. Thanks to community effort, a discussion and a resolution are already available in a StackOverflow post:

There are numerical precision issues, e.g., round2(2436.845, 2) returns 2436.84. Changing z + 0.5 to z + 0.5 + sqrt(.Machine$double.eps) seems to work for me. – Gregor Thomas Jun 24, 2020 at 2:16
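To see why the unpatched function rounds down, it can help to inspect the scaled value directly. This is a small diagnostic sketch, not part of the original function: 2436.845 has no exact binary representation, and after multiplying by 100 the stored value falls just below the tie at 243684.5.

```r
x <- 2436.845

# Print the scaled value with 17 significant digits to expose the
# representation error
sprintf("%.17g", x * 100)

# The scaled value sits just below the tie ...
x * 100 < 243684.5
# [1] TRUE

# ... so adding 0.5 and truncating lands on 243684, not 243685
trunc(x * 100 + 0.5)
# [1] 243684
```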

After the fix:

# revised rounds half up
ut_round1 <- function(x, n = 0) {
  # x is the value to be rounded
  # n is the precision of the rounding
  scale <- 10^n
  y <- trunc(x * scale + sign(x) * 0.5 + sqrt(.Machine$double.eps)) / scale
  # Return the rounded number
  return(y)
}
# round half up for 2436.845, with 2 decimal places
ut_round1(2436.845, 2)
[1] 2436.85
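Note that ut_round1() adds the tolerance term in the positive direction whatever the sign of x, so negative ties may not be handled symmetrically. One possible variant, sketched below, works on the absolute value and restores the sign afterwards. The name round_half_up_sketch is hypothetical and not taken from the post or from any package.

```r
# Hypothetical helper (not from the original post): apply the tolerance
# to the absolute value, then restore the sign
round_half_up_sketch <- function(x, n = 0) {
  scale <- 10^n
  z <- abs(x) * scale + 0.5 + sqrt(.Machine$double.eps)
  sign(x) * trunc(z) / scale
}

round_half_up_sketch(2436.845, 2)
# [1] 2436.85
round_half_up_sketch(-2436.845, 2)
# [1] -2436.85
```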

We are not alone

The same issue occurred in the following functions/options as well, and has been raised by users:


Which ones to use?

The following functions have the precision issue mentioned above fixed; they all share the same logic from this StackOverflow post.


Are they safe options?

Those “round half up” functions do not offer the same level of precision and accuracy as the base R round function.

For example, let’s consider a value a that is slightly less than 1.5. If we use the round half up approach to round a to 0 decimal places, an output of 1 is expected. However, those functions yield a result of 2 because 1.5 - a is less than sqrt(.Machine$double.eps).

a <- 1.5 - 0.5 * sqrt(.Machine$double.eps)
ut_round1(a, 0)
[1] 2
janitor::round_half_up(a, digits = 0)
[1] 2

This behavior aligns with the floating-point comparison functions all.equal() and dplyr::near(), whose default tolerance is .Machine$double.eps^0.5; under that tolerance, 1.5 and a are treated as equal.

all.equal(a, 1.5)
[1] TRUE
dplyr::near(a, 1.5)
[1] TRUE
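The same point can be checked with base R comparisons alone. In this small sketch, a is strictly smaller than 1.5, yet the gap is below the tolerance that the patched functions add before truncating.

```r
a <- 1.5 - 0.5 * sqrt(.Machine$double.eps)

# a is strictly below the tie ...
a < 1.5
# [1] TRUE

# ... but the gap is smaller than the tolerance added before truncation
(1.5 - a) < sqrt(.Machine$double.eps)
# [1] TRUE
```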

We can get the expected results from base R round as it provides greater accuracy.

round(a)
[1] 1

Here is an example where base R round reaches the precision limit:

# b is slightly less than 1.5
b <- 1.5 - 0.5 * .Machine$double.eps
# 1 is expected but the result is 2
round(b)
[1] 2
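One way to read this result, assuming a typical IEEE 754 double-precision platform: the offset 0.5 * .Machine$double.eps is half the spacing between adjacent doubles near 1.5, so the subtraction is expected to round back to exactly 1.5 before round() ever sees the value. The check below is a sketch of that reasoning.

```r
b <- 1.5 - 0.5 * .Machine$double.eps

# The stored value is indistinguishable from 1.5, so round() is
# effectively being asked to round the tie itself
b == 1.5
# [1] TRUE

# Inspect the stored digits
sprintf("%.17g", b)
```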

The precision and accuracy requirements can vary depending on the application. Therefore, it is essential to be aware of each function’s behaviour in your specific context before making a choice.


Conclusion

With the differences in default behaviour across languages, you could consider your QC strategy and whether an acceptable level of fuzz in the electronic comparisons could be allowed for cases such as rounding when making comparisons between 2 codes written in different languages as long as this is documented. Alternatively you could document the exact rounding approach to be used in the SAP and then match this regardless of programming language used. – Ross Farrugia
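As a rough illustration of what a comparison with fuzz might look like, the sketch below compares two sets of rounded results with and without a tolerance. The vectors and the tolerance value are hypothetical; in practice the tolerance would come from the documented QC strategy.

```r
# Hypothetical rounded results from two independent programs
sas_vals <- c(2436.85, 1.50, 10.25)
r_vals   <- c(2436.84, 1.50, 10.25)

# Assumed tolerance: slightly more than one unit in the last reported
# decimal place (2 decimals here)
fuzz <- 0.011

# Exact comparison flags the first pair
sas_vals == r_vals
# [1] FALSE  TRUE  TRUE

# Comparison with fuzz accepts it
abs(sas_vals - r_vals) <= fuzz
# [1] TRUE TRUE TRUE
```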

Thanks Ross Farrugia, Ben Straub, Edoardo Mancini and Liming for reviewing this blog post and providing valuable feedback!

If you spot an issue or have different opinions, please don’t hesitate to raise them through pharmaverse/blog!


Last updated

2024-04-29 13:13:41.901485


Reuse

CC BY 4.0

Citation

BibTeX citation:
@online{zhang2023,
  author = {Zhang, Kangjie},
  title = {Rounding},
  date = {2023-08-22},
  url = {https://pharmaverse.github.io/blog/posts/2023-07-24_rounding/rounding.html},
  langid = {en}
}
For attribution, please cite this work as:
Zhang, Kangjie. 2023. “Rounding.” August 22, 2023. https://pharmaverse.github.io/blog/posts/2023-07-24_rounding/rounding.html.