(This article was first published on **jottR**, and kindly contributed to R-bloggers)

If your native code takes more than a few seconds to finish, it is a nice courtesy to the user to check for user interrupts (Ctrl-C) once in a while, say, every 1,000 or 1,000,000 iterations. The C-level API of R provides `R_CheckUserInterrupt()` for this (see 'Writing R Extensions' for more information on this function). Here's what the code would typically look like:

```c
for (int ii = 0; ii < n; ii++) {
  /* Some computationally expensive code */
  if (ii % 1000 == 0) R_CheckUserInterrupt();
}
```

This uses the modulo operator `%` and tests when the remainder is zero, which happens every 1,000th iteration. When this occurs, it calls `R_CheckUserInterrupt()`, which will interrupt the processing and “return to R” whenever an interrupt is detected.
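
As a minimal, self-contained sketch of this pattern (not the benchmark code used later in this post): the helper `check_interrupt_stub()` and the constant `CHECK_INTERVAL` are hypothetical names introduced here so that the loop compiles outside of R. In a real package the stub would be a call to `R_CheckUserInterrupt()`. The function simply counts how often the check fires.

```c
#include <stddef.h>

/* Hypothetical stand-in for R_CheckUserInterrupt(), so that this sketch
   compiles without the R headers. It only counts how often it is called. */
static size_t n_checks = 0;
static void check_interrupt_stub(void) { n_checks++; }

/* Assumed interval; a compile-time constant, as in the loop above. */
#define CHECK_INTERVAL 1000

/* Runs n iterations and returns how many times the check was triggered. */
size_t run_loop(int n) {
  n_checks = 0;
  for (int ii = 0; ii < n; ii++) {
    /* Some computationally expensive code would go here */
    if (ii % CHECK_INTERVAL == 0) check_interrupt_stub();
  }
  return n_checks;
}
```

Because the test is true at `ii = 0`, the check fires on the very first iteration and then once per 1,000 iterations, so `run_loop(10000)` triggers it 10 times.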

Interestingly, it turns out that it is *significantly faster to do this check every k = 2^m iterations*, e.g. instead of doing it every 1,000 iterations, it is faster to do it every 1,024 iterations. Similarly, instead of, say, doing it every 1,000,000 iterations, do it every 1,048,576, not one less (1,048,575) or one more (1,048,577). The difference is so large that it is even 2-3 times faster to call `R_CheckUserInterrupt()` every 256 iterations rather than, say, every 1,000,000 iterations, which at least to me was a bit counterintuitive the first time I observed it.

Below are some benchmark statistics supporting the claim that testing / calculating `ii % k == 0` is faster for *k = 2^m* (blue) than for other choices of *k* (red).

Note that the times are on the log scale (the results are also tabulated at the end of this post). Now, will it make a big difference to the overall performance of your code if you choose, say, 1,048,576 instead of 1,000,000? Probably not, but on the other hand, it does not hurt to pick an interval that is a power of two (*2^m*). This observation may also be useful in algorithms that make heavy use of the modulo operator.

So why is `ii % k == 0` a faster test when *k = 2^m*? I can only speculate. For instance, the integer *2^m* is a binary number with all bits but one set to zero. It might be that this is faster to test for than other bit patterns, but I don’t know if this is because of how the native code is optimized by the compiler and/or if it goes down to the hardware/CPU level. I’d be interested in feedback and in hearing your thoughts on this.
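
One plausible explanation, offered here as an assumption rather than something the benchmarks establish: when *k* is a power of two, `ii % k == 0` holds exactly when the lowest *m* bits of `ii` are zero, so an optimizing compiler can replace the remainder computation with a single bitwise AND. The sketch below shows the two forms side by side for *k* = 1,024, using the hypothetical helper names `is_multiple_mod()` and `is_multiple_mask()`. It does not claim that this is the code GCC actually generated for the benchmarks in this post.

```c
#include <stdbool.h>

/* Assumed interval for illustration: 2^10 = 1024. */
#define K 1024

/* The test as written in the loop: remainder of a division by K. */
bool is_multiple_mod(int ii) {
  return ii % K == 0;
}

/* Equivalent test for non-negative ii when K is a power of two:
   the lowest 10 bits of ii must all be zero, which a single AND
   with K - 1 (binary 1111111111) checks. */
bool is_multiple_mask(int ii) {
  return (ii & (K - 1)) == 0;
}
```

For the non-negative loop counters used here, both functions return the same result for every `ii`. For an interval such as 1,000 there is no such single-mask shortcut, because 1,000 is not a power of two.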

## Details on how the benchmarking was done

I used the inline package to generate a set of C-level functions with varying interrupt intervals *k*. I'm not passing *k* as a parameter to these functions. Instead, I use it as a constant value so that the compiler can optimize as far as possible, but also in order to imitate how most code is written. This is why I generate multiple C functions. I benchmarked across a wide range of interval choices using the microbenchmark package. The C functions (with corresponding R functions calling them) and the corresponding benchmark expressions to be called were generated as follows:

```r
## The interrupt intervals to benchmark
## (a) Classical values
ks <- c(1, 10, 100, 1000, 10e3, 100e3, 1e6)
## (b) 2^m values and the ones before and after
ms <- c(2, 5, 8, 10, 16, 20)
as <- c(-1, 0, +1) + rep(2^ms, each=3)

## List of unevaluated expressions to benchmark
mbexpr <- list()

for (k in sort(c(ks, as))) {
  name <- sprintf("every_%d", k)

  ## The C function
  assign(name, inline::cfunction(c(length="integer"), body=sprintf("
    int i, n = asInteger(length);
    for (i=0; i < n; i++) {
      if (i %% %d == 0) R_CheckUserInterrupt();
    }
    return ScalarInteger(n);
  ", k)))

  ## The corresponding expression to benchmark
  mbexpr <- c(mbexpr, substitute(every(n), list(every=as.symbol(name))))
}
```

The actual benchmarking of the 25 cases was then done by calling:

```r
n <- 10e6  ## Number of iterations
stats <- microbenchmark::microbenchmark(list=mbexpr)
```

| expr | min | lq | mean | median | uq | max |
|---|---|---|---|---|---|---|
| every_1(n) | 174.05 | 178.77 | 184.68 | 180.76 | 183.97 | 262.69 |
| every_3(n) | 66.78 | 69.16 | 72.10 | 70.20 | 72.42 | 114.75 |
| every_4(n) | 53.80 | 55.31 | 56.98 | 56.32 | 57.26 | 69.71 |
| every_5(n) | 46.17 | 47.52 | 49.42 | 48.83 | 49.99 | 66.98 |
| every_10(n) | 33.31 | 34.32 | 36.58 | 35.12 | 36.66 | 54.83 |
| every_31(n) | 23.78 | 24.45 | 25.74 | 25.10 | 25.83 | 58.10 |
| every_32(n) | 17.81 | 18.25 | 18.91 | 18.82 | 19.22 | 25.25 |
| every_33(n) | 22.90 | 23.58 | 24.90 | 24.59 | 25.26 | 34.45 |
| every_100(n) | 18.14 | 18.55 | 19.47 | 19.15 | 19.63 | 27.42 |
| every_255(n) | 19.96 | 20.56 | 21.67 | 21.16 | 21.98 | 42.53 |
| every_256(n) | 7.07 | 7.18 | 7.54 | 7.40 | 7.63 | 10.73 |
| every_257(n) | 19.32 | 19.72 | 20.60 | 20.36 | 20.85 | 29.66 |
| every_1000(n) | 16.37 | 16.98 | 17.81 | 17.53 | 18.08 | 24.24 |
| every_1023(n) | 19.54 | 20.16 | 20.94 | 20.50 | 21.25 | 28.20 |
| every_1024(n) | 6.32 | 6.40 | 6.81 | 6.60 | 6.83 | 13.32 |
| every_1025(n) | 18.58 | 19.05 | 19.91 | 19.74 | 20.08 | 30.51 |
| every_10000(n) | 15.92 | 16.76 | 17.40 | 17.38 | 17.82 | 24.10 |
| every_65535(n) | 18.92 | 19.60 | 20.41 | 20.10 | 20.80 | 27.69 |
| every_65536(n) | 6.08 | 6.16 | 6.62 | 6.39 | 6.57 | 13.40 |
| every_65537(n) | 22.08 | 22.70 | 23.79 | 23.69 | 24.35 | 31.57 |
| every_100000(n) | 16.16 | 16.55 | 17.20 | 17.05 | 17.61 | 24.54 |
| every_1000000(n) | 16.02 | 16.42 | 17.17 | 16.85 | 17.42 | 21.84 |
| every_1048575(n) | 18.88 | 19.23 | 20.27 | 19.85 | 20.52 | 30.21 |
| every_1048576(n) | 6.08 | 6.18 | 6.53 | 6.47 | 6.58 | 12.64 |
| every_1048577(n) | 22.88 | 23.23 | 24.28 | 23.83 | 24.63 | 31.84 |

I get similar results across various operating systems (Windows, OS X and Linux), all using the GNU Compiler Collection (GCC).
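
As a quick check of the "2-3 times faster" observation against the table above, the sketch below divides tabulated median timings. The function name `speedup()` and the constant names are hypothetical; the numbers are copied from the `median` column.

```c
/* Ratio of two timings: how many times faster `fast` is than `slow`. */
double speedup(double slow, double fast) {
  return slow / fast;
}

/* Median timings copied from the table above. */
static const double median_every_256     = 7.40;
static const double median_every_1000000 = 16.85;
static const double median_every_1024    = 6.60;
static const double median_every_1000    = 17.53;
```

With these values, `speedup(median_every_1000000, median_every_256)` is about 2.3 and `speedup(median_every_1000, median_every_1024)` is about 2.7.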

Feedback and comments are welcome!

To reproduce these results, do:

```r
> path <- 'https://raw.githubusercontent.com/HenrikBengtsson/jottr.org/master/blog/20150604%2CR_CheckUserInterrupt'
> html <- R.rsp::rfile('R_CheckUserInterrupt.md.rsp', path=path)
> !html ## Open in browser
```
