Open Once, Close Automatically: A Ressource Connection Pattern for R

[This article was first published on R on R & Data Analysis - Eric Stemmler, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Why you should care – at least a little bit

Dangling database connections rarely cause immediate data leaks, but they are a common root cause of outages, degraded performance, and availability incidents. In regulated or public-sector systems, availability failures can have legal and contractual consequences. Managing connections properly is therefore not just a technical concern, but a reliability concern.

https://devclass.com/2020/03/27/github-reveals-database-infrastructure-was-the-villain-behind-february-spate-of-outages-again/

The third incident on February 25, again involved ProxySQL, when “active database connections crossed a critical threshold that changed the behavior of this new infrastructure. Because connections remained above the critical threshold after remediation, the system fell back into a degraded state.”

These kinds of failures may seem operational, but in production environments they can trigger SLA breaches, customer impact, and potential regulatory risk — particularly where availability and resilience are part of compliance requirements (e.g., GDPR’s Article 32 for systems processing personal data). Properly managing connections — opening them only when needed and ensuring deterministic cleanup — is therefore not just “good hygiene” in R programming, but a practical aspect of building robust, reliable systems.

These risks are also relevant in the R ecosystem, where database access is common in long-running processes such as Shiny applications, plumber APIs, and scheduled data pipelines. While R is often associated with academic analysis, it is also widely used in production systems across government and industry. In Shiny apps, for example, each user session may open database connections, and poorly managed connections can quickly accumulate and exhaust backend resources. Because these applications are typically long-lived, small connection leaks can silently degrade performance or cause outages. Proper connection management is therefore not just good practice in R—it is essential for building reliable, scalable, and production-ready applications.

This post introduces a pattern using around on.exit() that makes connections safe, automatic, and worry-free.

The Pattern

From the R documentation of function on.exit:

on.exit records the expression given as its argument as needing to be executed when the current function exits (either naturally or as the result of an error). This is useful for resetting graphical parameters or performing other cleanup actions.

This is a versatile feature of R-funcitons and we can use it to build a connection function that will ensure the connection is closed again. In essence, we want to build a function that can do something like this:

get_data <- function() {
  connect()
  DBI::dbGetQuery(...)
}

In order for this to work, connect() needs to record the R-expression that will close the opened connection when it’s calling function exits, in R-parlance this is called the parent frame and (unsurprisingly) it can be retrieved via function parent.frame(). Equipped with these two functions we can now write our connect() function:

connect <- function() {
  caller <- parent.frame()
  # If connection already exists in this frame, reuse it
  if (exists(".db", envir = caller, inherits = FALSE)) {
    return(get(".db", envir = caller))
  }
  # Create new connection
  db <- dbConnect(odbc::odbc(), "mot")
  # Store it in the caller's frame
  assign(".db", db, envir = caller)
  # Ensure cleanup when the calling function exits
  do.call(
    what = on.exit,
    args = list(
      quote({
        if (exists(".db", inherits = FALSE)) {
          dbDisconnect(.db)
          rm(.db)
        }
      }),
      add = TRUE
    ),
    envir = caller
  )
  db
}

Note also in the above function definition that in order to be able to reuse an opened connection, that the conncetion Object (db) needs to be stored somewhere. Returning it as a result of the function is no guarantee that it will be assigned to a symbol. That’s problematic because everything that isn’t bound to a symbol in a R session is subject to R’s garbage collection. That’s why, in addition it is also assigned to a variable called .db into the calling frame.

One problem with this implementation is that its vurnable to a name collision: We do not know whether there would already exist a symbol in the calling frame that - perhaps coincidentally - has the same name .db, but represents perhaps something entirely different.

To remedy this, we can explicitly create a closure that is, a function that carries it’s own environment. For more details see Hadley Wickham’s explanation here.

I’m providing here a toy example, where our connection object is represented by a random 3-letter string. Setting this variable to NULL is meant to represent to disconnect from the ressource. You can replace these lines with your code for actually connecting and disconnecting from the ressource.

connect <- local({
  # Private environment, created once
  state <- new.env(parent = emptyenv())
  function() {
    caller <- parent.frame()
    if (identical(caller, .GlobalEnv)) {
      warning("connect() was called from Global Environment, cannot perform automatic disconnect")
    }

    # If connection already exists, reuse it
    if (exists("db", envir = state, inherits = FALSE)) {
      cat("connection already exists", state$db, '\n')
    } else { 
      # Create connection
      state$db <- paste0("db_", paste0(sample(letters, 3), collapse = ''))
      # Ensure cleanup when the *calling* function exits
      cleanup <- function() {
        if (exists("db", envir = state)) {
          # dbDisconnect(state$db)
          cat("disconnecting", state$db, '\n')
          state$db <- NULL
          rm("db", envir = state)
        }
      }
      # do.call(on.exit, list(quote(cleanup()), add = TRUE), envir = caller)
      do.call(on.exit, list(
      substitute(FUN(), list(FUN = cleanup)), add = TRUE), envir = caller)
      cat("new connection", state$db, '\n')
    }
  }
})

foo <- function() {
  cat("inside foo\n")
  connect()
  bar()
}

bar <- function() {
  cat("inside bar\n")
  goo()
}

goo <- function() {
  cat("inside goo\n")
  connect()
}

foo()
## inside foo
## new connection db_kpt 
## inside bar
## inside goo
## connection already exists db_kpt 
## disconnecting db_kpt
goo()
## inside goo
## new connection db_npc 
## disconnecting db_npc

This pattern uses a closure to hold the connection state and on.exit() to register cleanup logic in the caller’s frame. Applying substitute() to the cleanup function is important, because our function definition only lives within the created closure, not in the parent’s frame. What substitute() does is to substitute the symbol (FUN) by our cleanup functions unevaluated parsing tree. This in itself again is a closure (a function plus an attached environment) which retains access to its private state even when evaluated later. The result is a safe, idempotent connection manager that works transparently across nested function calls without leaking state or requiring explicit teardown.

No global state, no name collisions, and no burden on the caller.

Note that, when connect() is run in the Global Environment, the expression provided to on.exit() will evaluated and it’s the responsibility of the caller to close the connection manually.

connect()
new connection db_ykc
Warning:
In connect() : connect() was called from Global Environment, cannot perform automatic disconnect
To leave a comment for the author, please follow the link and comment on their blog: R on R & Data Analysis - Eric Stemmler.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)