Better logging in R (aka futile.logger 1.3.0 released)

March 10, 2013
By

(This article was first published on Cartesian Faith » R, and kindly contributed to R-bloggers)

In many languages logging is now part of the batteries included with a language. This isn’t yet the case in R so most people make do with cat commands laced through their code. Then after development, when the system is being run for real, a lot of those cat statements are commented out so it doesn’t fill up log files and hide real error messages. When there is a bug, the cat commands are pulled down from the attic and so on. It would be far easier to have a simple logging utility that managed the various debug statements in your code such that they could all be turned on and off as a set or by severity.

This dream is no longer futile. The latest release of my package futile.logger has been rewritten with usability in mind (and is now built on lambda.r). It itself comes with batteries included and requires zero configuration to work.

Basics

You can use futile.logger right out of the box as a replacement for cat. Here are some examples. Notice that futile.logger decorates log statements with generally useful information, like the logging level and a timestamp.

library(futile.logger)
flog.info("My first log statement with futile.logger")
flog.warn("This statement has higher severity")
flog.fatal("This one is really scary")

Format strings are built into the logging system, so log statements can be dynamic based on the data.

flog.info("This is my %s log statement, %s", 'second', 'ever')
x <- rnorm(400)
flog.info("The variance of the sample is %s", var(x))

Use the flog.threshold function to choose which messages should be displayed. If you no longer want to see INFO-level messages, then change the threshold to WARN.

flog.threshold(WARN)
flog.info("Log statement %s is hidden!", 3)
flog.warn("However warning messages will still appear")

Concepts

To take full advantage of futile.logger, it is important to understand the handful of concepts that the package introduces. For anyone familiar with log4j or its myriad clones this should be familiar.

LOGGERS

A logger is simply a namespace bound to a threshold, an appender, and a layout. When futile.logger is loaded, a single logger is created, which is known as the ROOT logger. All other loggers inherit from the ROOT logger in terms of namespace and also configuration. The ROOT logger has a threshold of INFO, logs to standard out, and has a simple layout for the messages.

Loggers are configured automatically whenever they are referenced (for example when changing the threshold, or logging to a logger) inheriting the settings of the ROOT logger. Suppose you want a separate namespace for data i/o. You can make log messages that reference a logger data.io simply by specifying a different logger name.

> flog.threshold(INFO)
NULL
> flog.info("Loading data.frame", name='data.io')

Even though no logger was explicitly defined, futile.logger knows how to print messages since the logger data.io inherits from ROOT. Having a new logger is nice, but it’s not particularly useful if it behaves identically to the default logger. What if we want to see TRACE level messages for our data i/o but not clutter up the rest of our logging?

> flog.threshold(TRACE, name='data.io')
NULL
> 
> flog.trace("Connecting to data store")

Whoops, this is gong to the ROOT logger, so obviously it won’t print (since it’s still at INFO). Let’s route the message to the data.io logger we referenced earlier.

> flog.trace(“Connecting to data store”, name=’data.io’)
TRACE [2013-03-10 16:03:40] Connecting to data store

That looks better. This is how the hierarchy works since futile.logger will use values specified higher in the hierarchy if they don’t exist for the current logger. We can even change the appender for data.io to write to a file instead.

> flog.appender(appender.file("data.io.log"), name="data.io")
NULL
> flog.trace("Connecting to data store", name='data.io')

What about intermediate points in the hierarchy? Suppose we have other data operations logging to the logger data. We want these to be at TRACE, but we want them to continue writing to the console.

> flog.threshold(TRACE, name='data')
NULL
> flog.trace("Merging data.frames", name='data')
TRACE [2013-03-10 16:10:14] Merging data.frames
> flog.trace("Connecting to data store", name='data.io')

So now the logger data will print all log messages to console, while data.io will write to a file. This should give you a good idea of how to manipulate loggers in futile.logger.

Logging Levels and Thresholds

It’s probably clear that there are multiple log levels available. Each log message has an associated log level that ranges from TRACE to FATAL. The levels form a hierarchy that not only indicate the severity of the message but also when it should be displayed. This is controlled by the threshold, which only allows messages with a severity greater than or equal to the threshold to be displayed. When futile.logger is loaded, the default log level is INFO. To change the logging threshold use the futile.threshold function. The function has two arguments: the new threshold and the name of the logger. The available thresholds are the same as the log levels: TRACE, DEBUG, INFO, WARN, ERROR, FATAL. Choose the threshold that you want to see and all messages of that severity and higher will display.

This means that the default INFO threshold will display all log messages with a log level of INFO, WARN, ERROR, FATAL. If you only want to see errors, then set the threshold to ERROR.

flog.threshold(ERROR)
flog.info("Will print: %s", FALSE)
flog.warn("Will print: %s", FALSE)
flog.error("Will print: %s", TRUE)
flog.fatal("Will print: %s", TRUE)

With this simple mechanism you can add log statements in your code and control the display of the messages by simply changing the logging threshold. Hence for development keep the threshold on INFO, for debugging change to TRACE, and for production set to WARN.

The name argument determines the logger itself, as discussed above.

Appenders

An appender defines where a log message goes. The default is to standard out, but it is also possible to write log messages to files, URLs, message queues, etc. Writing your own appender is as simple as creating a function that has the following signature.

appender.fn <- function(line) { }

Assigning it to a given logger follows the same convention as for layouts.

flog.appender(appender.fn)
flog.appender(appender.fn, name='test.logger')

Similarly calling the function without an explicit logger will bind this appender to the ROOT logger.

The following two appenders are provided out of the box in futile.logger. Each function returns a function, so when binding an appender to a logger, be sure to execute the top level function.

  • appender.console()
  • appender.file(filename)

LAYOUTS

Suppose you want to display log messages in a different format? To do this you would create a custom layout function, which is responsible for returning a string that represents the log message. The function interface is

layout.fn <- function(level, message, ...) { }

where the ellipsis argument represents represent any arguments passed via the logging functions described above. You set the layout by assigning the layout function to the given logger.

flog.layout(layout.fn)
flog.layout(layout.fn, name='test.logger')

If you don’t provide a logger, futile.logger will use the ROOT logger. This means all loggers will be impacted by the change in layout (except loggers that explicitly defined a different layout).

Package Support

If you happen to use a package that uses futile.logger, you can control logging of messages interactively on a per package basis. That means you, as the package user, decides how much logging information you wish to see for a given package.

> library(tawny)
> library(fractalrock)
> prices <- getPortfolioPrices(LETTERS[1:10], 100)
> returns <- apply(prices,2,Delt)[-1,]
> s <- cov.shrink(as.xts(returns,order.by=index(prices)[-1]))
<em id="__mceDel">INFO [2013-03-08 09:52:07] Got intensity k = 140.006097751888 and coefficient d = 1</em>
> flog.threshold(WARN, name='tawny')
NULL
> s <- cov.shrink(as.xts(returns,order.by=dates[-1]))
> flog.info("Got covariance matrix")
INFO [2013-03-08 09:55:25] Got covariance matrix

Hence by using futile.logger, package authors can balance their debugging needs with a user’s usage needs. It also means that for debugging issues, a user can change the log threshold for the package and give a more detailed output to the package maintainer.

Conclusion

Futile.logger is now more robust and easier to use. Install from CRAN or read more documentation and the source on the github page.


To leave a comment for the author, please follow the link and comment on his blog: Cartesian Faith » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.