Correct Datetime / POSIXct behaviour for R and kdb+

February 3, 2009

(This article was first published on Thinking inside the box, and kindly contributed to R-bloggers)

We have started to look into kdb+ as a possible
high-performance column-store backend. Kx offers
free trials, and so I have played with
this for a day or two: the general system, data loads and dumps, and in particular
the interface to R.
The interface ships as just a few files (one C source file with the interface
code, one R file to access the C code, one object file to link against, one header
file and a simple Makefile), so it took just a couple of minutes to turn it
into a proper CRAN-style R
package.

Anyway, the reason for this post is that the R / kdb+ glue code works well,
but not for datetimes. I really like to be able to pass date/time objects
natively between systems as easily as, say, numbers or strings (see
e.g. my Rcpp package
for doing this with R and C++), and I was a bit annoyed when the millisecond
timestamps didn't move smoothly. It turns out that the basic converter function in the
code had a number of problems: it converted to integer (losing the sub-second part), handled only a
single scalar rather than vectors, and erroneously decremented a
reference count. A better version, in my view, is as follows:

static SEXP from_datetime_kobject(K x)
{
	SEXP result;
	int i, length = x->n;
	/* kdb+ stores datetimes as fractional days since 2000-01-01;  */
	/* 10957 is the number of days from 1970-01-01 to 2000-01-01,  */
	/* and 86400 is the number of seconds per day.                 */
	if (scalar(x)) {
		result = PROTECT(allocVector(REALSXP, 1));
		REAL(result)[0] = (kF(x)[0] + 10957) * 86400;
	} else {
		result = PROTECT(allocVector(REALSXP, length));
		for (i = 0; i < length; i++) {
			REAL(result)[i] = (kF(x)[i] + 10957) * 86400;
		}
	}
	/* Mark the numeric vector as a date/time object for R. */
	SEXP datetimeclass = PROTECT(allocVector(STRSXP, 2));
	SET_STRING_ELT(datetimeclass, 0, mkChar("POSIXt"));
	SET_STRING_ELT(datetimeclass, 1, mkChar("POSIXct"));
	setAttrib(result, R_ClassSymbol, datetimeclass);
	UNPROTECT(2);	/* result and datetimeclass */
	return result;
}

This handles vectors as well as scalars, converts kdb+'s 'fractional days
since Jan 1, 2000' to the Unix standard of seconds since the epoch of Jan 1, 1970
(including the R extension of fractional seconds) and, just as importantly, sets
the class attribute to POSIXt POSIXct as needed by R. The constant 10957 is the
number of days between the two epochs, and 86400 is the number of seconds per day. With
that, a simple select max datetime from table does just that,
and vectors of timestamped records of trades or quotes or whatever also
arrive in R with proper POSIXct behaviour. Note that TZ needs to be set to UTC, though,
or you get a timezone offset you may not want.
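
To check the arithmetic without R or kdb+ installed, here is a minimal standalone sketch of the same conversion in plain C. The function names (kdb_days_to_unix_seconds, kdb_days_to_unix_seconds_vec, format_unix_seconds_utc) are hypothetical helpers made up for this illustration; they are not part of the Kx interface code or of R. The sketch assumes the input is a double holding fractional days since Jan 1, 2000, as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Days from the Unix epoch (1970-01-01) to the kdb+ epoch (2000-01-01). */
#define KDB_EPOCH_OFFSET_DAYS 10957.0
#define SECONDS_PER_DAY       86400.0

/* Hypothetical helper: convert one kdb+ datetime (fractional days since
 * 2000-01-01) to fractional seconds since 1970-01-01 UTC. */
double kdb_days_to_unix_seconds(double kdb_days)
{
    return (kdb_days + KDB_EPOCH_OFFSET_DAYS) * SECONDS_PER_DAY;
}

/* Hypothetical helper: the same conversion applied to n values. */
void kdb_days_to_unix_seconds_vec(const double *in, double *out, size_t n)
{
    size_t i;
    assert(n == 0 || (in != NULL && out != NULL));
    for (i = 0; i < n; i++)
        out[i] = kdb_days_to_unix_seconds(in[i]);
}

/* Hypothetical helper: render whole seconds as "YYYY-MM-DD HH:MM:SS" in UTC.
 * The fractional part is truncated here for display only.
 * Returns the number of characters written, or 0 on failure. */
size_t format_unix_seconds_utc(double unix_seconds, char *buf, size_t buflen)
{
    time_t whole = (time_t) unix_seconds;
    struct tm *utc = gmtime(&whole);
    if (utc == NULL)
        return 0;
    return strftime(buf, buflen, "%Y-%m-%d %H:%M:%S", utc);
}
```

A kdb+ value of 0.0 maps to 946684800 seconds, which is midnight UTC on Jan 1, 2000, and 0.5 maps to noon on the same day. Formatting with gmtime rather than localtime mirrors the point about TZ: the converted number is a UTC-based count, and any offset only appears when it is displayed in a local timezone.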
