I’ve written before about the Julia language. As someone who is very active in the R community, I am biased of course, and have been (and remain) a skeptic about Julia. But I would like to report on a wonderful talk I attended today at Stanford. To my surprise and delight, the speaker, Viral Shah of Julia Computing Inc, focused on the “computer science-y” details, i.e. the internals and the philosophy, which I found quite interesting and certainly very impressive.
I had not previously known, for instance, how integral the notion of typing is in Julia, e.g. integer vs. float, or how much thought within the Julia group had led to this emphasis. And it was fun to see the various cool Julia features that appeal to a systems guy like me, e.g. an instant view of the assembly language implementation of a Julia function.
I was particularly interested in one crucial aspect that separates R from other languages that are popular in data science applications: NA values. I asked the speaker about that during the talk, only to find that he had anticipated this question and had devoted space in his slides to it. After covering that topic, he added that the question of how to handle NA values had caused considerable debate within the Julia team, and that the approach they settled on was something of a compromise.
Well, then, given this latest report on Julia (new releases coming soon), what is MY latest? How do I view it now?
As I’ve said here before, the fact that such an eminent researcher and R developer, Doug Bates of the University of Wisconsin, has shifted his efforts from R to Julia is enough for me to hold Julia in high regard, sight unseen. I had browsed through some Julia material in the past, and had seen enough to confirm that this is a language to be reckoned with. Today’s talk definitely raised my opinion of the language even further. But…
I am both a computer scientist and a statistician. Though only my early career was in a Department of Statistics (I was one of the founders of the UC Davis Stat. Dept.), I have done statistics throughout my career. And my hybrid status plays a key role in how I view Julia.
As a computer scientist, especially one who likes to view things at the systems level, I find Julia fabulous. But as a statistician, I see speed as only one of many crucial aspects of the software that I write and use. The role of NA values in R is indispensable, I say, not something to be compromised. And even more importantly, what I call the “helper” infrastructure of R is something I would be highly loath to part with, things like naming of vector elements and matrix rows for instance. Such things have led to elegant solutions to many problems in software that I write.
And though undoubtedly (and hopefully) more top statisticians like Doug Bates will become active Julia contributors, the salient fact about R, as I always say, is that R is written for statisticians by statisticians. It matters. I believe that R will remain the language of choice in statistics for a long time to come.
And so, though my hat is off to Viral Shah, I don’t think Julia is about to “go viral” in the stat world in the foreseeable future.