Julia, I Love You

March 31, 2012
By

(This article was first published on John Myles White » Statistics, and kindly contributed to R-bloggers)

Julia is a new language for scientific computing that is winning praise from a slew of very smart people, including Harlan Harris, Chris Fonnesbeck, Douglas Bates, Vince Buffalo and Shane Conway. As a language, it has lofty design goals, which, if attained, will make it noticeably superior to Matlab, R and Python for scientific programming. In the core development team’s own words:

We want a language that’s open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.

(Did we mention it should be as fast as C?)

Remarkably, Julia seems to be on its way to meeting those goals. Last night, I decided to see for myself whether Julia would live up to the hype. So I taught myself just enough of the language to write an implementation of the slowest R code I’ve ever written: the Metropolis algorithm-style sampler Drew and I use in Chapter 7 of Machine Learning for Hackers to show off randomized, iterative optimization algorithms. You can find both the original R code and my new Julia code on GitHub in two files name cipher.R and cipher.jl, respectively.

In my opinion, the new code in Julia is easier to read than the R code because Julia has fewer syntactic quirks than R. More importantly, the Julia code runs much faster than the R code without any real effort put into speed optimization. For the sample text I tried to decipher, the Julia code completes 50,000 iterations of the sampler in 51 seconds, while the R code completes the same 50,000 iterations in 67 minutes — making the R code more than 75 slower than the Julia code.

Having seen that example alone, I would be convinced Julia is a real contender for the future of scientific computing. But this iterative sampling algorithm is not close to being the harshest comparison between Julia and R on my machine. For a more powerful example (lifted straight from the Julia docs), we can compare Julia and R code for computing the 25th Fibonacci number recursively.

First, the Julia code:

1
2
fib(n) = n < 2 ? n : fib(n - 1) + fib(n - 2)
@elapsed fib(25)

Second, the R code:

1
2
3
4
5
6
7
8
9
fib <- function(n)
{
  ifelse(n < 2, n, fib(n - 1) + fib(n - 2))
}
 
start <- Sys.time()
fib(25)
end <- Sys.time()
end - start

The Julia code takes around 8 milliseconds to complete, whereas the R code takes around 4000 milliseconds. In this case, R is 500 times slower than Julia. To me, that’s sufficient reason to want to start focusing my time on implementing the algorithms I care about in Julia. I hope others will consider doing the same.

To leave a comment for the author, please follow the link and comment on his blog: John Myles White » Statistics.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags:

Comments are closed.