Blog Archives

R at 12,000 Cores

October 16, 2012

I am very happy to introduce a new set of packages that has just hit the CRAN. We are calling it the Programming with Big Data in R Project, or pbdR for short (or as I like to jokingly refer to it, 'pretty bad for dyslexics'). You can find out more about the pbdR project at http://r-pbd.org/. The packages are...

Read more »

Some Quirks of the R Language

August 14, 2012

R is my favorite programming language. It's just so useful for getting work done. Sometimes people will complain that R is a difficult language. To me, this raises the questions: difficult for what? And for whom? I personally think R is just about the easiest thing in the world for prototyping. Meaning if you want to quickly crank out...

Read more »

Autoplot: Graphical Methods with ggplot2

June 11, 2012

Background: As of ggplot2 0.9.0, released in March 2012, there is a new generic function, autoplot. This uses R's S3 methods (essentially OOP for babies) to let you have some simple overloading of functions. I'm not going to get deep into OOP, because honestly we don't need to. The idea is very simple. If I say "I'm...

Read more »

Visualizing the CRAN: Graphing Package Dependencies

May 17, 2012

I had been meaning to start toying with the igraph package for a while. So a few weeks ago (lay off, I'm busy), I decided to grab a bunch of CRAN data about package dependencies. The easiest way that I could think of to get this information was to just grab the HTML files for all the package descriptions and...

Read more »

Project Euler…in LaTeX?

April 23, 2012

I've been joking for a while now that I was going to start solving Project Euler problems in LaTeX. Then today I finally did one. So let's talk about solving Project Euler problem number 1 (the easy one) using only LaTeX. The problem asks you to sum up all the positive integers below 1000 which are divisible by 3...

Read more »

Statistical Software Popularity on Google Scholar

April 12, 2012

Background (probably boring): Several months ago, my boss and I were discussing how he got the data for his software popularity article; the rest of the background discussion pertains to those plots, so I would recommend going over to take a look before continuing on (or just skip to the next section if you're impatient). Specifically, we were talking...

Read more »

A No BS Guide to the Basics of Parallelization in R

March 15, 2012

What is parallelization? Parallelization is using multiple processing cores to, hopefully, make your programs run faster than serial code, which uses just one processing core. Parallel code is not always faster than its serial counterpart (but if you're doing it right and you're careful about what you parallelize, it will be; remember, that's your goal here). ...

Read more »

Sorting in R as Inefficiently as Possible

January 12, 2012

My last post of substance was all about improving your performance using R to answer programming questions that might be asked during a job interview.  So let's say you nailed the interview and got the job, but you desperately want to be fired for grand incompetence.  Never fear, your pal at librestats once again has your back. The sleep...

Read more »

Honing Your R Skills for Job Interviews

January 9, 2012

My time as a grad student will soon draw to a close. With this comes the terrifying realisation that I'm going to start applying for jobs and, hopefully, interviewing soon, forever leaving my comfortable security blanket of academia. With that horrible thought in mind, I've been doing some poking around to see what various kinds of technical interviews are...

Read more »

R Fork Bomb

September 14, 2011

So maybe I’m a strange guy, but I think fork bombs are really funny.  What’s a fork bomb?  The basic premise is that you spawn a process that spawns a process that spawns a process…, ad infinitum. The most beautiful example of a fork bomb, and really one of the most beautiful lines of code

Read more »