# An updated look at the #code2013 language rankings

A few weeks ago I compared the #code2013 rankings from Twitter to TIOBE's rankings. When I collected the #code2013 data, people were still chiming in, albeit at a slowing pace. As I visually scanned the new tweets, it seemed like there was a huge increase in Delphi & Object Pascal mentions compared to the data I had collected previously, and it made me curious whether this was a real effect or just coincidence. Luckily I had continued to collect the #code2013 data after I made that post, so I had an opportunity to find out: I now had 6028 tweets, 1404 more than last time.

At the same time, I commented in my original post that I was unhappy with the mechanism I used to strip manual retweets (i.e. manually adding *RT* instead of using a built-in retweet), as I had removed any tweet from the data which contained an *RT*. Because people often add commentary to the left of the *RT*, I created a new function which keeps anything to the left of the *RT* (as well as *MT*), which should leave more usable data. This code now appears in the GitHub version of twitteR as the function *strip_retweets()*. Unfortunately, this didn’t make much of a difference: applying the new function to the original data set only gave me 23 more tweets' worth of data. Oh well, it was the thought that counted.
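To make the idea concrete, here is a minimal Python sketch of "keep whatever sits to the left of the *RT* or *MT*". It is not the twitteR implementation of *strip_retweets()*; the function name `strip_manual_retweet`, the regular expression, and the sample tweets are all made up for illustration.

```python
import re

# Assumption: a manual retweet marker is a standalone, upper-case "RT" or "MT"
# token at the start of the tweet or preceded by whitespace.
_MARKER = re.compile(r"(?:^|\s)(?:RT|MT)\b")


def strip_manual_retweet(text):
    """Return the commentary left of the first RT/MT marker, or None if nothing is left."""
    match = _MARKER.search(text)
    if match is None:
        # No manual retweet marker: keep the tweet as it is.
        return text.strip()
    kept = text[:match.start()].strip()
    return kept or None


# Invented sample tweets, not taken from the real #code2013 data.
tweets = [
    "Delphi, C#, SQL #code2013",
    "Same here, plus Scala! RT @someone: Java, Python #code2013",
    "RT @someone: Java, Python #code2013",
]

# Drop tweets that are nothing but a manual retweet; keep added commentary.
cleaned = [t for t in (strip_manual_retweet(t) for t in tweets) if t]
```

In this sketch the second tweet survives as just its commentary, while the third is dropped entirely. That matches the small gain I saw: a tweet only comes back if someone bothered to write something in front of the marker.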

I processed the new dataset the same way as the previous batch (all code included as a single gist below), and sure enough there was a large skew toward Delphi & Pascal in this batch. Note that I had tried to morph any usage of “object pascal” into a single “delphi/object pascal” entry, but presumably most people mentioning “pascal” mean Delphi.
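The merging step described above could look roughly like the following Python sketch. It is not the code from the gist; the alias table, the helper names, and the sample input are assumptions. Note that plain “pascal” is deliberately left as its own entry here, since folding it into Delphi would be a guess about what people meant.

```python
from collections import Counter

# Hypothetical alias table: spellings on the left are counted under the label
# on the right. Adding "pascal" here would be a further assumption.
ALIASES = {
    "delphi": "delphi/object pascal",
    "object pascal": "delphi/object pascal",
}


def canonical_name(name):
    """Lower-case a language name and map known aliases to one label."""
    key = name.strip().lower()
    return ALIASES.get(key, key)


def count_languages(tweet_languages):
    """Count each language at most once per tweet, after merging aliases."""
    counts = Counter()
    for languages in tweet_languages:
        # A set ensures "Delphi, Object Pascal" in one tweet counts once.
        counts.update({canonical_name(name) for name in languages})
    return counts


# Invented, already-parsed tweets (one list of language names per tweet).
parsed = [
    ["Delphi", "SQL"],
    ["Object Pascal", "delphi"],
    ["pascal", "C#"],
]
counts = count_languages(parsed)
```

With this toy input the merged entry is counted twice (once per tweet that mentions it) and “pascal” once, which shows how an unmerged alias can split what is probably a single language across two rows of the ranking.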

Despite the inclusion of about 30% more data, the results are very similar. So what happens if we look at the updated data against the TIOBE data, as I did the first time?

Sure enough, when visually compared to the original, the Pascal entries gained quite a lot (bouncing one of my favorites, Scala, down a tier). There were some other changes, most notably ABAP & C# gaining while Fortran lost, but only ABAP had a very noticeable gain.

What happens if we only look at the new tweets against the TIOBE rankings? How much of a skew would Delphi show now?

As expected, Delphi took a huge leap forward. Also as expected, some of the fringe languages fell off this plot, which makes sense: we have about a third of the data, so there are fewer opportunities to make the grade. You can also see some languages like R (another favorite) and ObjC dropping, while others like Haskell and Matlab gained.
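One way to put numbers on "who gained and who dropped" between the two batches is to compare each language's share of mentions rather than its raw count, since the new batch is much smaller. The Python sketch below assumes per-language counts are already available for each batch; the function names and the counts are invented for illustration and are not the real #code2013 figures.

```python
def shares(counts):
    """Convert raw mention counts into percentage shares of all mentions."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {lang: 100.0 * n / total for lang, n in counts.items()}


def share_shift(old_counts, new_counts):
    """Change in share (percentage points) from the old batch to the new one."""
    old, new = shares(old_counts), shares(new_counts)
    # A language missing from one batch is treated as having a 0% share there.
    return {
        lang: new.get(lang, 0.0) - old.get(lang, 0.0)
        for lang in set(old) | set(new)
    }


# Toy counts, made up purely to exercise the functions.
old_batch = {"java": 50, "delphi": 10, "r": 40}
new_batch = {"java": 40, "delphi": 50, "r": 10}

shift = share_shift(old_batch, new_batch)
# Sort so the biggest gainers come first.
ranked = sorted(shift.items(), key=lambda item: item[1], reverse=True)
```

Comparing shares keeps the smaller batch from looking like an across-the-board drop, though with so few tweets the fringe languages will still bounce around a lot.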

So what happened? It seems reasonable to me to expect a fairly steady distribution over time, but clearly the social aspect of Twitter is affecting things, causing viral gains and losses.

