2125 search results for "Twitter"

An analysis of the Stackoverflow Beta sites

November 1, 2010

In the last six months or so, the behemoth of Q&A sites, Stack Overflow, decided to change tack and launch a number of other non-computing-language sites. To launch a site in the Stack Overflow family, proposed sites have to spend time gathering followers in Area 51. Once a site has gained a critical mass, a new StackExchange

Read more »

ABC lectures [finale]

October 31, 2010

The latest version of my ABC slides is on slideshare. To conclude with a pun, I took advantage of the newspaper clipping generator once pointed out by Andrew. (Note that nothing written in the above should be taken seriously.) On the serious side, I managed to cover most of the 300 slides (!) over the

Read more »

Presenting Immer’s barley data

October 31, 2010

Last time I talked about adapting graphs for presentations. This time I’m putting some of the concepts I discussed there into action, with a presentation of Immer’s barley dataset. This is a classic dataset, originally published in 1934; in 1993 Bill Cleveland mentioned it in his book Visualizing Data on account of how it may

Read more »

Errors in Ghcn Inventories

October 30, 2010

In the debate over the accuracy of the global temperature, nothing is more evident than errors in the location data for stations in the GHCN inventory. That inventory is the primary source for all the temperature series. One question is “do these mistakes make a difference?” If one believes, as I do, that the record

Read more »

A question from the R list

October 30, 2010

I am currently working on rectifying the GHCN station list to improve the location information. It's the kind of database work that is mind-numbingly tedious and a PITA in R, not because R lacks capabilities; it's just tough and not very sexy to do matching and fuzzy matching and grepping and blah blah blah. Instead,

Read more »

Findings increasingly novel, scientists say…

October 29, 2010

…was the tongue-in-cheek title of an image that I posted to Twitpic this week. It shows the usage of the word “novel” in PubMed article titles over time. As someone correctly pointed out at FriendFeed, it needs to be corrected for total publications per year. It was inspired by a couple of items that caught

Read more »

Adapting graphs for presentations

October 28, 2010

I’ve just finished reading slide:ology by Nancy Duarte. It contains lots of advice about how to convey meaning through aesthetics. The book has a general/business presentation focus, but it got me wondering about how to apply the ideas in a scientific context. Since graphs form a big part of most scientific talks, and since that’s

Read more »

Random generators for parallel processing

October 28, 2010

Given the growing interest in parallel processing through GPUs or multiple processors, there is a clear need for a proper use of (uniform) random number generators in this environment. We were discussing the issue yesterday with Jean-Michel Marin and briefly looked at a few solutions: given p parallel streams/threads/processors, starting each generator with a random

Read more »

Where People Share Links About NYC

October 27, 2010

Last week I participated in bit.ly’s fourth hackabit hack-a-thon, which is a wonderful opportunity for NYC area hackers to get together, eat pizza, drink energy drinks, and stay up late hacking with some of the best data geeks around. I was lucky enough to saddle up next to Hilary Mason, bit.ly’s lead scientist, recently named

Read more »

Parametric Bootstrap Power Analysis of GISS Temp Data

October 24, 2010

Previously, I calculated a bunch of ad hoc power curves from GISTEMP data. Power is essentially a reframing of the p-value, to see the significance of the trend lines in the global temps. However, power calculations are inherently very noisy; hence my ad hoc way of aggregating the data. Another method is to bootstrap through the responses

Read more »