Posts Tagged ‘Data’

Tumblr Likes

April 11, 2011

Look at just the first digit and the number of digits.
science: 32914, 11566, 4989, 3743, 968, 814, 673, 482, 286, 2811
black and white: 1694, 1167, 1108, 988, 919, 639, 596, 591, 580, 544
lol: 22627, 18100, 17688, 14374, 13459, 12045, 4711, 3779, 36...
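To make "first digit and number of digits" concrete, here is a minimal Python sketch (the blog itself works in R; Python is used here only for illustration). The function name `first_digit_and_length` and the `likes` dictionary are hypothetical names chosen for this example. The counts are the two complete lists quoted in the excerpt; the truncated "lol" list is left out.

```python
def first_digit_and_length(count):
    """Return (leading digit, number of digits) for a non-negative integer."""
    digits = str(abs(int(count)))
    return int(digits[0]), len(digits)


# Like counts copied from the excerpt (hypothetical variable name).
likes = {
    "science": [32914, 11566, 4989, 3743, 968, 814, 673, 482, 286, 2811],
    "black and white": [1694, 1167, 1108, 988, 919, 639, 596, 591, 580, 544],
}

# One (first digit, digit count) pair per like count, grouped by tag.
summary = {
    tag: [first_digit_and_length(c) for c in counts]
    for tag, counts in likes.items()
}

if __name__ == "__main__":
    for tag, pairs in summary.items():
        print(tag, pairs)
```

Reducing each count to this pair makes it easier to compare how quickly the numbers fall off within each tag.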

Fixing aberrant files using R and the shell: a case study

April 7, 2011

Once in a while, you embark on what looks like a simple computational procedure only to encounter frustration very early on. “I can’t even read my file into R!” you cry. Step back, take a deep breath and take note of what the software is trying to tell you. Most times, you’ve just missed something

Updated infochimps R package, includes several new APIs

March 21, 2011

Recently, the good folks at Infochimps.com rolled out a series of new APIs to add to their already impressive set of data resources. I have been in a perpetual state of catch-up since the new year, so I have only now got around to adding some of these new APIs to the infochimps R package. Here

More fun with sed

March 18, 2011

So I have this strange date and time string, which I would like to convert to a “useable” date, i.e., something that a spreadsheet programme or R can work with. It looks like this (MON has 3 chars): ddMONyr:hh:mm:ss The …
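The post does this conversion with sed. Purely as an illustration of the same rearrangement, here is a minimal Python sketch. It assumes that `yr` is a two-digit year and that the years fall in the 2000s; neither is stated in the excerpt, so the century is exposed as a parameter. The names `to_iso`, `PATTERN`, and `MONTHS` are hypothetical.

```python
import re

# Three-letter month abbreviations mapped to month numbers.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
         "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"],
        start=1,
    )
}

# ddMONyr:hh:mm:ss, assuming a two-digit day and a two-digit year.
PATTERN = re.compile(r"^(\d{2})([A-Za-z]{3})(\d{2}):(\d{2}):(\d{2}):(\d{2})$")


def to_iso(stamp, century=2000):
    """Convert 'ddMONyr:hh:mm:ss' to 'YYYY-MM-DD hh:mm:ss'.

    The century is an assumption; pass century=1900 for older data.
    """
    match = PATTERN.match(stamp.strip())
    if match is None:
        raise ValueError(f"unrecognised timestamp: {stamp!r}")
    dd, mon, yy, hh, mm, ss = match.groups()
    month = MONTHS[mon.upper()]  # raises KeyError for an unknown month
    return f"{century + int(yy):04d}-{month:02d}-{dd} {hh}:{mm}:{ss}"


if __name__ == "__main__":
    print(to_iso("18MAR11:09:05:30"))
```

The resulting `YYYY-MM-DD hh:mm:ss` form is one that spreadsheet programmes and R's `as.POSIXct` generally accept without extra format hints.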

Data from last post

March 1, 2011

Posting the code I used in the last post wasn't that useful unless I also posted the data set. Here's the data. These are made-up data, but it is a nice data set for illustrating how to conduct a regression. Enjoy!

How to read and write Stata data (.dta) files into R

February 24, 2011

Here's an R tutorial where I explain how to read Stata data files into R (even if you don't own the program Stata). I also offer some other basic tips. Of note, you can also write Stata .dta files from R (if your coauthors or journals insist on having ...

HRSA Area Resource File Format 2009

February 23, 2011

From the HRSA website: The Area Resource File (ARF) is a database containing more than 6,000 variables for each of the nation’s counties. ARF contains information on health facilities, health professions, measures of resource scarcity, health status, economic activity, health training programs, and socioeconomic and environmental characteristics. The data file itself is formatted accordingly (from the ARF

Dataset: Wisconsin Union Protester Tweets #wiunion

February 21, 2011

I’ve been playing with Twitter data over the last week, archiving Algerian, Egyptian, Iranian, and Chinese tweets. I thought I’d bring the story a little closer to home this time by archiving tweets from Wisconsin Union protesters on the …

Tracking the Frequency of Twitter Hashtags with R

February 21, 2011

I’ve posted three examples of Twitter hashtag datasets in the last week: one on China, one on Iran, and one on Algeria. In order to build these datasets, I needed to obtain older tweets; this is slightly more difficult than …

Dataset: Tweets from the Chinese Protests #cn220

February 20, 2011

Earlier this week, I posted a ~100k tweet dataset on the #25bahman protests in Iran. The corresponding figure of frequencies showed a strong presence on Twitter, with over 500 tweets per 5-minute period at peak. You can download the …