Articles by mjbommar

A quick look at #march11 / #saudi tweets

March 12, 2011 | mjbommar

Well, so much for that #march11 #Saudi day of rage. Whether it was really the "tempest in a teacup" that Prince Al-Waleed suggested on CNBC (video below, transcript here) or not, the oil complex and Saudi markets seem to have shrugged …

Dataset: Wisconsin Union Protester Tweets #wiunion

February 21, 2011 | mjbommar

I’ve been playing with Twitter data over the last week, archiving Algerian, Egyptian, Iranian, and Chinese tweets. I thought I’d bring the story a little closer to home this time by archiving tweets from Wisconsin Union protesters on the …

Dataset: Tweets from the Chinese Protests #cn220

February 20, 2011 | mjbommar

Earlier this week, I posted a ~100k-tweet dataset on the #25bahman protests in Iran. The corresponding frequency figure showed a strong presence on Twitter, with over 500 tweets per 5-minute period at peak. You can download the …

R Bloggers: The Site I Wish Existed in 2007

February 19, 2011 | mjbommar

My first experience with R was in 2007 as a sophomore in undergrad. As part of a larger project on pricing day-ahead electricity futures, I wanted to cluster locational marginal price (LMP) data from the ISO-NE. Something like k-means is easy …

Pre-processing text: R/tm vs. python/NLTK

February 16, 2011 | mjbommar

Let’s say that you want to take a set of documents and apply a computational linguistic technique. If your method is based on the bag-of-words model, you probably need to pre-process these documents first by segmenting, tokenizing, stripping, stopwording, and …
