Some Code for Dumping Data from Twitter Gardenhose

[This article was first published on Byte Mining, and kindly contributed to R-bloggers.]

Gardenhose is a Streaming API feed that continuously sends a sample (roughly 15%, according to Ryan Sarver at the 140tc in September 2009) of all tweets to feed recipients. Below is some code for dumping the tweets to files named by date and hour. It is written in PHP, which is not my favorite language, but it works nonetheless. I received a few requests to post it, so here it is.

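<?php

// gardenhosedump.php
// Dumps tweets from the Gardenhose stream into files named by date and hour (YmdH.txt).
$username = '';
$password = '';

while (true) {
    $file = @fopen("http://" . $username . ":" . $password . "@stream.twitter.com/1/statuses/sample.json", "r");
    if ($file === false) {
        // Connection failed: pause before retrying instead of spinning.
        // (The 5-second delay is an arbitrary choice, not part of the original script.)
        sleep(5);
        continue;
    }

    $file2 = null;    // current output file, opened when the first line arrives
    $newTime = null;  // hour stamp of the file that is currently open

    while (($data = fgets($file)) !== false) {
        $time = date("YmdH");
        if ($newTime !== $time) {
            // First line, or the hour has rolled over: switch output files.
            if (is_resource($file2)) {
                fclose($file2);
            }
            $file2 = fopen("{$time}.txt", "a");
            $newTime = $time;
        }
        fputs($file2, $data);
    }

    // The stream ended or dropped: close whatever is open, then reconnect.
    fclose($file);
    if (is_resource($file2)) {
        fclose($file2);
    }
}
?>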
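
Once the script above has been running for a while, the hourly files can be read back for analysis. The following is only a sketch, not part of the original script: the helper names `dumpFileName` and `readDumpFile` are hypothetical, and it assumes that each non-blank line in a dump file holds one JSON object, skipping anything that does not parse.

```php
<?php
// Hypothetical helpers for working with the hourly dump files.

// Build the file name the dump script would use for a given Unix timestamp,
// following its date("YmdH") naming pattern (local time zone).
function dumpFileName(int $timestamp): string
{
    return date("YmdH", $timestamp) . ".txt";
}

// Read one hourly dump file and return the decoded records as arrays.
// Assumption: one JSON object per line; blank and unparseable lines are skipped.
function readDumpFile(string $path): array
{
    $records = [];
    if (!is_readable($path)) {
        return $records; // missing or unreadable file: nothing to return
    }
    $handle = fopen($path, "r");
    if ($handle === false) {
        return $records;
    }
    while (($line = fgets($handle)) !== false) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        $decoded = json_decode($line, true);
        if (is_array($decoded)) {
            $records[] = $decoded; // keep only lines that decode to a JSON object or array
        }
    }
    fclose($handle);
    return $records;
}
```

For example, `count(readDumpFile(dumpFileName(time())))` would give the number of records collected so far in the current hour, provided it is run from the directory the dump script writes to.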
