Extracting data from Twitter for @hrbrmstr’s #nom foodie images

[This article was first published on Jasmine Dumas' R Blog, and kindly contributed to R-bloggers.]

Bob Rudis (@hrbrmstr) is a famed expert, author, and developer in data security and the Chief Security Data Scientist at Rapid7. Bob also creates the most deliciously vivid images of his meals, documented under the #nom hashtag. I’m going to use a method similar to the one from my previous projects (Hipster Veggies & Machine Learning Flashcards) to wrangle all those images into a nice collection, mostly for me to look at for inspiration when planning recipes.

Source Repository: jasdumas/bobs-noms

Analysis

library(rtweet) # devtools::install_github("mkearney/rtweet")
library(tidyverse)
library(dplyr)
library(stringr)
library(magick)
library(knitr)
library(kableExtra)

# get Bob's most recent tweets (up to 3,200)
bobs_tweets <- get_timeline(user = "hrbrmstr", n = 3200)

# keep only the #nom tweets that have an image attached
bobs_noms <-
  bobs_tweets %>% dplyr::filter(str_detect(hashtags, "nom"), !is.na(media_url))

# note: str_replace() only replaces the first match in each tweet
bobs_noms$clean_text <- bobs_noms$text
bobs_noms$clean_text <- str_replace(bobs_noms$clean_text, "#[a-zA-Z0-9]{1,}", "") # remove the first hashtag
bobs_noms$clean_text <- str_replace(bobs_noms$clean_text, " ?(f|ht)(tp)(s?)(://)(.*)[.|/](.*)", "") # remove the url link
bobs_noms$clean_text <- str_replace(bobs_noms$clean_text, "[[:punct:]]", "") # remove the first punctuation mark

# let's look at these images in a smaller data set
bobs_noms_small <- bobs_noms %>% select(created_at, clean_text, media_url)

# build a markdown image link for each tweet
bobs_noms_small$img_md <- paste0("![", bobs_noms_small$clean_text, "](", bobs_noms_small$media_url, ")")

data.frame(images = bobs_noms_small$img_md) %>%
  kable(format = "markdown") %>%
  kable_styling(full_width = FALSE, position = 'center')

|images |
|:------|
|Moroccaninspired lamb meatballs prepped. Naan dough is kneading. Going to be a  sup tonight. |
|Tsukune with tare tonight |
|Lamb roast isnt too shabby either |
|The pain de mie thankfully came out well |
|Sage rosemary & espresso infused salt rubbed roast lamb. Goose fat roasted potatoes _almost _ done |
| |
|Ham amp; turkey frittata time! |
|Postconfit |
|PostPBC |
| |
| is home #2's Wedding Sunday. 20 ppl over tonight for ? #joy #nom |
|Definitely an Indonesian spring rolls kind of night |
|Homemade breadsticks for the homemade pasta and meatballs tonight |
| |
|Bonein PBC smoked pork roast |
|Prosciutto de Parma Cacio di Bosco & spinach omelettes this morning |
|Our Friday night is shaping up well How’s yours going? |
|Pork tenderloin on the PBC tonight |
|Overnight nutmeg-infused yeast waffles with sautéd local picked Maine apples & Maine maple syrup |
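
Several captions in the table still contain periods and trailing hashtags. That is expected, because `str_replace()` rewrites only the first match in each string. Here is a minimal sketch of the difference. It uses base R's `sub()` (first match) and `gsub()` (every match) as stand-ins for `str_replace()` and `str_replace_all()`, so it runs without extra packages. The sample tweet text is made up for illustration.

```r
# Hypothetical tweet text, made up for illustration only
tweet <- "Slow-roasted lamb, naan & rice. #nom #joy"

# First match only (the behaviour of str_replace()):
# one hashtag and one punctuation mark are removed, the rest stay
first_only <- sub("#[a-zA-Z0-9]+", "", tweet)
first_only <- sub("[[:punct:]]", "", first_only)

# Every match (the behaviour of str_replace_all())
all_matches <- gsub("#[a-zA-Z0-9]+", "", tweet)
all_matches <- gsub("[[:punct:]]", "", all_matches)
# collapse the leftover runs of spaces and trim the ends
all_matches <- trimws(gsub("\\s+", " ", all_matches))

print(first_only)
print(all_matches)
```

If fully stripped captions are wanted, swapping `str_replace()` for `str_replace_all()` in the cleaning step should give the second behaviour, at the cost of also removing punctuation that may be worth keeping, such as apostrophes.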

# create a function to save these images!
save_image <- function(df){
  for (i in seq_len(nrow(df))){
    # try() lets the loop continue when an image fails to download
    image <- try(image_read(df$media_url[[i]]), silent = FALSE)
    if (!inherits(image, "try-error")){
      image %>%
        image_scale("1200x700") %>%
        # name each file after the cleaned text of the tweet in `df`
        image_write(paste0("../post_data/data/", df$clean_text[i], ".jpg"))
    }
  }
  cat("saved images...\n")
}

save_image(bobs_noms)
## saved images...
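
`save_image()` uses the cleaned tweet text directly as the file name, so text that is empty, repeated, or contains a `/` could lead to a missing, overwritten, or invalid file. One possible safeguard is sketched below with a hypothetical helper, `safe_filename()`, which is not part of the original post. It relies only on base R, and the sample captions are illustrative.

```r
# Hypothetical helper (not from the original post): turn free text into
# file-system-friendly, unique .jpg file names
safe_filename <- function(text, max_chars = 60) {
  name <- gsub("[^A-Za-z0-9]+", "_", text)  # runs of other characters become "_"
  name <- gsub("^_+|_+$", "", name)         # trim leading and trailing "_"
  name <- substr(name, 1, max_chars)        # keep names reasonably short
  empty <- is.na(name) | name == ""
  name[empty] <- "image"                    # fallback when no text is left
  paste0(make.unique(name, sep = "_"), ".jpg")
}

# Illustrative captions, including two empty ones and one with a slash
captions <- c("Tsukune with tare tonight", "", "Post/confit", "")
print(safe_filename(captions))
```

Inside `save_image()`, the names could be computed once before the loop, for example `file_names <- safe_filename(df$clean_text)`, and then passed to `image_write()` via `file.path("../post_data/data", file_names[i])`.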
