Extracting data from Twitter for @hrbrmstr’s #nom foodie images

January 15, 2018
By Jasmine Dumas

(This article was first published on Jasmine Dumas' R Blog, and kindly contributed to R-bloggers)

Bob Rudis (@hrbrmstr) is a famed expert, author, and developer in data security and the Chief Security Data Scientist at Rapid7. Bob also posts deliciously vivid images of his meals under the #nom hashtag. I’m going to use a method similar to the one in my previous projects (Hipster Veggies & Machine Learning Flashcards) to wrangle all those images into a nice collection, mostly for me to look at for inspiration in recipe planning.

Source Repository: jasdumas/bobs-noms

Analysis

library(rtweet) # devtools::install_github("mkearney/rtweet")
library(tidyverse)
library(dplyr)
library(stringr)
library(magick)
library(knitr)
library(kableExtra)
# get all of bob's recent tweets
bobs_tweets <- get_timeline(user = "hrbrmstr", n = 3200)

# keep only the #nom tweets that have an image attached
bobs_noms <- 
  bobs_tweets %>% dplyr::filter(str_detect(hashtags, "nom"), !is.na(media_url))
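
One caveat with this filter: `str_detect()` does a substring match, so a tag such as `economy` would also count as a nom. Below is a minimal sketch of an exact, case-insensitive tag match on a toy data frame. It assumes, as in rtweet releases from around the time of this post, that `hashtags` and `media_url` are list columns. The toy data and the `has_tag()` helper are hypothetical and not part of rtweet.

```r
# Toy stand-in for the timeline (made-up rows, not real tweets)
toy_tweets <- data.frame(text = c("a", "b", "c", "d"), stringsAsFactors = FALSE)
toy_tweets$hashtags <- list(c("nom", "joy"), "economy", NA_character_, c("rstats", "NOM"))
toy_tweets$media_url <- list("https://example.com/1.jpg", "https://example.com/2.jpg",
                             NA_character_, "https://example.com/4.jpg")

# Hypothetical helper: TRUE when a tweet carries exactly this tag (case-insensitive)
has_tag <- function(tags, tag) {
  vapply(tags, function(x) any(tolower(x) == tolower(tag), na.rm = TRUE), logical(1))
}

# Substring matching on the flattened tags, for comparison
flat_tags <- vapply(toy_tweets$hashtags, paste, character(1), collapse = " ")
substring_hits <- grepl("nom", flat_tags)

# Exact matching, plus a check that at least one media URL is present
exact_hits <- has_tag(toy_tweets$hashtags, "nom")
has_media <- vapply(toy_tweets$media_url, function(x) !all(is.na(x)), logical(1))

noms <- toy_tweets[exact_hits & has_media, ]
print(noms$text)
```

In this toy data the substring match picks up the `economy` row and misses the upper-case `NOM` row, while the exact match does the opposite.
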
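# start from the raw tweet text and strip it down to a short caption
bobs_noms$clean_text <- bobs_noms$text
# note: str_replace() removes only the first match in each tweet;
# str_replace_all() would strip every occurrence
bobs_noms$clean_text <- str_replace(bobs_noms$clean_text, "#[a-zA-Z0-9]{1,}", "") # remove the first hashtag
bobs_noms$clean_text <- str_replace(bobs_noms$clean_text, " ?(f|ht)(tp)(s?)(://)(.*)[.|/](.*)", "") # remove the url link
bobs_noms$clean_text <- str_replace(bobs_noms$clean_text, "[[:punct:]]", "") # remove the first punctuation character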
# let's look at these images in a smaller data set
bobs_noms_small <- bobs_noms %>% select(created_at, clean_text, media_url)
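
The cleaning calls above use `str_replace()`, which removes only the first match. That is why a caption such as `Ham amp; turkey frittata time!` in the table below still carries punctuation. Here is a small sketch comparing that behaviour with `str_replace_all()` on a made-up tweet. The sample text, the two helper functions, and the simplified URL pattern in `clean_all()` are illustrative assumptions, not the post's actual pipeline.

```r
library(stringr)

# Made-up tweet text for illustration (not taken from the real timeline)
tweet <- "Bone-in roast & potatoes! #nom #dinner https://t.co/abc123"

# First-match cleaning, mirroring the three str_replace() calls above
clean_first <- function(x) {
  x <- str_replace(x, "#[a-zA-Z0-9]{1,}", "")
  x <- str_replace(x, " ?(f|ht)(tp)(s?)(://)(.*)[.|/](.*)", "")
  str_replace(x, "[[:punct:]]", "")
}

# All-match cleaning: every hashtag, URL, and punctuation mark is removed,
# then runs of whitespace are collapsed with str_squish()
clean_all <- function(x) {
  x <- str_replace_all(x, "#[a-zA-Z0-9]+", "")
  x <- str_replace_all(x, " ?(f|ht)tps?://\\S+", "")  # simplified URL pattern (assumption)
  x <- str_replace_all(x, "[[:punct:]]", "")
  str_squish(x)
}

print(clean_first(tweet))
print(clean_all(tweet))
```

With the first-match version the second hashtag and most punctuation survive; the all-match version leaves only the words.
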

bobs_noms_small$img_md <- paste0("![", bobs_noms_small$clean_text, "](", bobs_noms_small$media_url, ")")
data.frame(images = bobs_noms_small$img_md) %>% 
  kable(format = "markdown") %>%
  kable_styling(full_width = FALSE, position = "center")

|images |
|:------|
|Moroccaninspired lamb meatballs prepped. Naan dough is kneading. Going to be a  sup tonight. |
|Tsukune with tare tonight |
|Lamb roast isnt too shabby either |
|The pain de mie thankfully came out well |
|Sage rosemary & espresso infused salt rubbed roast lamb. Goose fat roasted potatoes _almost _ done |
| |
|Ham amp; turkey frittata time! |
|Postconfit |
|PostPBC |
| |
| is home #2's Wedding Sunday. 20 ppl over tonight for ? #joy #nom |
|Definitely an Indonesian spring rolls kind of night |
|Homemade breadsticks for the homemade pasta and meatballs tonight |
| |
|Bonein PBC smoked pork roast |
|Prosciutto de Parma Cacio di Bosco & spinach omelettes this morning |
|Our Friday night is shaping up well How’s yours going? |
|Pork tenderloin on the PBC tonight |
|Overnight nutmeg-infused yeast waffles with sautéd local picked Maine apples & Maine maple syrup |
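
To see what the `img_md` column holds before the table is rendered, here is a small sketch on a toy data frame. The captions are borrowed from the table above; the `example.com` URLs are placeholders, not real media links.

```r
library(knitr)

# Toy rows: two captions from the table above with placeholder image URLs
toy <- data.frame(
  clean_text = c("Tsukune with tare tonight", "Postconfit"),
  media_url  = c("https://example.com/a.jpg", "https://example.com/b.jpg"),
  stringsAsFactors = FALSE
)

# Markdown image syntax is ![alt text](url); the caption becomes the alt text
toy$img_md <- paste0("![", toy$clean_text, "](", toy$media_url, ")")

# A one-column Markdown table with one image link per row
tbl <- kable(data.frame(images = toy$img_md), format = "markdown")
print(tbl)
```

When the images themselves are not rendered, only the alt text is shown, which is what the captions in the table above are.
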

# create a function to save these images!
save_image <- function(df){
  for (i in seq_len(nrow(df))){
    # try() keeps the loop going when a download fails
    image <- try(image_read(df$media_url[[i]]), silent = FALSE)
    if (!inherits(image, "try-error")){
      # name the file after the caption in df (not the global bobs_noms)
      image %>%
        image_scale("1200x700") %>%
        image_write(paste0("../post_data/data/", df$clean_text[i], ".jpg"))
    }
  }
  cat("saved images...\n")
}

save_image(bobs_noms)
## saved images...
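
Because `clean_text` doubles as the file name, an empty caption would be written as `.jpg`, and a caption containing a line break or slash may not be a valid path. One possible way around this is to slugify the caption and prefix the row index. The sketch below uses a hypothetical `make_file_name()` helper, which is not part of the original post or of magick.

```r
# Hypothetical helper: build a safe, unique file name from a caption and a row index
make_file_name <- function(text, index, ext = ".jpg") {
  slug <- tolower(text)
  slug <- gsub("[^a-z0-9]+", "-", slug)             # collapse anything non-alphanumeric
  slug <- gsub("^-+|-+$", "", slug)                 # trim leading/trailing dashes
  slug <- ifelse(nzchar(slug), slug, "untitled")    # fall back for empty captions
  sprintf("%03d-%s%s", index, substr(slug, 1, 60), ext)
}

captions <- c("Tsukune with tare tonight", "", " is home\n#2's Wedding Sunday.")
file_names <- make_file_name(captions, seq_along(captions))
print(file_names)
```

Inside `save_image()`, the `paste0()` call could then become `file.path("../post_data/data", make_file_name(df$clean_text[i], i))`.
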
