Faces of #rstats Twitter

[This article was first published on Maëlle, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

This week I was impressed by this tweet where Daniel Pett, Digital Humanities Lead at the British Museum, presented a collage of Twitter profile pics of all his colleagues. He made this piece of art using R (for collecting the usernames) and Python. I’m a bit of an R fanatic (or a Python dummy…) so I decided to write a code in R only to make a collage of profile pics of Twitter #rstats users.

Note: if you’re interested in Daniel Pett’s code, the data gathering is here and the montage code is here.

Getting the users

Once again I’ll use rtweet! In case you’ve been using twitteR, note that twitteR maintainer is currently passing the torch to rtweet so you’d better switch packages.

library("rtweet")
library("dplyr")
library("magick")
library("httr")
library("stringr")
library("purrr")
users <- search_users(q= '#rstats',
                      n = 1000,
                      parse = TRUE)
users <- unique(users)

I think that doing this I got only profiles whose description includes “rstats” or “#rstats”. Moreover, although I got 1,000 results, many of them were duplicates, so in the end I had about 450 #rstats Twitter users.

Gathering profile pics

This is the part where I felt slightly creepy downloading pictures of so many people.

I first had to create the link to each person’s profile page.

users <- mutate(users,
                where = paste0("https://twitter.com/",
                               screen_name))
                               

Then I wrote a function for getting the profile pic for each user. I worked for everyone except 1) eggs and 2) a person whose profile pic had no extension, so I was quite satisfied with the result.

Update: Daniel Pett told me I could have used the column “profile_image_url” of the data.frame obtained from rtweet::search_users… I had completely missed that column! Thanks a lot!

get_piclink <- function(df){
  content <- httr::GET(df$where)
  content <- httr::content(content, as = "text")
  image <- str_extract(content,
                       "class=\"ProfileAvatar-image \" src=\"https://pbs.twimg.com/profile_images/.*\\..*\" alt")
  image <- str_replace(image, "class=\"ProfileAvatar-image \" src=\"", "")
  image <- str_replace(image, "..alt", "")
  return(image)
}

users <- by_row(users, get_piclink,
                .to = "piclink", .collate = "cols")

readr::write_csv(users, path = "users.csv")

After getting the links I could start saving the pictures, in a small 50px x 50px format. At that point I started using the magick package, a real game changer for image manipulation in R!

save_image <- function(df){
  image <- try(image_read(df$piclink), silent = TRUE)
  if(class(image)[1] != "try-error"){
    image %>%
      image_scale("50x50") %>%
      image_write(paste0("data/2017-03-19-facesofr_users_images/", df$screen_name,".jpg"))
  }
  
}

users <- filter(users, !is.na(piclink))
users <- split(users, 1:nrow(users))
walk(users, save_image)

I got 448 images.

Preparing the collage itself

I want to take a minute thinking about myself in elementary school. I am left-handed and I had scissors for right-handed people that I used with my left hand because no one in my family or school knew any better. I was therefore very bad at collages because of the cutting part. Had I known that I’d do collages with a computer as a grown-up, I would have been less sad to have to skip the break to keep on slowly cutting pieces of paper! (Note: I actually had a very happy childhood.)

Back to R talk now! For doing the collage I had to know how many columns and rows I wanted. For this I factorized the number of pics I had. Note that I also re-sampled the images, because I didn’t want all the R-Ladies chapters and their beautiful purple logo to end up near each other in the collage.

files <- dir("data/2017-03-19-facesofr_users_images/", full.names = TRUE)
set.seed(1)
files <- sample(files, length(files))
gmp::factorize(length(files))
## Big Integer ('bigz') object of length 7:
## [1] 2 2 2 2 2 2 7

Based on this I chose this format:

no_rows <- 28
no_cols <- 16

In magick there’s a function for appending images, image_append, from which you get either a row or a column depending on the stack argument. I first created all columns.

make_column <- function(i, files, no_rows){
  image_read(files[(i*no_rows+1):((i+1)*no_rows)]) %>%
  image_append(stack = TRUE) %>%
    image_write(paste0("cols/", i, ".jpg"))
}

walk(0:(no_cols-1), make_column, files = files,
    no_rows = no_rows)

And finally, I appended the columns!

image_read(dir("cols/", full.names = TRUE)) %>%
image_append(stack = FALSE) %>%
  image_write("2017-03-19-facesofr.jpg")

The end

Here is the result, where many, many R users are most certainly missing. Besides, can we talk about these accounts with the old R logo? They make me feel old, while the R-Ladies logo make me happy. Too bad Forwards wasn’t included, because their logo is really pretty.

Now you can try doing the same with your followers, the people you follow, or making a doormat with base plot advocates.

To leave a comment for the author, please follow the link and comment on their blog: Maëlle.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)