Probot: building a Mastodon bot

[This article was first published on Rstats – quantixed, and kindly contributed to R-bloggers.]

I have long admired albums2hear, a Twitter bot that posts albums. You can read a bit more about it here. There was no Mastodon equivalent, so I decided to build one.

You can follow the bot – currently called Albums Albums Albums (or AlbumsX3) – here.

Idea behind the bot

The idea is to periodically post an album. The toot is simply the artist, title and year, plus the cover, and… that’s it! People can leave a comment if they love the album, discover something new or whatever. The aim is just to put an album suggestion into people’s feeds so that they might be inspired to hear something new or revisit a classic album.

I wanted it to post albums that I would recommend. So this is where the build starts…

Use R to make a list of recommended albums

I have a method for importing my iTunes/Music library as XML into R (I plan to write this up in the future). From this import, I reasoned that I could grab the albums that I have listened to more than once, and that would do as a recommendation. There were about 1.4K albums with mean plays greater than 1. I filtered out singles, EPs, compilations, various artists, bootlegs, ROIO and unofficial stuff. This left me with a list of about 1K albums, in a data frame called album.

From here, making a data frame of Artist, Album and Year is easy, but I also needed to get the album artwork. I used the following to find the location of the first track from each album on my server, and then create a unique/safe name for each image file.

# get a list of files, one from each album. We'll take the first.
first_file <- cbind(album, filepath = album_tracks[match(album$Key, album_tracks$Key),"Location"])
# in the data frame filepath has %20s etc.
library(urltools)
# change the url/uri of filepath into a real filepath
file_list <- url_decode(first_file$filepath)
# remove the preceding file:// to leave `/share/name/path`
file_list <- gsub("file\\:\\/\\/","",file_list)
# append column to data frame
first_file$file_list <- file_list
# for the name of the image that we will extract from the first file, we need to use a safe name
# `Key` is a paste of artist and album. It will be unique but let's make safe for command line.

makeSafeFileName <- function(x)  {
  y <- gsub("[[:alpha:]]","",x)
  strlen <- nchar(y)
  if(strlen > 0) {
    for(i in 1:strlen) {
      chrToReplace <- substr(y,i,i)
      x <- sub(chrToReplace,"",x, fixed = TRUE)
    }
  }
  # very long strings should be truncated
  if(nchar(x) > 28) {
    x <- substr(x,1,28)
  }
  return(paste0(x,".jpg"))
}

first_file$img_name <- sapply(X = first_file$Key,FUN = makeSafeFileName)

img_df <- data.frame(file_list = file_list,
                     img_name = first_file$img_name)
# write data to file - we will use this to extract the artwork
write.csv(img_df,"Output/Data/filelist.txt", row.names = FALSE)

Now I had a list of one track from each album and a corresponding image file name. I also made a CSV for the bot to use when composing the toots.

# now make data frame for bot
bot_df <- data.frame(artist = first_file$TheArtist,
                     album = first_file$Album,
                     year = first_file$Year,
                     img_name = first_file$img_name)
write.csv(bot_df,"Output/Data/bot_df.csv", row.names = FALSE)

Extracting the artwork

All of my music files have the artwork embedded and it is possible to retrieve this using ffmpeg. However, my shell scripting game is a bit weak, so I simply edited the filelist.txt file so that each pair of file_list and img_name became the two arguments in:

ffmpeg -i input.mp3 -map 0:1 output.jpg

and processed the whole thing as a huge multiline command.
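
For anyone who would rather not edit the file by hand, the same step could be scripted. The sketch below is not what I actually ran: it assumes filelist.txt still has the file_list and img_name header written by write.csv, that ffmpeg is on the PATH, and that the images should go into a folder called img (the img_dir parameter is my own invention).

```python
import csv
import os
import subprocess


def build_commands(csv_path, img_dir="img"):
    """Return one ffmpeg argument list per row of filelist.txt."""
    commands = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            output = os.path.join(img_dir, row["img_name"])
            # same arguments as the one-liner above: stream 0:1 is the artwork
            commands.append(["ffmpeg", "-i", row["file_list"], "-map", "0:1", output])
    return commands


def run_all(commands):
    """Run each command in turn, carrying on if one of them fails."""
    for cmd in commands:
        # check=False so a track without embedded artwork does not stop the batch
        subprocess.run(cmd, check=False)
```

Passing each command as a list of arguments avoids having to quote spaces and other awkward characters in the music file paths.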

Now I had bot_df.csv and a folder full of images. Time to build the bot!

Setting up a Mastodon bot

There’s a great, simple guide by Terence Eden which I followed. It’s from 2018 but still works as described. Briefly, I signed up for a botsin.space account and set up an app. The account approval took a few days (it was the weekend) but otherwise this was straightforward. I generated some artwork for the banner and avatar and was ready to start posting.

Python script for posting

The csv of the data frame and this script are in a directory, with a subdirectory called img.

Below is the script (modified to remove the token) I am using. It selects a random row to post. This means that duplicate posts will happen, but I didn’t try too hard to find a way around this. I might revisit it in the future. The script also reads in the whole data frame each time it runs; again, I couldn’t think of a better way to do this. The bot takes about 4 s to generate a post, but that’s fine with only 4 posts a day.
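
One possible way around the duplicate posts, which I have not implemented, is to remember which albums have already been posted and only pick from the rest. The sketch below assumes the img_name column is unique per album and keeps the history in a JSON file; the function names and the history file are hypothetical.

```python
import json
import os
import random


def load_posted(path):
    """Read the set of already-posted image names (empty if no history yet)."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as handle:
        return set(json.load(handle))


def save_posted(path, posted):
    """Write the history back to disk as a sorted JSON list."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(sorted(posted), handle)


def pick_unposted(img_names, posted, rng=random):
    """Pick an album not yet posted; start over once all have been used."""
    remaining = [name for name in img_names if name not in posted]
    if not remaining:
        posted.clear()
        remaining = list(img_names)
    choice = rng.choice(remaining)
    posted.add(choice)
    return choice
```

With about 1K albums and 4 posts a day, a scheme like this would go roughly 250 days before any album repeats.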

import pandas as pd
from mastodon import Mastodon
import os

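# absolute path to the data, built from this script's location
# (note: '..' points at the parent of the script's directory)
dfFile = os.path.realpath(os.path.join(os.path.dirname(__file__), '..', 'bot_df.csv'))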

# import the data into data frame
df = pd.read_csv(dfFile, sep=",")

# select a random row of the data frame (as a Series)
theRow = df.sample().iloc[0]

# build the text string - this will be the message in the post.
textString = " - ".join([theRow['artist'], theRow['album'], str(theRow['year'])])
# add hashtags
textString = textString + "\n#Music #AlbumSuggestions #NowPlaying"

# build image path
imgPath = os.path.realpath(os.path.join(os.path.dirname(__file__), 'img', theRow['img_name']))

# write apologetic alt text
altText = "The image shows the album cover. Sorry for lack of a better description; I am just a bot!"

# Set up Mastodon
mastodon = Mastodon(
    access_token = 'foobar',
    api_base_url = 'https://botsin.space/'
)

media = mastodon.media_post(imgPath, "image/jpeg", description=altText)
mastodon.status_post(textString, media_ids=media)

And that’s it! I tested that the bot worked on my Mac, but ultimately it is running on a Raspberry Pi Zero that also doubles up as a weather station.

Move everything to the Pi

I zipped everything and, using SFTP, copied it over to the Pi and extracted it on the other side. I ran the script to check that it posted to Mastodon and all was good.

I am using cron to trigger the Python script to generate a post. I am posting four times during the (UK) day, so it was a straightforward matter of adding four lines to crontab. My initial tests of the bot, triggering it from the command line, all worked fine, both on the Mac and on the Pi. However, my first version of the script used relative paths, and when run from cron the script failed. This was fixable by making the script more robust (this is the os.path stuff in the script). If you are struggling at this step, I advise triggering the script from the root directory using the full path to the script and troubleshooting from there.
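
The reason relative paths break is that cron typically does not start the job in the script's own folder, so a bare 'img/cover.jpg' is looked up somewhere else. The idea behind the fix can be reduced to the small helper sketched below; path_beside is a hypothetical name and the paths in the usage note are made up.

```python
import os


def path_beside(script_file, *parts):
    """Build an absolute path relative to the script, not the working directory."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, *parts)
```

Inside the bot script this would be called as path_beside(__file__, 'img', 'cover.jpg'). For a script stored at /home/pi/bot/post.py, that gives /home/pi/bot/img/cover.jpg regardless of the directory it is launched from.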

Room for improvement

I am not using proper image descriptions for the album covers (the alt text is just a generic apology), which I am not happy about.

If the power is cut, the bot comes back to life on a restart, but it is possible that the Pi can crash or lose internet access. I don’t have a good solution here. My current setup is to a) follow the account on my main Mastodon account (it is possible to set a notification when an account posts something too) and b) follow it as an RSS feed in Feedly. I figure that I will notice via one of these methods if it stops posting.
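
A more automated check is imaginable, although I have not set one up. The sketch below covers only the decision itself: given the time of the bot's most recent post, has it been quiet for too long? Fetching that timestamp (for example from the created_at field of the account's latest status) and sending an alert are left out, and the 12-hour threshold is an arbitrary assumption based on the four-posts-a-day schedule.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_post_time, now=None, max_gap=timedelta(hours=12)):
    """Return True if the most recent post is older than max_gap."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_post_time > max_gap
```

Something like this would need to run on a machine other than the Pi, since the point is to notice when the Pi itself has gone quiet.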

I don’t currently have anything set up to deal with people replying to or messaging the bot. There are a few automated methods out there but I haven’t yet explored any.

The post title comes from the album “Probot” by Probot. It’s a Dave-Grohl-plus-guests heavy metal side project which features some great tunes.
