Goodbye, Disqus! Hello, Utterances!

[This article was first published on Posts on Maëlle's R blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Removing Disqus from my blogdown blog had been on my mind for a while, ever since I saw Bob Rudis’ tweet enjoining Noam Ross to not use it for his brand-new website. The same Twitter thread introduced me to Utterances, a “lightweight comments widget built on GitHub issues”, which I have at last installed to my blog in lieu of Disqus. How did I manage to not lose anything of value? How easy was it to switch tools? Read on to learn more!

Was saying goodbye to Disqus hard?

Removing Disqus was neither emotionally nor technically hard.

Comments I (kinda) let go of

To deal with my fear of loss, I exported all the comments to an XML. It means I have a backup, that in theory I could explore within R! I’ll show some XML wrangling a bit later. The export does contain Disqus usernames and names which means I’m now responsible for person identifying information that Disqus collected.

Thanks a lot to commenters in the past system for taking the time to leave me a note: it helped me keep blogging!

The Disqus XML contains information about

  • posts that are in fact comments, not blog posts, along with their threading information i.e. both whose blog post the comment was on, and if relevant, which comment the comment is an answer to;

  • threads, one thread per comment but not necessarily one comment per thread, and potentially more than one thread per blog post. Each top-level comment is a thread, but a website page with no comment also gets a thread, obviously empty.

In the code below, I shall rectangle the information about threads (id, title) and about comments (message, author name, date, thread id, was it spam, was it deleted) using the xml2 package.

export <- xml2::read_xml("data/disqus-export.xml")
export

## {xml_document}
## <disqus schemaLocation="http://disqus.com/api/schemas/1.0/disqus.xsd http://disqus.com/api/schemas/1.0/disqus-internals.xsd" xmlns="http://disqus.com" xmlns:dsq="http://disqus.com/disqus-internals" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
##  [1] <category dsq:id="5516420">\n  <forum>masalmon</forum>\n  <title>Ge ...
##  [2] <thread dsq:id="5169974011">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [3] <thread dsq:id="5177145143">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [4] <thread dsq:id="5192091523">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [5] <thread dsq:id="5490361085">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [6] <thread dsq:id="5490455492">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [7] <thread dsq:id="5496622766">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [8] <thread dsq:id="5496655533">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [9] <thread dsq:id="5499530252">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [10] <thread dsq:id="5499566886">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [11] <thread dsq:id="5503605733">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [12] <thread dsq:id="5503629818">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [13] <thread dsq:id="5519522381">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [14] <thread dsq:id="5521172753">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [15] <thread dsq:id="5523856167">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [16] <thread dsq:id="5523862926">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [17] <thread dsq:id="5544585525">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [18] <thread dsq:id="5544919490">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [19] <thread dsq:id="5544919711">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [20] <thread dsq:id="5544920033">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## ...

Explaining how to parse XML is beyond the scope of this blog post, but let me just mention I used a search engine to answer questions such as “XPath extract nodes by name”, “xml2 namespace”. I reckon my code could be more elegant if I knew more XPath.

Let me start with threads.

# extract nodes corresponding to comments
threads_nodes <- xml2::xml_find_all(export, "d1:thread")
threads_nodes

## {xml_nodeset (283)}
##  [1] <thread dsq:id="5169974011">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [2] <thread dsq:id="5177145143">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [3] <thread dsq:id="5192091523">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [4] <thread dsq:id="5490361085">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [5] <thread dsq:id="5490455492">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [6] <thread dsq:id="5496622766">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [7] <thread dsq:id="5496655533">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [8] <thread dsq:id="5499530252">\n  <id/>\n  <forum>masalmon</forum>\n  ...
##  [9] <thread dsq:id="5499566886">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [10] <thread dsq:id="5503605733">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [11] <thread dsq:id="5503629818">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [12] <thread dsq:id="5519522381">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [13] <thread dsq:id="5521172753">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [14] <thread dsq:id="5523856167">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [15] <thread dsq:id="5523862926">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [16] <thread dsq:id="5544585525">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [17] <thread dsq:id="5544919490">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [18] <thread dsq:id="5544919711">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [19] <thread dsq:id="5544920033">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## [20] <thread dsq:id="5544931944">\n  <id/>\n  <forum>masalmon</forum>\n  ...
## ...

# extract interesting information from each node,
# information that is potentially nested
threads <- tibble::tibble(
  thread_id = xml2::xml_attr( threads_nodes, 
                              "id"),
  title = xml2::xml_text(
    xml2::xml_find_all(threads_nodes, "d1:title")
    ))

Now on to comments…

# extract nodes corresponding to comments
comments_nodes <- xml2::xml_find_all(export, "d1:post") 
comments_nodes

## {xml_nodeset (206)}
##  [1] <post dsq:id="2916181700">\n  <id/>\n  <message><![CDATA[<p>La poss ...
##  [2] <post dsq:id="3118121105">\n  <id/>\n  <message><![CDATA[<p>Great b ...
##  [3] <post dsq:id="3118236082">\n  <id/>\n  <message><![CDATA[<p>Nice Ma ...
##  [4] <post dsq:id="3118757895">\n  <id/>\n  <message><![CDATA[<p>Thanks  ...
##  [5] <post dsq:id="3118758593">\n  <id/>\n  <message><![CDATA[<p>Thanks  ...
##  [6] <post dsq:id="3118783826">\n  <id/>\n  <message><![CDATA[<p>I love  ...
##  [7] <post dsq:id="3118785805">\n  <id/>\n  <message><![CDATA[<p>Thanks! ...
##  [8] <post dsq:id="3121124991">\n  <id/>\n  <message><![CDATA[<p>Chouett ...
##  [9] <post dsq:id="3121137475">\n  <id/>\n  <message><![CDATA[<p>Oops qu ...
## [10] <post dsq:id="3121945521">\n  <id/>\n  <message><![CDATA[<p>I poste ...
## [11] <post dsq:id="3122653372">\n  <id/>\n  <message><![CDATA[<p>Another ...
## [12] <post dsq:id="3122759373">\n  <id/>\n  <message><![CDATA[<p>Thank y ...
## [13] <post dsq:id="3123851616">\n  <id/>\n  <message><![CDATA[<p>Great s ...
## [14] <post dsq:id="3126340152">\n  <id/>\n  <message><![CDATA[<p>Once ag ...
## [15] <post dsq:id="3126343942">\n  <id/>\n  <message><![CDATA[<p>Thank y ...
## [16] <post dsq:id="3140125477">\n  <id/>\n  <message><![CDATA[<p>The ide ...
## [17] <post dsq:id="3140136118">\n  <id/>\n  <message><![CDATA[<p>Thank y ...
## [18] <post dsq:id="3140161542">\n  <id/>\n  <message><![CDATA[<p>Not rel ...
## [19] <post dsq:id="3140329480">\n  <id/>\n  <message><![CDATA[<p>Yes def ...
## [20] <post dsq:id="3150961786">\n  <id/>\n  <message><![CDATA[<p>That's  ...
## ...

# extract interesting information from each node,
# information that is potentially nested
comments <- tibble::tibble(
  thread_id = xml2::xml_attr(
    xml2::xml_find_all(comments_nodes, "d1:thread"), 
    "id"),
  message = xml2::xml_text(
    xml2::xml_find_all(comments_nodes, "d1:message")
    ),
  date =  xml2::xml_text(
    xml2::xml_find_all(comments_nodes, "d1:createdAt")
    ),
  deleted = xml2::xml_text(
    xml2::xml_find_all(comments_nodes, "d1:isDeleted")
    ),
  spam = xml2::xml_text(
    xml2::xml_find_all(comments_nodes, "d1:isSpam")
    ),
  author = xml2::xml_text(
    xml2::xml_find_all(
    xml2::xml_find_all(comments_nodes, "d1:author"), 
    "d1:name")))
  
comments <- dplyr::mutate(comments,
                          date = anytime::anydate(date),
                          deleted = deleted == "true",
                          spam = spam == "true")

I was then able to join the two tables.

comments <- dplyr::left_join(comments, threads,
                             by = "thread_id")
readr::write_csv(comments, "comments.csv")

comments[2:12, c("message", "title")]

## # A tibble: 11 x 2
##    message                                        title                    
##    <chr>                                          <chr>                    
##  1 "<p>Great blog post, thank you! I really appr… French villages and a so…
##  2 <p>Nice Maëlle!</p><p>Good, interesting quest… French villages and a so…
##  3 <p>Thanks Nick!</p>                            French villages and a so…
##  4 <p>Thanks Lisa! Toponomy is quite fascinating… French villages and a so…
##  5 "<p>I love this! I wonder, would it be possib… French villages and a so…
##  6 <p>Thanks! I got a similar question on Twitte… French villages and a so…
##  7 "<p>Chouette! Mais la mer c'est \"sea\", \"se… French villages and a so…
##  8 <p>Oops quelle faute de frappe idiote, je la … French villages and a so…
##  9 "<p>I posted an update! <a href=\"http://www.… French villages and a so…
## 10 <p>Another good one, Maëlle! It's a great exa… More water, a bit more a…
## 11 <p>Thank you! Lol on ninja grep skills, I wis… More water, a bit more a…

I got a pretty nice rectangle in the end, that could be used for some text analysis, but in my case, I mostly view it as memorabilia. I hope any comment warranting action had been tackled. I wrote 93 (sum(comments$author == "Maëlle Salmon")) out of the 206 comments (Disqus classified a few comments as spam, 15 to be exact, of which a few were informative comments. ), because I tend to answer comments, if only with a simple thank you! Now that I have GitHub issues as comments, I could also answer with emojis.

It was fun because I didn’t even remember about some notes I had gotten. I hope you don’t feel slighted, dear reader, now that I keep these old comments to myself! Be happy you’re no longer tracked.

Getting rid of Disqus

I removed Disqus from my website, and my website from Disqus!

To remove Disqus from a Hugo website, one can, depending on the website’s theme:

  • use Hugo’s Disqus GDPR setting if the website theme uses Hugo’s built-in Disqus template (i.e. if there’s a mention of a internal Disqus partial, not of a Disqus partial stored somewhere in the theme files);

  • or remove one’s Disqus username from the config file, or remove the Disqus partial mention in the post template. To do that, the best process is not to edit the e.g. post/single.html file in the theme folder, but instead to save it and edit it under layouts/post/single.html. Refer to the blogdown book for more information.

After that, there should be no mention of Disqus scripts in your website source.

Then, to delete my website from my Disqus account, following Disqus guidance, I logged into my account and, after exporting my website’s comments, confirmed I wanted to do that and voilà!

Adding Utterances

To add Utterances, a “lightweight comments widget built on GitHub issues”, I also followed official guidance:

Here’s the Utterances script tag for my blog. The label needs to have been created for your website repotterances can only use existing issue labels.

<script src="https://utteranc.es/client.js"
        repo="maelle/simplymaelle"
        issue-term="title"
        label="comments :speech_balloon:"
        theme="github-light"
        crossorigin="anonymous"
        async>
</script>

I then wrote a comment under one of my posts to make sure things worked. I also took a few seconds to add my site to the list of sites using Utterances. Overall, I found the process easy, as I had the first time I installed Utterances on a blogdown website.

I am aware that a downside of Utterances is that it is tied to GitHub. However, I am not afraid of switching to yet another tool in the future, and hope not too many potential commenters will be turned off by the necessity to use a GitHub account to comment.

Conclusion

In this post I explained how I removed Disqus from my blog and vice versa, and how I installed Utterances instead. What I did not do was analyzing the Disqus data. I moreover did not try to transfer old comments to the new system, partly because that’d mean removing commenters’ ownership on what they had written.

In his tweet that motivated me to say goodbye to Disqus, Bob wrote “Yet another service where you and your site visitors are the product. Plus it acts like malicious javascript.”. I therefore feel a bit better about this website! Now, I didn’t have a ton of comments, and I don’t have a real need for moderating comments, so maybe you’ll make a different decision for your website. I’d be glad to hear about it… in the comments section below! Also feel free to tell me if you explore your own Disqus data in a more thorough way than I!

To leave a comment for the author, please follow the link and comment on their blog: Posts on Maëlle's R blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)