Better Git diff with difftastic

[This article was first published on Maëlle's R blog on Maëlle Salmon's personal website, and kindly contributed to R-bloggers].

I’m currently on a quest to better know and understand treesitter-based tooling for R. To make it short, treesitter is a tool for parsing code, for instance recognizing what is a function, an argument, a logical in a string of code. With tools built upon treesitter you can search, reformat, lint and fix, etc. your code. Exciting stuff, running locally and deterministically on your machine.

Speaking of “etc.”, Etienne Bacher helpfully suggested I also look at treesitter-based tooling for other languages to see what’s still missing in our ecosystem. This is how I stumbled upon difftastic by Wilfred Hughes, “a structural diff tool that understands syntax”. ✨ This means that difftastic doesn’t only compare lines or “words” but actual syntax, taking into account the lines around the lines that changed (3 by default). Even better, it understands R out of the box¹.

Many thanks to Etienne Bacher not only for making me discover difftastic but also for useful feedback on this post!

Installing difftastic

To install difftastic I downloaded a binary file for my system from the releases of the GitHub repository, as documented in the manual.

difftastic on two files

You can run difftastic on two files, a bit like you would use the waldo R package on two objects.

Let’s compare:

a <- gsub("bad", "good", x)

to

a <- stringr::str_replace(x, "bad", "good")

respectively saved in old.R and new.R. The CLI is called difft, not difftastic. I use the “inline” display rather than the two-column default in order to save horizontal space.

difft old.R new.R --display inline

We get this nice-looking diff:

The parentheses and the "bad" and "good" arguments are not highlighted, since they did not change.

We can also get the JSON version of this diff. This is an unstable feature whose use requires setting an environment variable:

export DFT_UNSTABLE=yes
difft old.R new.R --display json

This gets us

{"aligned_lines":[[0,0],[1,1]],"chunks":[[{"lhs":{"line_number":0,"changes":[{"start":5,"end":9,"content":"gsub","highlight":"normal"},{"start":23,"end":24,"content":",","highlight":"normal"},{"start":25,"end":26,"content":"x","highlight":"normal"}]},"rhs":{"line_number":0,"changes":[{"start":5,"end":12,"content":"stringr","highlight":"normal"},{"start":12,"end":14,"content":"::","highlight":"keyword"},{"start":14,"end":25,"content":"str_replace","highlight":"normal"},{"start":26,"end":27,"content":"x","highlight":"normal"},{"start":27,"end":28,"content":",","highlight":"normal"}]}}]],"language":"R","path":"content/post/2026-03-26-difftastic/new.R","status":"changed"}
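To make the JSON above easier to read, here is a minimal sketch that lists the tokens difftastic flagged on each side. It assumes jq is installed (jq is not mentioned in the original post), and it relies on the field names visible in the output above; since the JSON display is unstable, those names may change. The diff_json file name is made up for the illustration.

```shell
# Copy of the JSON shown above, saved to a temporary file
# (whitespace added for readability; the content is unchanged).
diff_json="$(mktemp)"
cat > "$diff_json" <<'EOF'
{"aligned_lines":[[0,0],[1,1]],
 "chunks":[[{
   "lhs":{"line_number":0,"changes":[
     {"start":5,"end":9,"content":"gsub","highlight":"normal"},
     {"start":23,"end":24,"content":",","highlight":"normal"},
     {"start":25,"end":26,"content":"x","highlight":"normal"}]},
   "rhs":{"line_number":0,"changes":[
     {"start":5,"end":12,"content":"stringr","highlight":"normal"},
     {"start":12,"end":14,"content":"::","highlight":"keyword"},
     {"start":14,"end":25,"content":"str_replace","highlight":"normal"},
     {"start":26,"end":27,"content":"x","highlight":"normal"},
     {"start":27,"end":28,"content":",","highlight":"normal"}]}}]],
 "language":"R","path":"content/post/2026-03-26-difftastic/new.R","status":"changed"}
EOF

if command -v jq >/dev/null 2>&1; then
  # One changed token per line: lhs is the old file, rhs the new one.
  old_tokens="$(jq -r '.chunks[][] | .lhs.changes[].content' "$diff_json")"
  new_tokens="$(jq -r '.chunks[][] | .rhs.changes[].content' "$diff_json")"
  printf 'old side:\n%s\n\nnew side:\n%s\n' "$old_tokens" "$new_tokens"
else
  echo "jq not found; skipping the extraction" >&2
fi
```

The string arguments "bad" and "good" appear in neither list, which matches what the inline display shows: only the function name, the position of x and the neighbouring comma are reported as changed.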

Now, none of this is very useful because I would never compare files in this way… I use version control!

difftastic with Git

We can set difftastic as the external diff tool for Git globally or for the current project.

For instance with the gert R package, to set it locally:

gert::git_config_set("diff.external", "difft")

If I want to use the inline display I’d set:

gert::git_config_set("diff.external", "difft --display inline")

Then git diff will by default use difftastic. Most interestingly for me, git show --ext-diff will use difftastic. I never use git diff directly but I do look at more or less recent commits a lot.
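For readers who prefer the command line to gert, the same setting can be stored with plain git config. The sketch below is an illustration rather than the setup used in this post: it works in a throwaway repository (the demo_repo variable is hypothetical) so that no real configuration is touched, and it only stores the value, so difft does not need to be installed to run it.

```shell
# Throwaway repository so the demo does not alter any real project.
demo_repo="$(mktemp -d)"
git init --quiet "$demo_repo"

# Local (per-repository) setting, the counterpart of gert::git_config_set().
git -C "$demo_repo" config --local diff.external "difft --display inline"

# Read the value back to confirm what Git stored.
stored="$(git -C "$demo_repo" config --local --get diff.external)"
echo "diff.external = $stored"

# To apply it to every repository instead, one would use --global:
#   git config --global diff.external "difft --display inline"
```

With the local setting, only that repository is affected; other projects keep Git's built-in diff.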

Say I am interested in the commit that removed roxygen2’s dependency on stringi. I’ll run:

git show 7a1dd39866699a2b0a034bb15244c07698a1e2e7 --ext-diff

and get:

This isn’t spectacular because this is a small diff, but I enjoy the highlighting of the parentheses of the removed nested call, and of the logical.

Cool features of difftastic

Here I build on two examples from the difftastic homepage.

Ignoring formatting changes

Since formatters can so helpfully apply your formatting preferences, reviewing formatting changes in a patch that’s about something else entirely is useless and annoying. Imagine having a function definition that fits on a single line, then adding one argument to it.

Going from

f <- function(myarg1 = foo, myarg2 = bar) {}

to

f <- function(
  myarg1 = foo,
  myarg2 = bar,
  myarg3 = baz
) {}

Suppose the definition is now longer than your formatter's line width (say, 80 characters, as could happen with real argument names). Your formatter might then switch the definition to be on multiple lines. But the actually interesting change is the addition of one argument.

Native Git diff² would show:

Git with difftastic would show:

The matching of delimiters is why I found difftastic’s display of the roxygen2 commit more pleasing.
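To reproduce this comparison yourself, something like the following sketch could be used. The file names old-args.R and new-args.R follow the footnote; the scratch directory and the variable names are assumptions made for the illustration. The difft call is guarded, so the script also runs where difftastic is not installed.

```shell
# Work in a scratch directory so no existing files are touched.
workdir="$(mktemp -d)"
cd "$workdir"

# The two versions of the function definition from the example above.
printf '%s\n' 'f <- function(myarg1 = foo, myarg2 = bar) {}' > old-args.R
printf '%s\n' \
  'f <- function(' \
  '  myarg1 = foo,' \
  '  myarg2 = bar,' \
  '  myarg3 = baz' \
  ') {}' > new-args.R

# Git's built-in line diff; --no-index compares files outside a repository.
# git diff exits with status 1 when the files differ, hence "|| true".
line_diff="$(git diff --no-index --no-ext-diff old-args.R new-args.R || true)"
printf '%s\n' "$line_diff"

# Structural diff, only if difft happens to be installed.
if command -v difft >/dev/null 2>&1; then
  difft old-args.R new-args.R --display inline
fi
```

In the line-based output, every line of the new definition is marked as added, including myarg1 and myarg2, even though only myarg3 is new.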

Matching delimiters in wrappers

The Git diff can look a bit ugly when you simply move code from one function to the other.

Say we go from

f <- function() {
  1 + 1
}

to

f <- function() {
  g()
}

g <- function() {
  1 + 1
}

Git diff would show:

Whereas Git with difftastic would show:

Will I use difftastic?

I really like the concept behind difftastic and the few Git commits I looked at with it rendered nicely. Now, what’s missing for me to use difftastic a lot is its integration with the tools where I actually use Git:

In any case, I’ll continue learning about tools based on treesitter, some of which like Air and Jarl I can already use directly from my IDE. 😸

Footnotes

  1. It’s not every day we R developers look at the homepage of a tool and see the R logo among the logos of other languages!

  2. To get the diff that Git would show me I ran git diff --no-index old-args.R new-args.R --no-ext-diff, a cool trick I didn’t know about! Very glad I didn’t have to create a fake Git repo just for this. (I added --no-ext-diff because my diff in this repo would use difftastic by default!)
