Three (four?) R functions I enjoyed this week

[This article was first published on Maëlle's R blog on Maëlle Salmon's personal website, and kindly contributed to R-bloggers.]

There are already three functions of note on a piece of paper on my desk, so it’s time to blog about them!

How does this package depend on this other package? pak::pkg_deps_explain()

The pak package by Gábor Csárdi makes installing packages easier. If I need to start working on a package, I clone it, then run pak::pak() to install and update its dependencies. It’s a “convenience function” that is convenient for sure! Bye bye remotes::install_deps().

Anyway, pak is truly a treasure trove, even for challenges that happen less often. Earlier this week I was seeing an error in a GitHub Actions log due to a package that I didn’t know was a dependency of the package I was working on. It was clearly not a direct dependency, since it was not listed in DESCRIPTION. It was a dependency of a dependency and I couldn’t guess which one it was… Luckily one doesn’t need to guess! Although I didn’t think of it right away, pak::pkg_deps_explain() is my friend in such cases [1]!

For instance, if you are wondering why usethis depends on httr2,

pak::pkg_deps_explain("usethis", "httr2")
#> ℹ Loading metadata database
#> ✔ Loading metadata database ... done
#> 
#> usethis -> gh -> httr2

usethis depends on gh (a client for the GitHub API), which in turn depends on httr2.
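To make the idea of "a dependency of a dependency" concrete, here is a toy sketch of a breadth-first search over a dependency graph. This is not how pak works internally, as far as this post is concerned: the `deps` list is a hand-written, hypothetical graph and `find_dep_path()` is a made-up helper, used only to illustrate the kind of chain the function reports.

```r
# Hypothetical toy graph: each name is a package, each value its direct dependencies.
deps <- list(
  usethis = c("cli", "gh"),
  gh      = c("cli", "httr2"),
  cli     = character(),
  httr2   = c("cli")
)

# Breadth-first search for the shortest chain from one package to another.
# Returns a character vector (the chain) or NULL if there is none.
find_dep_path <- function(deps, from, to) {
  queue <- list(from)   # each queue element is a path explored so far
  visited <- from
  while (length(queue) > 0) {
    path <- queue[[1]]
    queue <- queue[-1]
    last <- path[length(path)]
    if (identical(last, to)) {
      return(path)
    }
    for (pkg in setdiff(deps[[last]], visited)) {
      visited <- c(visited, pkg)
      queue <- c(queue, list(c(path, pkg)))
    }
  }
  NULL
}

path <- find_dep_path(deps, "usethis", "httr2")
paste(path, collapse = " -> ")
```

On real packages, pak::pkg_deps_explain() does the lookup against actual package metadata, so there is no need to maintain such a graph by hand.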

Next time I encounter a similar problem I can only hope I’ll remember about this function sooner!

Where in the file are there non-ASCII characters? tools::showNonASCIIfile()

If you get the dreaded R CMD check WARNING “Found the following file with non-ASCII characters” and can’t see at once which character it is, you don’t need to comb through each line of code. You can simply run tools::showNonASCIIfile(<filename>). After that, an easy fix can be to replace the offending characters with their \uxxxx escapes, as indicated in the WARNING. I’m not saying it’s always that easy, but that’s what happened to me this week! I found the correct escape by using a search engine.
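A minimal sketch of that workflow, using a throwaway file rather than a real package file. The file content is invented for illustration, and the iconv() and utf8ToInt() steps are one possible base R alternative to a search engine, not something the WARNING itself suggests.

```r
# Write a small file whose second line contains a non-ASCII character (e with acute accent).
path <- tempfile(fileext = ".R")
lines <- c("x <- 1", "label <- \"caf\u00e9\"")
writeLines(enc2utf8(lines), path, useBytes = TRUE)

# Prints the lines containing non-ASCII characters, prefixed by their line numbers.
tools::showNonASCIIfile(path)

# To get the line numbers programmatically: iconv() returns NA
# for strings that cannot be converted to ASCII.
file_lines <- readLines(path, encoding = "UTF-8")
non_ascii_lines <- which(is.na(iconv(file_lines, from = "UTF-8", to = "ASCII")))

# Build the \uxxxx escape for a character from its Unicode code point.
escape <- sprintf("\\u%04x", utf8ToInt("\u00e9"))
escape
```

The value of `escape` is the text to paste into the source file in place of the literal character.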

How do these two text files differ? tools::Rdiff() or gert::git_diff_patch()

I’m working on the babeldown R package that helps translate mostly Markdown files. Support for first-time automatic translation is there, but a next step is to add support for automatic update of a translation based on a git commit. Say I have a file in English and its translation in Spanish and a contributor edits part of the Spanish file. We want to have a function update the English file based on this diff. As part of that work I need to programmatically parse the difference between two text files.

I saw gert::git_diff() but I didn’t find the patch to be easily parsable.

I had better luck with tools::Rdiff() especially after a GitHub search showed me an example using it with Log = TRUE. Before that I was unsuccessfully trying capture.output() to capture what was printed… I should have read the arguments list more carefully.

original_lines <- c(
  "# Title", "",
  "## Subtitle", "",
  "Some info", "",
  "First line of a paragraph",
  "Second line of a paragraph"
)

amended_lines <- c(
  "# BIG Title", "",
  "## Subtitle", "",
  "Some info", "",
  "First line of a paragraph",
  "Second line of a paragraph",
  "More info"
)

file1 <- withr::local_tempfile()
file2 <- withr::local_tempfile()

brio::write_lines(original_lines, file1)
brio::write_lines(amended_lines, file2)

diff <- tools::Rdiff(file1, file2, Log = TRUE)
#> 
#> files differ in number of lines:
diff
#> $status
#> [1] 1
#> 
#> $out
#> [1] "files differ in number of lines" "1c1"                            
#> [3] "< # Title"                       "---"                            
#> [5] "> # BIG Title"                   "8a9"                            
#> [7] "> More info"

What I especially liked in this output are the lines of the form <line-numbers-in-original-file><letter-indicating-status><line-numbers-in-new-file>, where the status can be a (added), d (deleted) or c (changed).
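As a rough sketch of why these lines are convenient, they can be picked out and split with a regular expression. The `diff_out` vector below is copied from the `$out` element shown above. The regular expression and the column names are assumptions made for this illustration, not something tools::Rdiff() provides.

```r
# The $out element from the tools::Rdiff() call above, typed in by hand.
diff_out <- c(
  "files differ in number of lines",
  "1c1", "< # Title", "---", "> # BIG Title",
  "8a9", "> More info"
)

# Change lines look like "1c1", "8a9" or "3,5d2": line number(s), a status letter, line number(s).
pattern <- "^([0-9]+(,[0-9]+)?)([acd])([0-9]+(,[0-9]+)?)$"
headers <- grep(pattern, diff_out, value = TRUE)

# regmatches() returns, for each header, the full match followed by each capture group.
parts <- regmatches(headers, regexec(pattern, headers))

changes <- data.frame(
  original = vapply(parts, `[`, character(1), 2),  # line number(s) in the original file
  status   = vapply(parts, `[`, character(1), 4),  # a, d or c
  new      = vapply(parts, `[`, character(1), 5),  # line number(s) in the new file
  stringsAsFactors = FALSE
)
changes
```

Content lines always start with "< ", "> " or "---", so anchoring the pattern at the start of the line keeps them from being mistaken for change lines.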

In comparison, here’s what gert gives me.

dir <- withr::local_tempdir()
gert::git_init(dir)

brio::write_lines(original_lines, file.path(dir, "file.txt"))
gert::git_add("file.txt", repo = dir)
#>       file status staged
#> 1 file.txt    new   TRUE
gert::git_commit("first commit", repo = dir)
#> [1] "38d0abf2aee7169ddabab0e83676fdee190a136a"

brio::write_lines(amended_lines, file.path(dir, "file.txt"))
gert::git_add("file.txt", repo = dir)
#>       file   status staged
#> 1 file.txt modified   TRUE
commit_of_interest <- gert::git_commit("second commit", repo = dir)

cat(gert::git_diff_patch(commit_of_interest, repo = dir))
#> diff --git a/file.txt b/file.txt
#> index 6d79a01..44a4269 100644
#> --- a/file.txt
#> +++ b/file.txt
#> @@ -1,4 +1,4 @@
#> -# Title
#> +# BIG Title
#>  
#>  ## Subtitle
#>  
#> @@ -6,3 +6,4 @@ Some info
#>  
#>  First line of a paragraph
#>  Second line of a paragraph
#> +More info

I’m not exactly sure yet whether tools::Rdiff() will be enough for my use case but it might!
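For comparison, here is a rough sketch of what extracting line ranges from the gert patch might involve. The `patch_lines` vector is an abridged, hand-typed copy of the patch printed above; with real output one would presumably first split the text on newlines, for instance with strsplit(). The regular expression follows the usual unified diff hunk header, and the column names are invented for this example.

```r
# Abridged copy of the patch above, one element per line.
patch_lines <- c(
  "diff --git a/file.txt b/file.txt",
  "--- a/file.txt",
  "+++ b/file.txt",
  "@@ -1,4 +1,4 @@",
  "-# Title",
  "+# BIG Title",
  "@@ -6,3 +6,4 @@ Some info",
  "+More info"
)

# Hunk headers look like "@@ -<start>,<count> +<start>,<count> @@", with the counts optional.
pattern <- "^@@ -([0-9]+)(,([0-9]+))? \\+([0-9]+)(,([0-9]+))? @@"
headers <- grep(pattern, patch_lines, value = TRUE)
m <- regmatches(headers, regexec(pattern, headers))

# Extract one capture group as integers (element 1 of each match is the full match).
field <- function(i) as.integer(vapply(m, `[`, character(1), i))

hunks <- data.frame(
  old_start = field(2),
  old_count = field(4),  # NA when the count is omitted from the header
  new_start = field(5),
  new_count = field(7)   # NA when the count is omitted from the header
)
hunks
```

Note that hunks include unchanged context lines, so the ranges are wider than the actual changes, unlike the more compact change lines in the tools::Rdiff() output.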

Conclusion

My sticky note mentioned pak::pkg_deps_explain(), tools::showNonASCIIfile() (what case is this 😅) and tools::Rdiff(), which all helped me make progress on R code earlier this week. Time to break out a fresh sticky note and see what ends up on it!


  1. I first wrote it “pak_deps_explain” instead of “pkg_deps_explain”. 😬
