Capybara v1.8.0 is now available on CRAN

[This article was first published on pacha.dev/blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Because of delays with my scholarship payment, if this post is useful to you, I kindly ask for a minimal donation on Buy Me a Coffee. It will be used to continue my Open Source efforts. The full explanation is here: A Personal Message from an Open Source Contributor. If you play the electric guitar, the same scholarship chaos led me to turn my guitar pedals and DIY kits hobby into a business, and you can check those here.

Capybara started as an Alpaca clone that uses cpp11armadillo to provide fast, small-footprint software for fitting GLMs with k-way fixed effects.

The software can estimate GLMs from the Exponential Family and also Negative Binomial models, using a demeaning/centering approach that offers a large speedup for models with a large number of fixed effects.
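To make the demeaning/centering idea concrete, here is a minimal base R sketch, not Capybara's actual C++ implementation. It assumes a Poisson model with two fixed effects on simulated data. All names (`center_weighted`, `f1`, `f2`, and so on) are hypothetical. Inside each iteratively reweighted least squares step, the fixed effects are removed by repeatedly subtracting weighted group means instead of building a dummy-variable matrix. The slope is then compared against `glm()` with explicit dummies.

```r
# Sketch only: Poisson regression with two fixed effects, where the fixed
# effects are swept out by weighted centering inside each IRLS iteration.
# The data and all object names are hypothetical.
set.seed(123)
n  <- 2000
f1 <- factor(sample(1:20, n, replace = TRUE))
f2 <- factor(sample(1:15, n, replace = TRUE))
x  <- rnorm(n)
a1 <- rnorm(20, sd = 0.3)
a2 <- rnorm(15, sd = 0.3)
y  <- rpois(n, exp(0.5 * x + a1[f1] + a2[f2]))

# Weighted centering by alternating projections: subtract the weighted
# group mean of each fixed effect in turn until the vector stops changing.
center_weighted <- function(v, w, groups, tol = 1e-10, max_iter = 1000L) {
  for (iter in seq_len(max_iter)) {
    v_old <- v
    for (g in groups) {
      v <- v - ave(w * v, g, FUN = sum) / ave(w, g, FUN = sum)
    }
    if (max(abs(v - v_old)) < tol) break
  }
  v
}

groups <- list(f1, f2)
mu   <- y + 0.1   # same kind of starting value that glm() uses for Poisson
eta  <- log(mu)
beta <- 0

for (iter in 1:100) {
  w <- mu                          # IRLS weights for the log link
  z <- eta + (y - mu) / mu         # working response
  z_c <- center_weighted(z, w, groups)
  x_c <- center_weighted(x, w, groups)
  beta_new <- sum(w * x_c * z_c) / sum(w * x_c^2)
  # Residuals of the centered regression equal those of the full
  # regression with dummies, so the linear predictor is z minus them.
  eta <- z - (z_c - x_c * beta_new)
  mu  <- exp(eta)
  converged <- abs(beta_new - beta) < 1e-8
  beta <- beta_new
  if (converged) break
}

# Reference fit with one dummy per fixed-effect level
beta_dummies <- unname(coef(glm(y ~ x + f1 + f2, family = poisson()))["x"])
print(c(centering = beta, dummies = beta_dummies))
```

The two estimates should agree up to numerical tolerance. The centering version never allocates the dummy matrix, which is where the time and memory savings come from when the fixed effects have many levels.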

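Here is a small benchmark for the following specification using a model from An Advanced Guide to Trade Policy Analysis:

$$
X_{ij,t} = \exp\left[\beta_0 RTA_{ij,t} + \sum_{k \in \{4, 8, 12\}} \beta_k RTA_{ij,t-k} + \sum_{T} \delta_T INTL\_BRDR_{ij,T} + \pi_{i,t} + \chi_{j,t} + \mu_{ij}\right] \times \varepsilon_{ij,t}
$$

where:

  • $X_{ij,t}$: exports from country $i$ to country $j$ at year $t$
  • $RTA_{ij,t}$: Regional Trade Agreement between countries $i$ and $j$ at time $t$
  • $RTA_{ij,t-k}$: RTA between countries $i$ and $j$ at time $t-k$, for $k = 4, 8, 12$
  • $INTL\_BRDR_{ij,T}$: dummy variables taking the value of one for international trade for each year $T$ (1986, 1990, 1994, 1998, 2002), and zero otherwise.
  • $\pi_{i,t}$, $\chi_{j,t}$, $\mu_{ij}$: exporter-year, importer-year, and exporter-importer fixed effects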

To obtain the model coefficients I used the following formula with fixed effects:

form <- trade ~ rta + rta_lag4 + rta_lag8 + rta_lag12 +
  intl_border_1986 + intl_border_1990 + intl_border_1994 +
  intl_border_1998 + intl_border_2002 |
  exp_year + imp_year + pair_id_2
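As a quick illustration of how this formula is structured, the base R sketch below splits it into response, regressors, and fixed effects. It relies only on how R parses the `|` operator (it binds more loosely than `+` and more tightly than `~`). It is not meant to describe how Capybara processes the formula internally, and the object names `rhs`, `response`, `regressors`, and `fixed_effects` are hypothetical.

```r
# Sketch only: inspect the two parts of a formula separated by `|`.
form <- trade ~ rta + rta_lag4 + rta_lag8 + rta_lag12 +
  intl_border_1986 + intl_border_1990 + intl_border_1994 +
  intl_border_1998 + intl_border_2002 |
  exp_year + imp_year + pair_id_2

rhs <- form[[3]]                       # the call `|`(regressors, fixed effects)
response      <- all.vars(form[[2]])   # left-hand side of `~`
regressors    <- all.vars(rhs[[2]])    # terms before `|`
fixed_effects <- all.vars(rhs[[3]])    # terms after `|`

print(response)
print(regressors)
print(fixed_effects)
```

Everything to the left of `|` enters the model as an ordinary regressor with its own coefficient, while the three variables to the right identify the exporter-year, importer-year, and exporter-importer groups.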

I used the same formula with Alpaca, Fixest and Capybara on the dataset from AGTPA, which gave the following time and memory results:

Package     Median (s)   Mem Alloc (MB)
Alpaca      7.17         573.0
Fixest      0.176        78.3
Capybara    0.612        24.4

Capybara would not exist without Alpaca and it is currently slower than Fixest. While Capybara can be improved, I am happy with its current memory efficiency.

You can install the current Capybara stable version with:

install.packages("capybara")

The official documentation is here.

I hope this is useful 🙂

To leave a comment for the author, please follow the link and comment on their blog: pacha.dev/blog.
