Using R to parse time (and taxon names) with GBIF’s API

[This article was first published on John Baumgartner's Research » R, and kindly contributed to R-bloggers.]

GBIF has recently made a bunch of handy tools available via their revamped API. These tools include a species name parser, which seems very useful for cleaning long lists of taxon names.

Here’s a simple R function that takes a vector of taxon names and parses them using GBIF’s API, extracting, among other details, the genus, species, infraspecific rank and epithet, nothorank (i.e., the taxonomic rank at which hybridisation occurs), and authorship.

I’ve created a gist of this function, so you can grab it from GitHub with source_url('https://gist.github.com/johnbaums/6971353/raw/gbif_parse.R') (requires the devtools package), or you can just copy and paste it from the gist page.
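For readers who can't reach the gist, here is a minimal sketch of the general approach. It is not the gist's code. It assumes the httr and jsonlite packages are installed and that the parser is reachable at https://api.gbif.org/v1/parser/name and accepts a JSON array of names via POST. The function name parse_taxon_names is hypothetical, and the columns returned depend on what the API sends back.

```r
# Hypothetical sketch (not the author's gist): send a character vector of
# taxon names to GBIF's name parser and return one row per name.
# Assumes the httr and jsonlite packages and the endpoint URL below.
parse_taxon_names <- function(x,
                              url = "https://api.gbif.org/v1/parser/name") {
  # Validate input before touching the network
  stopifnot(is.character(x), length(x) > 0)

  # The names are sent as a JSON array, e.g. ["Abies alba Mill.", "..."]
  resp <- httr::POST(
    url,
    body = as.character(jsonlite::toJSON(x)),
    httr::content_type_json()
  )

  # Raise an R error on any HTTP error status
  httr::stop_for_status(resp)

  # fromJSON() simplifies a JSON array of objects into a data frame,
  # filling fields that are absent for a given name with NA
  jsonlite::fromJSON(httr::content(resp, as = "text", encoding = "UTF-8"))
}
```

Something like str(parse_taxon_names(c("Abies alba Mill.", "Quercus robur L."))) should then show which fields the parser returned for each name.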

It’s a bit awkward to include wide tabular output here, but I’ve provided a few examples of the function’s use on GitHub. I haven’t tested the API thoroughly (and the stable version hasn’t been released yet; it’s expected at the end of 2013), so I’m interested to hear if it “parses” your tests.


EDIT: I’ve added a modified version of this function to the development version of rOpenSci’s rgbif package (thanks, Scott!).


Filed under: R Tagged: API, function, GBIF, gist, nomenclature, R, taxonomy, tools
