New features in the most recent taxize update, v0.2

[This article was first published on rOpenSci Blog - R, and kindly contributed to R-bloggers].

We just released a new version of taxize, version 0.2.0. This release contains a number of new features and bug fixes. Here is a rundown of some of the changes:

First, install and load taxize

install.packages("taxize")

library(taxize)

New things

New functions: class2tree

Sometimes you just want a visual of the taxonomic relationships among taxa. If you don't know how to build a molecular phylogeny, don't have time, or there just isn't molecular data, you can sorta build one using taxonomy. You can get a bunch of taxonomic hierarchies from our classification function, then pass them to the new function class2tree. Like so:

Define a species list

spnames <- c("Latania lontaroides", "Randia cubana", "Blumea brevipes", "Commelina erecta", 
    "Miconia pyramidalis", "Aquilegia moorcroftiana", "Acridocarpus austrocaledonicus", 
    "Vaccinium wrightii", "Riocreuxia flanaganii", "Macroditassa adnata", "Acianthera ochreata", 
    "Spathodea campanulata", "Leucadendron salicifolium", "Habenaria fluminensis", 
    "Platostoma siamense", "Bulbophyllum hoehnei", "Aspidosperma polyneuron", 
    "Rhynchospora fascicularis", "Sida lonchitis", "Ardisia cymosa", "Morinda brachycalyx", 
    "Tetrastigma hypoglaucum", "Paphiopedilum vietnamense", "Goodenia glabra")

Then collect taxonomic hierarchies for each taxon, and remove those with no results (those with no results are just NA). I'm setting verbose = FALSE to suppress messages for this example.

out <- classification(spnames, db = "ncbi", verbose = FALSE)
out <- out[!is.na(out)]
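If the filtering step looks too terse, here is a tiny base-R sketch of why it works. The list below is hand-written and only mimics the shape described above (a data.frame for a hit, a bare NA for a miss); the species labels are hypothetical placeholders, not real classification output.

```r
# Hand-written stand-in for a list of hierarchies (hypothetical labels)
res <- list(
  "Species a" = data.frame(name = "Plantae", rank = "kingdom",
                           stringsAsFactors = FALSE),
  "Species b" = NA  # no result for this name
)

# On a list, is.na() is TRUE only for elements that are a single atomic NA,
# so data.frames are kept and bare NA entries are flagged
is.na(res)

# Keep only the entries that actually hold a hierarchy
kept <- res[!is.na(res)]
names(kept)
```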

Use class2tree to automagically convert the list of hierarchies to an ape phylo object, then plot

tr <- class2tree(out)
plot(tr, no.margin = TRUE)
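If you're curious how a tree can come out of taxonomy alone, here is a rough base-R sketch of the general idea. It is an illustration under my own assumptions, not class2tree's implementation: the hierarchies are hand-written, the helper mk is hypothetical, and the distance (the share of ranks at which two taxa differ) is just one simple choice.

```r
ranks <- c("kingdom", "phylum", "class", "order", "family", "genus")

# Hypothetical helper: build a name/rank data.frame like the ones shown in this post
mk <- function(...) data.frame(name = c(...), rank = ranks, stringsAsFactors = FALSE)

# Hand-written hierarchies standing in for classification() output
hier <- list(
  "Puma concolor"      = mk("Animalia", "Chordata", "Mammalia", "Carnivora", "Felidae", "Puma"),
  "Felis catus"        = mk("Animalia", "Chordata", "Mammalia", "Carnivora", "Felidae", "Felis"),
  "Accipiter striatus" = mk("Animalia", "Chordata", "Aves", "Accipitriformes", "Accipitridae", "Accipiter")
)

# One row per taxon, one column per rank
tab <- t(sapply(hier, function(h) h$name[match(ranks, h$rank)]))
colnames(tab) <- ranks

# Assumed distance: proportion of ranks at which two taxa have different names
n <- nrow(tab)
d <- matrix(0, n, n, dimnames = list(rownames(tab), rownames(tab)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- mean(tab[i, ] != tab[j, ])
  }
}

# Hierarchical clustering turns the distances into a dendrogram
hc <- hclust(as.dist(d), method = "average")
plot(hc)
```

The two cats differ only at genus, so they join first, and the hawk attaches further up. Real hierarchies are messier (missing ranks, repeated "no rank" rows), which is part of what a dedicated function has to handle.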

New functions: get_gbifid

The Global Biodiversity Information Facility (GBIF) has their own taxonomy. They allow programmatic access to their taxonomy, see here for details. Also see our rgbif package that wraps all their API services.

We added a function for the GBIF taxonomy, similar to our get_tsn, get_uid, etc. functions for other taxonomies. Here are some example calls:

get_gbifid(sciname = "Poa annua", verbose = FALSE)

##         1 
## "2704179" 
## attr(,"class")
## [1] "gbifid"

get_gbifid(sciname = "Pinus contorta", verbose = FALSE)

##         1 
## "5285750" 
## attr(,"class")
## [1] "gbifid"

get_gbifid(sciname = "Puma concolor", verbose = FALSE)

##         1 
## "2435099" 
## attr(,"class")
## [1] "gbifid"

get_gbifid(c("Poa annua", "Pinus contorta"), verbose = FALSE)

## [1] "2704179" "5285750"
## attr(,"class")
## [1] "gbifid"

This could be useful if, for example, you want the exact IDs GBIF uses for your set of species to use at some later point. At that point you could use our rgbif package to search for biodiversity occurrence data with the IDs you collected. For example:

library(rgbif)
(id <- get_gbifid(sciname = "Puma concolor", verbose = FALSE))

##         1 
## "2435099" 
## attr(,"class")
## [1] "gbifid"

occ_search(id)

## $meta
## $meta$offset
## [1] 0
## 
## $meta$limit
## [1] 20
## 
## $meta$endOfRecords
## [1] FALSE
## 
## $meta$count
## [1] 8392
## 
## 
## $hierarchy
## $hierarchy[[1]]
##            name     key    rank
## 1      Animalia       1 kingdom
## 2      Chordata      44  phylum
## 3      Mammalia     359   clazz
## 4     Carnivora     732   order
## 5       Felidae    9703  family
## 6          Puma 2435098   genus
## 7 Puma concolor 2435099 species
## 
## 
## $data
##             name       key longitude latitude
## 1  Puma concolor 866527350   -110.58    31.85
## 2  Puma concolor 866545169   -103.60    29.16
## 3  Puma concolor 866495627   -106.39    35.13
## 4  Puma concolor 866498665    -89.43    20.31
## 5  Puma concolor 866508658   -105.04    19.47
## 6  Puma concolor 866523280   -118.24    34.06
## 7  Puma concolor 866526517   -104.45    29.92
## 8  Puma concolor 866530535   -118.30    34.07
## 9  Puma concolor 860790696    -77.35     2.77
## 10 Puma concolor        NA        NA       NA
## 11 Puma concolor        NA        NA       NA
## 12 Puma concolor        NA        NA       NA
## 13 Puma concolor 866525528   -123.83    40.13
## 14 Puma concolor 866531329   -123.83    40.13
## 15 Puma concolor 866519497   -118.90    34.54
## 16 Puma concolor 866601452   -122.52    38.45
## 17 Puma concolor 866547065   -110.30    41.88
## 18 Puma concolor 866562541   -123.83    40.13
## 19 Puma concolor 866562081   -123.82    40.13
## 20 Puma concolor 866558112   -103.13    29.65

In addition, get_ids, our omnibus function to search for taxon ids across all sources available in taxize, now accepts 'gbif' as an option for the db parameter.

New functions: rbind and cbind for classification

The classification function gives back taxonomic hierarchies from a variety of sources, including NCBI, ITIS, Catalogue of Life, Tropicos, EOL, and now GBIF. If you pass in many taxonomic IDs or taxon names, you get back a list of hierarchies. We added two functions to make it convenient to mash these outputs together: rbind for basically stacking hierarchies on top of one another, and cbind for making a width-wise combination of hierarchies. Our cbind doesn't do exactly what you're used to cbind doing for data.frames. The examples below are based on some code changed since the newest CRAN version, but you can install the development version with the changes from Github (see here for instructions).
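To make the two shapes concrete, here is a small base-R sketch on hand-written hierarchies. It only mimics the idea and is not the taxize code: the ids taxon_a and taxon_b are hypothetical, and the real functions work on classification objects rather than plain lists.

```r
# Hand-written hierarchies keyed by hypothetical ids
hier <- list(
  taxon_a = data.frame(name = c("Animalia", "Chordata", "Mammalia", "Carnivora"),
                       rank = c("kingdom", "phylum", "class", "order"),
                       stringsAsFactors = FALSE),
  taxon_b = data.frame(name = c("Animalia", "Chordata", "Aves"),
                       rank = c("kingdom", "phylum", "class"),
                       stringsAsFactors = FALSE)
)

# Length-wise: stack the hierarchies and tag each row with its id
long <- do.call(rbind, lapply(names(hier), function(id) {
  data.frame(taxonid = id, hier[[id]], stringsAsFactors = FALSE)
}))

# Width-wise: one row per taxon, one column per rank, NA where a rank is absent
all_ranks <- unique(unlist(lapply(hier, function(h) h$rank)))
wide <- t(sapply(hier, function(h) h$name[match(all_ranks, h$rank)]))
colnames(wide) <- all_ranks
wide <- as.data.frame(wide, stringsAsFactors = FALSE)

long
wide
```

The wide version is where this differs from a plain cbind on data.frames: columns are aligned by rank, so a taxon that lacks a rank gets NA instead of shifting its values sideways.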

From a call to get_ids, then passed on to classification, we get an object of class classification_ids

(out <- get_ids(names = "Puma concolor", db = c("ncbi", "gbif"), verbose = FALSE))

## $ncbi
## Puma concolor 
##        "9696" 
## attr(,"match")
## [1] "found"
## attr(,"class")
## [1] "uid"
## 
## $gbif
## Puma concolor 
##     "2435099" 
## attr(,"class")
## [1] "gbifid"
## 
## attr(,"class")
## [1] "ids"

(cl <- classification(out, verbose = FALSE))

## $ncbi
## $`9696`
##                    name         rank
## 1    cellular organisms      no rank
## 2             Eukaryota superkingdom
## 3          Opisthokonta      no rank
## 4               Metazoa      kingdom
## 5             Eumetazoa      no rank
## 6             Bilateria      no rank
## 7         Deuterostomia      no rank
## 8              Chordata       phylum
## 9              Craniata    subphylum
## 10           Vertebrata      no rank
## 11        Gnathostomata   superclass
## 12           Teleostomi      no rank
## 13         Euteleostomi      no rank
## 14        Sarcopterygii      no rank
## 15 Dipnotetrapodomorpha      no rank
## 16            Tetrapoda      no rank
## 17              Amniota      no rank
## 18             Mammalia        class
## 19               Theria      no rank
## 20             Eutheria      no rank
## 21        Boreoeutheria      no rank
## 22       Laurasiatheria   superorder
## 23            Carnivora        order
## 24           Feliformia     suborder
## 25              Felidae       family
## 26              Felinae    subfamily
## 27                 Puma        genus
## 28        Puma concolor      species
## 
## attr(,"class")
## [1] "classification"
## attr(,"db")
## [1] "ncbi"
## 
## $gbif
## $`2435099`
##            name    rank
## 1      Animalia kingdom
## 2      Chordata  phylum
## 3      Mammalia   clazz
## 4     Carnivora   order
## 5       Felidae  family
## 6          Puma   genus
## 7 Puma concolor species
## 
## attr(,"class")
## [1] "classification"
## 
## attr(,"class")
## [1] "classification_ids"

We can bind width-wise

cbind(cl)

##              no rank superkingdom  kingdom   phylum subphylum
## 1 cellular organisms    Eukaryota  Metazoa Chordata  Craniata
## 2               <NA>         <NA> Animalia Chordata      <NA>
##      superclass    class     superorder     order   suborder  family
## 1 Gnathostomata Mammalia Laurasiatheria Carnivora Feliformia Felidae
## 2          <NA>     <NA>           <NA> Carnivora       <NA> Felidae
##   subfamily genus       species    clazz
## 1   Felinae  Puma Puma concolor     <NA>
## 2      <NA>  Puma Puma concolor Mammalia

Or bind length-wise

rbind(cl)

##    source taxonid                 name         rank
## 1    ncbi    9696   cellular organisms      no rank
## 2    ncbi    9696            Eukaryota superkingdom
## 3    ncbi    9696         Opisthokonta      no rank
## 4    ncbi    9696              Metazoa      kingdom
## 5    ncbi    9696            Eumetazoa      no rank
## 6    ncbi    9696            Bilateria      no rank
## 7    ncbi    9696        Deuterostomia      no rank
## 8    ncbi    9696             Chordata       phylum
## 9    ncbi    9696             Craniata    subphylum
## 10   ncbi    9696           Vertebrata      no rank
## 11   ncbi    9696        Gnathostomata   superclass
## 12   ncbi    9696           Teleostomi      no rank
## 13   ncbi    9696         Euteleostomi      no rank
## 14   ncbi    9696        Sarcopterygii      no rank
## 15   ncbi    9696 Dipnotetrapodomorpha      no rank
## 16   ncbi    9696            Tetrapoda      no rank
## 17   ncbi    9696              Amniota      no rank
## 18   ncbi    9696             Mammalia        class
## 19   ncbi    9696               Theria      no rank
## 20   ncbi    9696             Eutheria      no rank
## 21   ncbi    9696        Boreoeutheria      no rank
## 22   ncbi    9696       Laurasiatheria   superorder
## 23   ncbi    9696            Carnivora        order
## 24   ncbi    9696           Feliformia     suborder
## 25   ncbi    9696              Felidae       family
## 26   ncbi    9696              Felinae    subfamily
## 27   ncbi    9696                 Puma        genus
## 28   ncbi    9696        Puma concolor      species
## 29   gbif 2435099             Animalia      kingdom
## 30   gbif 2435099             Chordata       phylum
## 31   gbif 2435099             Mammalia        clazz
## 32   gbif 2435099            Carnivora        order
## 33   gbif 2435099              Felidae       family
## 34   gbif 2435099                 Puma        genus
## 35   gbif 2435099        Puma concolor      species

Or we can do the same thing on an object of class classification, which we get back from a call to one of get_colid, get_tsn, get_eolid, get_tpsid, get_gbifid, or get_uid that is then passed on to classification

cl_col <- classification(get_colid(c("Puma concolor", "Accipiter striatus"), 
    verbose = FALSE))
rbind(cl_col)

##    source  taxonid            name    rank
## 1     col  6862841        Animalia Kingdom
## 2     col  6862841        Chordata  Phylum
## 3     col  6862841        Mammalia   Class
## 4     col  6862841       Carnivora   Order
## 5     col  6862841         Felidae  Family
## 6     col  6862841            Puma   Genus
## 7     col 11909487        Animalia Kingdom
## 8     col 11909487        Chordata  Phylum
## 9     col 11909487            Aves   Class
## 10    col 11909487 Accipitriformes   Order
## 11    col 11909487    Accipitridae  Family
## 12    col 11909487       Accipiter   Genus

cbind(cl_col)

##    kingdom   phylum    class           order       family     genus
## 1 Animalia Chordata Mammalia       Carnivora      Felidae      Puma
## 2 Animalia Chordata     Aves Accipitriformes Accipitridae Accipiter

Read more about changes in v0.2 at Github.
