taxize changes

[This article was first published on rOpenSci Blog - R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

We are building a taxonomic toolbelt for R called taxize – which gives you programmatic access to many sources of taxonomic data on the web. We just pushed a new version to CRAN (v0.1.5) with a lot of changes (see here for a rundown). Here are a few highlights of the changes.

Note: the windows binary may not be available yet…


Install and load taxize

install.packages<span class="p">(</span><span class="s">"taxize"</span><span class="p">)</span>
library<span class="p">(</span>taxize<span class="p">)</span>

Taxonomic identifiers

Each taxonomic service has their own unique ID for a taxon. We had a way to get ITIS and NCBI identifiers before – we now have functions for Tropicos, EOL, and the Catalogue of Life.

get_tpsid<span class="p">(</span>sciname <span class="o">=</span> <span class="s">"Helianthus petiolaris patens"</span><span class="p">)</span>
##        1 
## 50140022 
## attr(,"class")
## [1] "tpsid"
get_colid<span class="p">(</span>sciname <span class="o">=</span> <span class="s">"Pinus contorta"</span><span class="p">)</span>
##          1 
## "14376216" 
## attr(,"class")
## [1] "colid"

And we have a new function to get identifiers across all sources. And it's vectorized so you can pass in many names.

get_ids<span class="p">(</span>names <span class="o">=</span> c<span class="p">(</span><span class="s">"Chironomus riparius"</span><span class="p">,</span> <span class="s">"Poa annua"</span><span class="p">),</span> db <span class="o">=</span> c<span class="p">(</span><span class="s">"ncbi"</span><span class="p">,</span> <span class="s">"itis"</span><span class="p">))</span>
## $ncbi
## Chironomus riparius           Poa annua 
##            "315576"             "93036" 
## attr(,"match")
## [1] "found" "found"
## attr(,"class")
## [1] "uid"
## 
## $itis
## Chironomus riparius           Poa annua 
##            "129313"             "41107" 
## attr(,"match")
## [1] "found" "found"
## attr(,"class")
## [1] "tsn"
## 
## attr(,"class")
## [1] "ids"

Taxonomic classifications

We had a function called classification to get a taxonomic hierarchy from a particular source, ITIS or NCBI. We now have functions for Tropicos, EOL, and the Catalogue of Life. In addition, we have a function to get classifications across sources.

out <span class="o"><-</span> get_ids<span class="p">(</span>names <span class="o">=</span> <span class="s">"Poa annua"</span><span class="p">,</span> db <span class="o">=</span> c<span class="p">(</span><span class="s">"ncbi"</span><span class="p">,</span> <span class="s">"itis"</span><span class="p">))</span>
classification<span class="p">(</span>out<span class="p">)</span>
## $ncbi
## $ncbi$`Poa annua`
##        ScientificName         Rank    UID
## 1  cellular organisms      no rank 131567
## 2           Eukaryota superkingdom   2759
## 3       Viridiplantae      kingdom  33090
## 4        Streptophyta       phylum  35493
## 5      Streptophytina      no rank 131221
## 6         Embryophyta      no rank   3193
## 7        Tracheophyta      no rank  58023
## 8       Euphyllophyta      no rank  78536
## 9       Spermatophyta      no rank  58024
## 10      Magnoliophyta      no rank   3398
## 11         Liliopsida        class   4447
## 12        commelinids     subclass   4734
## 13             Poales        order  38820
## 14            Poaceae       family   4479
## 15          BEP clade      no rank 359160
## 16           Pooideae    subfamily 147368
## 17              Poeae        tribe 147387
## 18             Poinae     subtribe 640628
## 19                Poa        genus   4544
## 20          Poa annua      species  93036
## 
## 
## $itis
## $itis$`Poa annua`
##         rankName       taxonName    tsn
## 1        Kingdom         Plantae 202422
## 2     Subkingdom  Viridaeplantae 846492
## 3   Infrakingdom    Streptophyta 846494
## 4       Division    Tracheophyta 846496
## 5    Subdivision Spermatophytina 846504
## 6  Infradivision    Angiospermae 846505
## 7          Class   Magnoliopsida  18063
## 8     Superorder        Lilianae 846542
## 9          Order          Poales 846620
## 10        Family         Poaceae  40351
## 11         Genus             Poa  41074
## 12       Species       Poa annua  41107

Synonyms

Just like the classification function, synonyms now searches across sources for synonym names. However, this function only has methods for ITIS and Tropicos.

synonyms<span class="p">(</span>c<span class="p">(</span><span class="s">"Poa annua"</span><span class="p">,</span> <span class="s">"Pinus contorta"</span><span class="p">,</span> <span class="s">"Puma concolor"</span><span class="p">),</span> db <span class="o">=</span> <span class="s">"itis"</span><span class="p">)</span>
## $`Poa annua`
##                           name    tsn
## 1      Poa annua var. aquatica 538978
## 2       Poa annua var. reptans 538979
## 3                  Aira pumila 785854
## 4             Catabrosa pumila 787993
## 5               Ochlopoa annua 791574
## 6               Poa aestivalis 793946
## 7                   Poa algida 793954
## 8         Poa annua var. annua 802116
## 9     Poa annua var. eriolepis 802117
## 10 Poa annua var. rigidiuscula 802119
## 11        Poa annua f. reptans 803667
## 
## $`Pinus contorta`
##      name    tsn
## 1 nomatch 183327
## 
## $`Puma concolor`
##             name    tsn
## 1 Felis concolor 180587

Common names from scientific names and vice versa

We have a variety of ways to get common names from scientific names for different sources. To simply things, there are two new functions to work across sources.

sci2comm<span class="p">(</span>scinames <span class="o">=</span> <span class="s">"Helianthus annuus"</span><span class="p">,</span> db <span class="o">=</span> <span class="s">"itis"</span><span class="p">)</span>
## $`Helianthus annuus`
##            comname    lang   tsn
## 1 common sunflower English 36616
## 2        sunflower English 36616
## 3   wild sunflower English 36616
## 4 annual sunflower English 36616

And vice versa

comm2sci<span class="p">(</span>commnames <span class="o">=</span> <span class="s">"annual blue grass"</span><span class="p">,</span> db <span class="o">=</span> <span class="s">"tropicos"</span><span class="p">)</span>
## $`annual blue grass`
##     NameId ScientificName ScientificNameWithAuthors  Family
## 1 25509881      Poa annua              Poa annua L. Poaceae
##   RankAbbreviation NomenclatureStatusID NomenclatureStatusName Symbol
## 1              sp.                    1             Legitimate      !
##   Author DisplayReference DisplayDate TotalRows
## 1     L.    Sp. Pl. 1: 68        1753         1

ITIS functions

We reworked the functions that interact with the ITIS service. We still have all the same functions, but there are now a number of higher level functions that make the interface to ITIS simpler. Here are some examples:

Get references associated with names.

itis_refs<span class="p">(</span>c<span class="p">(</span><span class="m">202385</span><span class="p">,</span> <span class="m">70340</span><span class="p">))</span>
## [[1]]
##   actualPubDate          isbn issn listedPubDate        pages
## 1    2005-10-01 0-8018-8221-4         2005-10-01         2142
## 2    1993-01-01 1-56098-217-9         1992-01-01 xviii + 1207
## 3    1981-04-03                       1981-04-03         1181
##                                      pubComment
## 1                                              
## 2 Corrections were made to text at 3rd printing
## 3                                              
##                                                                                    pubName
## 1  Mammal Species of the World: A Taxonomic and Geographic Reference, 3rd ed., vols. 1 & 2
## 2 Mammal Species of the World: A Taxonomic and Geographic Reference, 2nd ed., 3rd printing
## 3                                    The Mammals of North America, vols. I and II, 2nd ed.
##                   pubPlace                      publisher
## 1 Baltimore, Maryland, USA Johns Hopkins University Press
## 2      Washington, DC, USA  Smithsonian Institution Press
## 3  New York, New York, USA      John Wiley and Sons, Inc.
##                              referenceAuthor                    name
## 1 Wilson, Don E., and DeeAnn M. Reeder, eds. Ursus arctos horribilis
## 2 Wilson, Don E., and DeeAnn M. Reeder, eds. Ursus arctos horribilis
## 3                           Hall, E. Raymond Ursus arctos horribilis
##   refLanguage referredTsn title updateDate
## 1                  202385       2013-11-04
## 2                  202385       2013-11-04
## 3                  202385       2013-11-04
## 
## [[2]]
##   actualPubDate          isbn      issn listedPubDate     pages pubComment
## 1    1988-01-01 0-913235-47-4 0097-0638    1988-01-01 vii + 277           
## 2    1998-01-01 1-888569-01-8 0097-0638    1998-01-01       526           
##                                             pubName
## 1 American Fisheries Society Special Publication 16
## 2 American Fisheries Society Special Publication 26
##                  pubPlace                  publisher
## 1 Bethesda, Maryland, USA American Fisheries Society
## 2 Bethesda, Maryland, USA American Fisheries Society
##                                                                                  referenceAuthor
## 1          Turgeon, D. D., A. E. Bogan, E. V. Coan, W. K. Emerson, W. G. Lyons, W. Pratt, et al.
## 2 Turgeon, D. D., J. F. Quinn, Jr., A. E. Bogan, E. V. Coan, F. G. Hochberg, W. G. Lyons, et al.
##                 name refLanguage referredTsn
## 1 Tulotoma magnifica                   70340
## 2 Tulotoma magnifica                   70340
##                                                                                                       title
## 1          Common and scientific names of aquatic invertebrates from the United States and Canada: mollusks
## 2 Common and scientific names of aquatic invertebrates from the United States and Canada: Mollusks, 2nd ed.
##   updateDate
## 1 2013-04-26
## 2 2013-11-04

Get hierarchy: full, up (immediate), and down (immediate)

itis_hierarchy<span class="p">(</span>tsn <span class="o">=</span> <span class="m">180543</span><span class="p">)</span>
##       parentName parentTsn     rankName                 taxonName    tsn
## 1                               Kingdom                  Animalia 202423
## 2       Animalia    202423   Subkingdom                 Bilateria 914154
## 3      Bilateria    914154 Infrakingdom             Deuterostomia 914156
## 4  Deuterostomia    914156       Phylum                  Chordata 158852
## 5       Chordata    158852    Subphylum                Vertebrata 331030
## 6     Vertebrata    331030  Infraphylum             Gnathostomata 914179
## 7  Gnathostomata    914179   Superclass                 Tetrapoda 914181
## 8      Tetrapoda    914181        Class                  Mammalia 179913
## 9       Mammalia    179913     Subclass                    Theria 179916
## 10        Theria    179916   Infraclass                  Eutheria 179925
## 11      Eutheria    179925        Order                 Carnivora 180539
## 12     Carnivora    180539     Suborder                Caniformia 552303
## 13    Caniformia    552303       Family                   Ursidae 180540
## 14       Ursidae    180540        Genus                     Ursus 180541
## 15         Ursus    180541      Species              Ursus arctos 180543
## 16  Ursus arctos    180543   Subspecies   Ursus arctos alascensis 726985
## 17  Ursus arctos    180543   Subspecies       Ursus arctos arctos 202383
## 18  Ursus arctos    180543   Subspecies  Ursus arctos beringianus 726986
## 19  Ursus arctos    180543   Subspecies Ursus arctos californicus 726987
## 20  Ursus arctos    180543   Subspecies     Ursus arctos collaris 726988
## 21  Ursus arctos    180543   Subspecies    Ursus arctos crowtheri 726989
## 22  Ursus arctos    180543   Subspecies        Ursus arctos dalli 726990
## 23  Ursus arctos    180543   Subspecies         Ursus arctos gyas 726991
## 24  Ursus arctos    180543   Subspecies   Ursus arctos horribilis 202385
## 25  Ursus arctos    180543   Subspecies  Ursus arctos isabellinus 622068
## 26  Ursus arctos    180543   Subspecies     Ursus arctos lasiotus 726992
## 27  Ursus arctos    180543   Subspecies Ursus arctos middendorffi 726993
## 28  Ursus arctos    180543   Subspecies    Ursus arctos pruinosus 202386
## 29  Ursus arctos    180543   Subspecies    Ursus arctos sitkensis 726994
## 30  Ursus arctos    180543   Subspecies Ursus arctos stikeenensis 726995
## 31  Ursus arctos    180543   Subspecies     Ursus arctos syriacus 726996
itis_hierarchy<span class="p">(</span>tsn <span class="o">=</span> <span class="m">180543</span><span class="p">,</span> what <span class="o">=</span> <span class="s">"up"</span><span class="p">)</span>
##   parentName parentTsn rankName    taxonName    tsn
## 1      Ursus    180541  Species Ursus arctos 180543
itis_hierarchy<span class="p">(</span>tsn <span class="o">=</span> <span class="m">180543</span><span class="p">,</span> what <span class="o">=</span> <span class="s">"down"</span><span class="p">)</span>
##      parentName parentTsn   rankName                 taxonName    tsn
## 1  Ursus arctos    180543 Subspecies       Ursus arctos arctos 202383
## 2  Ursus arctos    180543 Subspecies   Ursus arctos horribilis 202385
## 3  Ursus arctos    180543 Subspecies    Ursus arctos pruinosus 202386
## 4  Ursus arctos    180543 Subspecies  Ursus arctos isabellinus 622068
## 5  Ursus arctos    180543 Subspecies   Ursus arctos alascensis 726985
## 6  Ursus arctos    180543 Subspecies  Ursus arctos beringianus 726986
## 7  Ursus arctos    180543 Subspecies Ursus arctos californicus 726987
## 8  Ursus arctos    180543 Subspecies     Ursus arctos collaris 726988
## 9  Ursus arctos    180543 Subspecies    Ursus arctos crowtheri 726989
## 10 Ursus arctos    180543 Subspecies        Ursus arctos dalli 726990
## 11 Ursus arctos    180543 Subspecies         Ursus arctos gyas 726991
## 12 Ursus arctos    180543 Subspecies     Ursus arctos lasiotus 726992
## 13 Ursus arctos    180543 Subspecies Ursus arctos middendorffi 726993
## 14 Ursus arctos    180543 Subspecies    Ursus arctos sitkensis 726994
## 15 Ursus arctos    180543 Subspecies Ursus arctos stikeenensis 726995
## 16 Ursus arctos    180543 Subspecies     Ursus arctos syriacus 726996

To leave a comment for the author, please follow the link and comment on their blog: rOpenSci Blog - R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)