Site icon R-bloggers

The rOpenSci geospatial suite

[This article was first published on rOpenSci Blog - R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Geospatial data – data embedded in a spatial context – is used across disciplines, whether it be history, biology, business, tech, public health, etc. Along with community contributors, we're working on a suite of tools to make working with spatial data in R as easy as possible.

If you're not familiar with geospatial tools, it's helpful to see what people do with them in the real world.

Example 1

One of our geospatial packages, geonames, is used for geocoding, the practice of either sorting out place names from geographic data, or vice versa. geonames interfaces with the open database of the same name: http://www.geonames.org/. A recent paper in PlosONE highlights a common use case. Harsch & HilleRisLambers1 asked how plant species distributions have shifted due to climate warming. They used the GNsrtm3() function in geonames, which uses Shuttle Radar Topography Mission elevation data, to fill in missing or incorrect elevation values in their dataset.

Example 2

Another of our packages, geojsonio, is used as a tool to ingest GeoJSON, or make GeoJSON from various inputs. geojsonio was used in Frankfurt's Open Data Hackathon in March 2016 in a project to present users with random Google Streetview Images of Frankfurt. Check out the repo at safferli/opendataday2016.


We covered the state of our geospatial tools in March of this year, but a lot has changed since then so we thought it would be useful to do an overview of these tools and future work.

There are many geospatial data formats, including shapefiles, GeoTIFF, netCDF, Well-known text/Well-known binary, GeoJSON, and many more. Readers may be more familiar with shape files than WKT or GeoJSON. There are R tools for shape files, so our tools largely don't concern themselves with shape files and other geospatial data formats. Two formats in particular that we create tools for are GeoJSON and WKT.

GeoJSON

With the explosion of Javascript/Node and web-first tools, and increasing dominance of JSON as a data format, GeoJSON as a spatial data format has seen increasing use. GeoJSON is a lightweight format based on JSON, and has a very new standard specification: RFC 7946. Many of our geospatial tools center around GeoJSON. Our goal with GeoJSON focused tools is to create a pipeline in which users can process GeoJSON data without any headaches due to dependencies.

WKT

Well-known text is a plain text format, just like GeoJSON (WKB is a binary form of WKT). It is often used in SQL databases to store geospatial data. Many of the data sources our R packages work with, for example http://www.gbif.org/ (see our package rgbif), use WKT to specify geospatial extent. Thus, rgbif shouldn't need to import an entire spatial stack that is hard for some to install only for dealing with a single spatial data format – and only some users will do geospatial queries wih WKT as you can constrain queries simply with country names, while others may not need to constrain spatially. We've been working on tools to make dealing with WKT more lightweight.


rOpenSci use cases

The motivation for our geospatial tools is supported in part by these use cases for software we make that use WKT and GeoJSON:


Tools

rOpenSci has a growing suite of database tools:

GeoJSON/WKT Tools

Data/Data Services




For each package below, there are 2-3 badges. One for whether the package is on CRAN or not (cran if on CRAN, cran if not), another for link to source on GitHub (github), and another when the package is community contributed (community).

geojson

cran github

We're excited to announce a new package geojson, which is now on CRAN. Check out the vignettes (geojson classes, geojson operations) to get started.

You can install the package from CRAN:

install.packages("geojson")

library("geojson")

The geojson package has functions for creating each of the GeoJSON classes from character strings of GeoJSON.

Internally, we perform some basic checks that the string is proper JSON, then if you want to lint the GeoJSON (see the linting_opts() function) we'll lint the GeoJSON as well using our geojsonlint package.

Make a Point

(x <- point('{ "type": "Point", "coordinates": [100.0, 0.0] }'))
#> <Point>
#>   coordinates:  [100,0]

In addition, you can perform some basic operations, such as adding (properties_add()) or getting properties (properties_get()), adding (crs_add()) or getting CRS (crs_get()), adding (bbox_add()) or getting (bbox_get()) a bounding box. You can calculate a bounding box on your GeoJSON with geo_bbox(), prettify your GeoJSON with geo_pretty(), and write your GeoJSON to disk with geo_write().

Add and get properties

(y <- linestring('{ "type": "LineString", "coordinates": [ [100.0, 0.0], [101.0, 1.0] ]}'))
#> <LineString>
#>   coordinates:  [[100,0],[101,1]]
(z <- y %>% feature() %>% properties_add(population = 1000))
#> {
#>     "type": "Feature",
#>     "properties": {
#>         "population": 1000
#>     },
#>     "geometry": {
#>         "type": "LineString",
#>         "coordinates": [
#>             [
#>                 100,
#>                 0
#>             ],
#>             [
#>                 101,
#>                 1
#>             ]
#>         ]
#>     }
#> }
properties_get(z, property = 'population')
#> 1000

Add bbox – without an input, we figure out the 2D bbox for you

x <- '{ "type": "Polygon",
"coordinates": [
  [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ]
  ]
}'
y <- polygon(x)
y %>% feature() %>% bbox_add()
#> {
#>     "type": "Feature",
#>     "properties": {
#>
#>     },
#>     "geometry": {
#>         "type": "Polygon",
#>         "coordinates": [
#>             [
#>                 [
#>                     100,
#>                     0
#>                 ],
#>                 [
#>                     101,
#>                     0
#>                 ],
#>                 [
#>                     101,
#>                     1
#>                 ],
#>                 [
#>                     100,
#>                     1
#>                 ],
#>                 [
#>                     100,
#>                     0
#>                 ]
#>             ]
#>         ]
#>     },
#>     "bbox": [
#>         100,
#>         0,
#>         101,
#>         1
#>     ]
#> }

Get the GeoJSON type

geo_type(y)
#> [1] "Polygon"

Pretty print the GeoJSON

geo_pretty(y)
#> {
#>     "type": "Polygon",
#>     "coordinates": [
#>         [
#>             [
#>                 100.0,
#>                 0.0
#>             ],
#>             [
#>                 101.0,
#>                 0.0
#>             ],
#>             [
#>                 101.0,
#>                 1.0
#>             ],
#>             [
#>                 100.0,
#>                 1.0
#>             ],
#>             [
#>                 100.0,
#>                 0.0
#>             ]
#>         ]
#>     ]
#> }
#>

Write to disk (and read back)

f <- tempfile(fileext = ".geojson")
geo_write(y, f)
jsonlite::fromJSON(f, FALSE)
#> $type
#> [1] "Polygon"
#>
#> $coordinates
#> $coordinates[[1]]
#> $coordinates[[1]][[1]]
#> $coordinates[[1]][[1]][[1]]
#> [1] 100
#>
#> $coordinates[[1]][[1]][[2]]
#> [1] 0
#>
#>
#> $coordinates[[1]][[2]]
#> $coordinates[[1]][[2]][[1]]
#> [1] 101
#>
#> $coordinates[[1]][[2]][[2]]
#> [1] 0
#>
#>
#> $coordinates[[1]][[3]]
#> $coordinates[[1]][[3]][[1]]
#> [1] 101
#>
#> $coordinates[[1]][[3]][[2]]
#> [1] 1
#>
#>
#> $coordinates[[1]][[4]]
#> $coordinates[[1]][[4]][[1]]
#> [1] 100
#>
#> $coordinates[[1]][[4]][[2]]
#> [1] 1
#>
#>
#> $coordinates[[1]][[5]]
#> $coordinates[[1]][[5]][[1]]
#> [1] 100
#>
#> $coordinates[[1]][[5]][[2]]
#> [1] 0

Lastly, the Mapbox folks have a compact binary encoding for geographic data (Geobuf) that provides lossless compression of GeoJSON data into protocol buffers. Our own Jeroen Ooms added Geobuf serialization to his protolite package, which we import in geojson to allow you to read Geobuf with from_geobuf() and write Geobuf with to_geobuf().

file <- system.file("examples/test.pb", package = "geojson")
from_geobuf(file, pretty = TRUE)
#> {
#>   "type": "FeatureCollection",
#>   "features": [
#>     {
#>       "type": "Feature",
#>       "geometry": {
#>         "type": "Point",
#>         "coordinates": [102, 0.5]
#>       },
#>       "id": 999,
...

to_geobuf(from_geobuf(file))
#>   [1] 0a 05 70 72 6f 70 30 0a 06 64 6f 75 62 6c 65 0a 0c 6e 65 67 61 74 69
#>  [24] 76 65 5f 69 6e 74 0a 0c 70 6f 73 69 74 69 76 65 5f 69 6e 74 0a 0f 6e
#>  [47] 65 67 61 74 69 76 65 5f 64 6f 75 62 6c 65 0a 0f 70 6f 73 69 74 69 76
#>  [70] 65 5f 64 6f 75 62 6c 65 0a 04 6e 75 6c 6c 0a 05 61 72 72 61 79 0a 06
#>  [93] 6f 62 6a 65 63 74 0a 06 62 6c 61 62 6c 61 0a 07 63 75 73 74 6f 6d 31
...

geoops

cran github

geoopsgeoops is not quite ready to use yet, but the goal with geoops is to provide spatial operations on GeoJSON that work with the geojson package. Example operations are:

Another feature of geoops we're excited about is slicing up GeoJSON easily by using our package jqr. It's similar in concept to using dplyr for drilling down into a data.frame, but instead we can do that with GeoJSON.

note: this package used to be called siftgeojson

geojsonio

cran github

geojsoniogeojsonio is a client for making it easy to convert lots of different things to GeoJSON, and for reading/writing GeoJSON.

We had a new version (v0.2) come out in July this year, with major performance improvements to geojson_json() – and we've deprecated GeoJSON linting functionality and now point people to our package geojsonlint for all GeoJSON linting tasks.

Example

A quick example of the power of geojsonio

install.packages("geojsonio")

library("geojsonio")

Convert a numeric vector to a GeoJSON Point:

geojson_json(c(32.45, -99.74))
#> {"type":"FeatureCollection","features":[{"type":"Feature","geometry":{"type":"Point","coordinates":[32.45,-99.74]},"properties":{}}]}

Read GeoJSON from a file with one simple command and plot it:

file <- system.file("examples", "california.geojson", package = "geojsonio")
out <- geojson_read(file, what = "sp")
library('sp')
plot(out)

geojsonlint

cran github

geojsonlintgeojsonlint is a client for linting GeoJSON. It provides three different ways to lint GeoJSON, using: the API at < geojsonlint.com>, the JS library geojsonhint, or the JS library is-my-json-valid. The package provides a consistent interface to the three different linters, always returning a boolean, and toggles provided for verbose output and whether to stop when invalid GeoJSON is found.

We released a new version (v0.2) this month that uses the newer version of the JS library geojsonhint, affecting the geojson_hint() function. Note that the dev version of geojsonlint has an even newer version of the JS geojsonhint library, so you may want to upgrade if you're using that linter: devtools::install_github("ropensci/geojsonlint").

Example

A quick example of the power of geojsonlint

install.packages("geojsonlint")

library("geojsonlint")

Good GeoJSON

geojson_hint(x = '{"type": "Point", "coordinates": [-100, 80]}')
#> [1] TRUE

Bad GeoJSON

geojson_hint('{ "type": "FeatureCollection" }')
#> [1] FALSE
geojson_hint('{ "type": "FeatureCollection" }', verbose = TRUE)
#> [1] FALSE
#> attr(,"errors")
#>   line                    message
#> 1    1 "features" member required
geojson_hint('{ "type": "FeatureCollection" }', error = TRUE)
#> Error: Line 1
#>        - "features" member required

lawn

cran github

lawnlawn is an R client wrapping Turf.js from Mapbox. Turf is a JS library for doing advanced geospatial analysis. Using the great V8 R client from Jeroen Ooms we can wrap Turf.js in R.

We released a new version (v0.3) late last month that is a big change from the previous version as we now wrap the newest version of Turf v3.5.2 that dropped a number of methods, and introduced new ones.

Example

A quick example of the power of lawn

install.packages("lawn")

library("lawn")

Calcuate distance (default: km) between two points

from <- '{
 "type": "Feature",
 "properties": {},
 "geometry": {
   "type": "Point",
   "coordinates": [-75.343, 39.984]
 }
}'
to <- '{
  "type": "Feature",
  "properties": {},
  "geometry": {
    "type": "Point",
    "coordinates": [-75.534, 39.123]
  }
}'
lawn_distance(from, to)
#> [1] 97.15958

Buffer a point (with distance of 5 km)

pt <- '{
 "type": "Feature",
 "properties": {},
 "geometry": {
    "type": "Point",
    "coordinates": [-90.548630, 14.616599]
  }
}'
lawn_buffer(pt, dist = 5)
#> <Feature>
#>   Type: Polygon
#>   Bounding box: -90.6 14.6 -90.5 14.7
#>   No. points: 66
#>   Properties: NULL

geonames

cran github community

geonamesgeonames is an R client for the < geonames.org> web service, that allows you to query for global geographic data such as administrative areas, populated places, and more.

Example

A quick example of the power of geonames

install.packages("geonames")

library("geonames")

Search for place names with by place name:

GNsearch(q = 'london', maxRows = 10)
#>    adminCode1       lng geonameId             toponymName countryId fcl
#> 1         ENG  -0.12574   2643743                  London   2635167   P
#> 2          08 -81.23304   6058560                  London   6251999   P
#> 3         ENG  -0.09184   2643741          City of London   2635167   P
#> 4          05  27.91162   1006984             East London    953987   P
...

Find the ISO country code for a given lat/long:

GNcountryCode(lat = 47.03, lng = 10.2)
#> $languages
#> [1] "de-AT,hr,hu,sl"
#>
#> $distance
#> [1] "0"
#>
#> $countryCode
#> [1] "AT"
#>
#> $countryName
#> [1] "Republic of Austria"

geoparser

cran github community

geoparsergeoparser is an interface to the Geoparser.io API for Identifying and Disambiguating Places Mentioned in Text

Example

A quick example of the power of geoparser

install.packages("geoparser")

library("geoparser")

In a very simple example, send a small string to get geoparsed:

output <- geoparser_q("I was born in Vannes and I live in Barcelona")
output$results
#> # A tibble: 2 × 11
#>   country confidence      name admin1                                           type geometry.type longitude latitude reference1 reference2
#> *   <chr>     <fctr>     <chr>  <chr>                                          <chr>        <fctr>     <dbl>    <dbl>      <dbl>      <dbl>
#> 1      FR          1    Vannes     A2 seat of a second-order administrative division         Point  -2.75000 47.66667         14         20
#> 2      ES          1 Barcelona     56  seat of a first-order administrative division         Point   2.15899 41.38879         35         44
#> # ... with 1 more variables: text_md5 <chr>
output$properties
#> # A tibble: 1 × 4
#>   apiVersion       source                    id                         text_md5
#> *     <fctr>       <fctr>                <fctr>                            <chr>
#> 1      0.4.1 geoparser.io 7Mp287nh6XbbH0QMojB6L 51e05aeb3366e55795a9729dd74ae901

The properties data.frame gives some metadata for your request, while the results data.frame gives results of the geoparsing, including the name found, the type of geospatial thing, geometry type, and coordinates.

rgeospatialquality

cran github community

rgeospatialqualityrgeospatialquality is an R client for the Geospatial Data Quality API that detects geospatial quality issues with geostpaial biodiversity occurrence data.

Example

A quick example of the power of rgeospatialquality

install.packages("rgeospatialquality")

library("rgeospatialquality")

Make a simple occurrence record:

rec <- list(
  decimalLatitude = 42.1833,
  decimalLongitude = -1.8332,
  countryCode = "ES",
  scientificName = "Puma concolor"
)

Pass the record to the API:

parse_record(record = rec)
#> $hasCoordinates
#> [1] TRUE
#>
#> $validCountry
#> [1] TRUE
#>
#> $validCoordinates
#> [1] TRUE
#>
#> $hasCountry
#> [1] TRUE
#>
#> $coordinatesInsideCountry
#> [1] TRUE
#>
#> $hasScientificName
#> [1] TRUE
#>
#> $highPrecisionCoordinates
#> [1] TRUE
#>
#> $coordinatesInsideRangeMap
#> [1] FALSE
#>
#> $nonZeroCoordinates
#> [1] TRUE
#>
#> $distanceToRangeMapInKm
#> [1] 6874.023

The results is a named list, with results for various aspects of geospatial quality, including whether the record has coordinates, whether the country is valid, whether the coordinates are valid, and more.

getlandsat

cran github

getlandsatgetlandsat provides access to Landsat https://landsat.usgs.gov 8 metadata and images hosted on AWS S3 at https://aws.amazon.com/public-data-sets/landsat. The package only fetches data. It does not attempt to aid users in downstream usage.

Example

A quick example of the power of getlandsat

install.packages("getlandsat")

library("getlandsat")

Get an image (see lsat_list(), lsat_scenes(), lsat_scene_files() to find/search for images):

lsat_image("LC80101172015002LGN00_B5.TIF")
#> [1] "/Users/sacmac/Library/Caches/landsat-pds/L8/010/117/LC80101172015002LGN00/LC80101172015002LGN00_B5.TIF"

Make a plot

library("raster")
x <- lsat_cache_details()[[1]]
img <- raster(x$file)
plot(img)

osmplotr

cran github community

osmplotrosmplotr produces customisable images of OpenStreetMap (OSM) data and enables data visualisation using OSM objects.

Example

A quick example of the power of osmplotr

install.packages("osmplotr")

library("osmplotr")
library("maptools")

Make a basic map:

bbox <- get_bbox(latlon = c(-0.13,51.50,-0.11,51.52))
dat_B <- extract_osm_objects(key = 'building', bbox = bbox)
map <- osm_basemap(bbox = bbox, bg = 'gray20')
map <- add_osm_objects(map, dat_B, col = 'gray40')
print_osm_map(map)

rnaturalearth

cran github community

rnaturalearthrnaturalearth facilitates world mapping by making Natural Earth (http://www.naturalearthdata.com/) map data available in R.

Example

A quick example of the power of rnaturalearth

devtools::install_github("ropenscilabs/rnaturalearth")

library("rnaturalearth")

Some data comes installed with rnaturalearth

sp::plot(ne_countries())

You can also download specific data:

rivers50 <- ne_download(scale = 50, type = 'rivers_lake_centerlines',
                        category = 'physical')
#> OGR data source with driver: ESRI Shapefile
#> Source: "/var/folders/gs/4khph0xs0436gmd2gdnwsg080000gn/T//Rtmp8yl7Kj", layer: "ne_50m_rivers_lake_centerlines"
#> with 460 features
#> It has 4 fields
sp::plot(rivers50)


Future/Ongoing work

We still have a lot that we'd like to do. Here's a run down of some of the items on our list:


Takeaway and Feedback

Our goal with our geospatial suite is to make your work — whether it be serious reproducible science, analysis of your company's data, or just fooling around with some data — as easy as possible with as few installation headaches as possible.

How are you using our geospatial tools? We'd love to hear about how you're using our packages, whether it be in blog posts, scholarly papers, shiny apps, business use cases, etc.

Let us know if you have any feedback on these packages, and/or if you think there's anything else we should be thinking about making in this space.


Community contributors

We're so so grateful to our community for their hard work on these packages:

< !-- links --> < !-- references -->

  1. Harsch, M. A., & HilleRisLambers, J. (2016). Climate Warming and Seasonal Precipitation Change Interact to Limit Species Distribution Shifts across Western North America. PLoS ONE, 11(7), e0159184. https://doi.org/10.1371/journal.pone.0159184 

To leave a comment for the author, please follow the link and comment on their blog: rOpenSci Blog - R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.