Spatial Data Structures: from R to PostGIS and back to R

March 10, 2014
By

(This article was first published on Something must break, and kindly contributed to R-bloggers)

Last friday I had a chat on twitter with ElCep and Joel G about the possibility to send R’s spatial data structures to a PostGIS enabled database and retrieve them back to R.
Just right now, R can be used as a powerful GIS tool and the developement of spatial-oriented packages proceeds at a steady state .
On the other hand PostGIS is an unvaluable resource for storing and managing godzillions of vector and raster bits.

Coupling these two tools is like having a swiss knife for doing both statistics and GIS analyses when your datastests become bigger and bigger [1].

OK… after this cheesy introduction, let’s start from the beginning and suppose we have just a point location called ‘home’ in x= 1247289, y=5805658. We would like to generate a buffer of 1 km around the point using PostGIS and then get the resulting polygon back in R

Although the buffering operation can be done both completely in R and in PostGIS, this approach shows the trick to interoperate between the two software using the Well Known Text encoding of geometries, which is an OGC standard [2]. WKT is the “lengua franca” between R and PostGIS.

So, let’s get our hand dirty on it: first we push the point into R.

point = data.frame(x= 1247289, y= 5805658, name= 'home')

With the R package sp [3] we transform it in a spatialPointsDataFrame

require(sp)
coordinates(point) = c("lon","lat")
proj4string(point) = CRS("+init=epsg:3857")

With these operation we just trasform a dataframe in something different, a spatial structure, with a defined position on the earth surface using a Spherical Mercator Coordinates Reference System (CRS).
Now, to use this point inside PostGIS, we have to push it towards the database by the WKT encoding: to do so we’ll use the functions in the rgeos package [4].
If you look at the documentation rgeos is a powerful GIS tool allowing calculations of buffers, intersections… Nothing surprising  here: the topological library GEOS [5], on which rgeos is built on, is one of the pillars for the PostGIS engine and for almost every other FOSS GIS.

require(rgeos)
wktPoint = writeWKT(point)

So now we have our point geometry ready for PostGIS. The trick is to use the ST_GeomFromText PostGIS’ function that takes a WKT
as input and transform it in the internal encoding of the RDBMS. By the way, the encoding of our point in the DBMS is “0101000020110F000000000000390833410000008096255641”, and so we just build the query as:

query = paste0("SELECT ST_Buffer(ST_GeomFromText('", wktPoint ,"', 3857),1000)")

Nothing fancy in this spatial query: it just takes our point and builds a 1 km buffer around it. Pay attention to the double and single quotes: the WKT is a string of text and so it must be single quoted in the SQL.

If you try to fire the query from inside R (we will se how in a minute how to do it) you will get an unespected result:

RS-DBI driver warning: (unrecognized PostgreSQL field type geometry (id:16432) in column 0)

This puzzling message stands for: R does not understand the internal binary encoding of geometries of PostGIS and takes the result like a very long string of characters (“0103000020110F0000010000002100000…”). To bypass this problem, the resulting polygon (the buffer) must be transfored in a WKT wrapping the ST_AsText function around the ST_Buffer function. The query becomes:

query = paste0("SELECT ST_AsText(ST_Buffer(ST_GeomFromText('", wktPoint ,"', 3857),1000)) AS polygon")

To connect R to PostGIS we use the RPostgreSQl package [6] and customize the connection for your PostGIS enabled database:

require(RPostgreSQL)
connection = dbConnect(PostgreSQL(), dbname=myDB, host=myHost, user=myUser, password=my****)
buffer = dbGetQuery(connection, query)
dbDisconnect(connection)

Now we have a polygon correctly encoded in WKT. We can use the rgeos’ function readWKT to tranform it directly in a SpatialPolygons structure:

poly=readWKT(buffer)

Since WKT deos not include a spatial reference along with the geometry, we can add the Coordinate System using

proj4string(poly) = CRS("+init=epsg:3857")

…And that’s all, for now.
As said before, an operation like this could not be the best example of an optimized GIS analysis, but PostGIS + R could be another pretty tool in the pocket to work on spatial data.

To leave a comment for the author, please follow the link and comment on their blog: Something must break.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)