# Three ways to calculate distances in R

**Bluecology blog**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

# Three ways to calculate distances in R

Calculating a distance on a map sounds straightforward, but it can be

confusing how many different ways there are to do this in R.

This complexity arises because there are different ways of defining

‘distance’ on the Earth’s surface.

The Earth is spherical. So do you want to calculate distances around the

sphere (‘great circle distances’) or distances on a map (‘Euclidean

distances’).

Then there are barriers. For example, for distances in the ocean, we

often want to know the nearest distance around islands.

Then there is the added complexity of the different spatial data types.

Here we will just look at points, but these same concepts apply to other

data types, like shapes.

## Example data

Let’s look at some example data. It is just a series of points across

the island of Tasmania. We are going to calculate how far apart these

points are from each other.

We’ll use `sf`

for spatial data and `tmap`

for mapping.

Here’s a map:

tm_shape(stas) + tm_polygons() + tm_graticules(col = "grey60") + tm_shape(pts) + tm_symbols(col = "black") + tm_scale_bar(position = c("left", "bottom")) + tm_shape(pts) + tm_text("pt", ymod = -1)

Note I’ve included a scale bar, but of course the distance between

longitude lines gets closer at higher latitudes.

## Great circle distances

The first method is to calculate great circle distances, that account

for the curvature of the earth. If we use `st_distance()`

with

unprojected coordinates (ie in lon-lat) then we get great circle

distances (in metres).

m <- st_distance(pts) m/1000 ## Units: [m] ## [,1] [,2] [,3] ## [1,] 0.000 821.5470 1200.7406 ## [2,] 821.547 0.0000 419.5004 ## [3,] 1200.741 419.5004 0.0000

The matrix m gives the distances between points (we divided by 1000 to

get distances in KM).

## Euclidean distances

Another option is to first project the points to a projection that

preserves distances and then calculate the distances. This option is

computationally faster, but can be less accurate, as we will see.

We will use the local UTM projection. So you can see what this looks

like, we will project the land too.

tas_utm <- st_crs("+proj=utm +zone=55 +datum=WGS84 +units=m +no_defs") stas2 <- st_transform(stas, crs = tas_utm) pts2 <- st_transform(pts, crs = tas_utm) tm_shape(stas2) + tm_polygons() + tm_graticules(col = "grey60") + tm_shape(pts2) + tm_symbols(col = "black") + tm_scale_bar(position = c("left", "bottom")) + tm_shape(pts) + tm_text("pt", ymod = -1)

Note how it now bends the lat/long lines. This happens because we are

projecting a sphere onto a flat surface. The UTM will be most accurate

at the centre of its zone (we used Zone 55 which is approximately

centred on Tasmania).

If we were interested in mapping the mainland of Australia accurately,

we’d use a different UTM zone.

Now we can calculate Euclidean distances:

m2 <- st_distance(pts2) m2/1000 ## Units: [m] ## [,1] [,2] [,3] ## [1,] 0.0000 824.8996 1203.6228 ## [2,] 824.8996 0.0000 419.4163 ## [3,] 1203.6228 419.4163 0.0000

Compare these to our great circle distances:

m/1000 ## Units: [m] ## [,1] [,2] [,3] ## [1,] 0.000 821.5470 1200.7406 ## [2,] 821.547 0.0000 419.5004 ## [3,] 1200.741 419.5004 0.0000

Note the slight differences, particularly between point 1 and the other

points. The first method (great circle) is the more accurate one, but is

also a bit slower. The Euclidean distances become a bit inaccurate for

point 1, because it is so far outside the zone of the UTM projection.

Points 2 & 3 are within the UTM zone, so the distance between these

points is almost identical to the great circle calculation.

## Distances around a barrier

The basic idea here is that we turn the data into a raster grid and then

use the `gridDistance()`

function to calculate distances around barriers

(land) between points.

So first we need to rasterize the land. The package `fasterize`

has a

fast way to turn sf polygons into land:

library(fasterize) library(raster) library(dplyr) r <- raster(extent(stas2), nrows = 50, ncols = 50) rtas <- fasterize(summarize(stas2), r)

I made the raster pretty blocky (50 x 50). You could increase the

resolution to improve the accuracy of the distance measurements. Here’s

how it looks:

Now we need to identify the raster cell’s where the points fall. We do

this by extracting coordinates from `pts2`

and asking for their unique

raster cell numbers:

rtas_pts <- rtas xy <- st_coordinates(pts2) icell <- cellFromXY(rtas, xy)

Now, we set the cells of our raster corresponding to the points to a

different number than the rest. I will just use the 3rd point (if we

used all points then we get nearest distance around barriers to any

point).

rtas_pts[icell[3]] <- 2

This will look like the same raster, but with a spot where the 3rd point

fell (note red box):

Now just run `gridDistance`

telling it to calculate distances from the

cells with a value of `2`

(just one cell in this case) and omit values

of `1`

(land) when doing the distances:

d <- gridDistance(rtas_pts, origin = 2, omit = 1)/1000

This will be slow for larger rasters (or very high res). Let’s see how

it looks:

Colours correspond to distances from point 3 (the location we gave a value of ‘2’ to in the raster).

Now we can just ask for the distance values at the cells of the other

points:

d[icell] ## [1] 1310.5141 612.1404 0.0000

So 612 km around Tasmania from point 3 to 2, as the dolphin swims. It

was only 419 km if we could fly straight over Tasmania:

m[2,3]/1000 ## 419.5004 [m]

(note is says metres, but that is because R hasn’t remembered we’ve

divided by 1000)

**leave a comment**for the author, please follow the link and comment on their blog:

**Bluecology blog**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.