(This article was first published on **R – scottishsnow**, and kindly contributed to R-bloggers)

Last year I spent a bit of time learning about routing and network analysis. I created a map of distance from each GB postcode to the nearest railway station. At the time my local council were also looking at changing school catchments and building new schools. This also seemed like an excellent project for routing analysis. I made some progress on this, but never wrote it up.

One aspect I was really interested in was common routes. When you create a shortest path network from many points to a single destination you get a lot of lines which converge on the same place. Can we use this for anything else? Can we count how many lines share the same pathways? This would be particularly useful for prioritising investment in safe routes to school, or walking and cycling infrastructure.
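Counting shared pathways can be sketched in a few lines of base R. This is a toy illustration only, not the GRASS workflow used below: the node names, the `next_hop` lookup and the `route_edges()` helper are all made up for the example, and the shortest-path tree is assumed to have been worked out already.

```r
# Toy shortest-path tree (hypothetical data): each node stores the next
# node on its shortest route to the destination "D".
next_hop <- c(A = "C", B = "C", C = "D", E = "D")

# Walk from one origin to the destination, returning the edges used.
route_edges <- function(origin, next_hop, destination = "D") {
  edges <- character(0)
  node <- origin
  while (node != destination) {
    step <- next_hop[[node]]
    edges <- c(edges, paste(node, step, sep = "->"))
    node <- step
  }
  edges
}

# Collect the edges of every route, then tally how often each edge appears.
origins <- c("A", "B", "E")
all_edges <- unlist(lapply(origins, route_edges, next_hop = next_hop))

# Number of routes sharing each edge, busiest first.
edge_use <- sort(table(all_edges), decreasing = TRUE)
print(edge_use)
```

Edges with the highest counts are the stretches where many routes converge, which is the kind of ranking that could feed into prioritising infrastructure.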

I was spurred to look at this analysis again by the tweet below:

Have to question the merits of people coming from the west driving across the city centre to get to St James’s Ctr.

— Cllr Scott Arthur (@ProfScottThinks) October 6, 2017

For background, the St James’ Centre in Edinburgh is being demolished and a new shopping centre built to replace it. There’s been a lot of (justified) complaint surrounding the new road layout at Picardy Place, because it prioritises motorised traffic over more efficient and effective ways of transporting people in cities (foot, bike, bus and tram). On top of this, the new St James’ Centre will have a greatly increased number of spaces for parked cars (1800, up from 550, with 300 for residents).

Scott’s tweet made me think about the catchment area of the St James’ Centre. How far away are different parts of Edinburgh? Which roads would be most travelled to get there? If one increases the number of parking spaces at a location, surely the number of cars travelling to that location will increase. What will the impact on the road network be?

There are some great resources to help us with this. The data source I’m going to use is Ordnance Survey OpenData. In particular we need OS Open Roads and Code-Point Open (postcodes). The software I’ll use for this task is QGIS, GRASS GIS and R. These are open source and require a little practice to use, but the effort pays off!

After reading data into GRASS I calculated the distance network from all postcodes to the St James’ Centre:

```sh
# connect postcodes to streets as layer 2
v.net --overwrite input=roads points=postcodes output=roads_net1 operation=connect thresh=400 arc_layer=1 node_layer=2

# connect st james to streets as layer 3
v.net --overwrite input=roads_net1 points=stjames output=roads_net2 operation=connect thresh=400 arc_layer=1 node_layer=3

# shortest paths from postcodes (points in layer 2) to st james (points in layer 3)
v.net.distance in=roads_net2 out=pc_2_stjames flayer=2 to_layer=3

# Join postcode and distance tables
v.db.join map=postcodes column=cat other_table=pc_2_stjames other_column=cat

# Make a km column
v.db.addcolumn map=postcodes columns="dist_km double precision"
v.db.update map=postcodes column=dist_km qcol="dist/1000"

# Write to shp (for mapping in QGIS)
v.out.ogr -s input=postcodes output=pc_2_stjames format=ESRI_Shapefile output_layer=pc_2_station
```
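To make the idea behind the shortest-path step concrete, here is a small base R sketch of Dijkstra's algorithm on a made-up network. It is a conceptual illustration only and says nothing about how GRASS implements `v.net.distance`; the `roads` table, its node names and the `shortest_dist()` function are hypothetical.

```r
# Hypothetical road segments: from, to, length in metres (undirected).
roads <- data.frame(
  from = c("P1", "P2", "J1", "J1", "J2"),
  to   = c("J1", "J1", "J2", "SJ", "SJ"),
  len  = c(300, 500, 200, 900, 400),
  stringsAsFactors = FALSE
)

# Dijkstra's algorithm run from a single destination. On an undirected
# network this gives every node's shortest distance to that destination.
shortest_dist <- function(roads, destination) {
  nodes <- unique(c(roads$from, roads$to))
  dist <- setNames(rep(Inf, length(nodes)), nodes)
  dist[destination] <- 0
  unvisited <- nodes
  while (length(unvisited) > 0) {
    # Settle the closest node that has not been visited yet.
    current <- unvisited[which.min(dist[unvisited])]
    unvisited <- setdiff(unvisited, current)
    # Segments touching the current node, in either direction.
    touching <- roads[roads$from == current | roads$to == current, ]
    for (i in seq_len(nrow(touching))) {
      neighbour <- if (touching$from[i] == current) touching$to[i] else touching$from[i]
      candidate <- dist[[current]] + touching$len[i]
      if (candidate < dist[[neighbour]]) dist[neighbour] <- candidate
    }
  }
  dist
}

dist_m  <- shortest_dist(roads, "SJ")  # metres
dist_km <- dist_m / 1000               # same conversion as the dist_km column
print(dist_km)
```

In this toy network the direct J1 to SJ segment (900 m) loses out to the route via J2 (200 m + 400 m), so both P1 and P2 are routed that way.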

We can see the results of this below:

We can also zoom this map on the area inside the bypass and adjust the colouring to give us more resolution for shorter distances.

Now we can use R to investigate how many postcodes are within these distances. R can talk directly to GRASS, or we can use the exported shapefile.

```r
library(rgdal)

# Read shp file
# Note OGR wants the full path to the file, you can't use '~'
postcodes = readOGR("/home/me/Projects/network/data/pc_2_stjames.shp")

hist(postcodes$dist_km, breaks=50)

# Rank the distances
x = sort(postcodes$dist_km)

png("~/Cloud/Michael/Projects/network/figures/stjames_postcode-distance.png", height=600, width=800)
par(cex=1.5)
plot(x, type="l",
     main="EH postcode: shortest road distances to St James centre",
     xlab="Number of postcodes",
     ylab="Distance (km)",
     lwd=3)
points(max(which(x<2)), 2, pch=19, cex=2, col="purple4")
points(max(which(x<5)), 5, pch=19, cex=2, col="darkorange")
legend("topleft",
       c(paste(max(which(x<2)), "postcodes within 2 km (walking distance)"),
         paste(max(which(x<5)), "postcodes within 5 km (cycling distance)")),
       col=c("purple4", "darkorange"),
       pch=19,
       pt.cex=2)
dev.off()

# Get percentages
round(100 * max(which(x<2)) / length(x))
round(100 * (max(which(x<5)) - max(which(x<2))) / length(x))
```
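An alternative way to get the same counts and shares, sketched here on made-up distances rather than the real `postcodes$dist_km` column, is to count with `sum()` and bin with `cut()`. The `dist_km` vector and the band labels are assumptions for illustration.

```r
# Hypothetical distances (km) standing in for postcodes$dist_km.
dist_km <- c(0.4, 1.2, 1.9, 2.0, 3.5, 4.9, 5.0, 7.3, 12.1, 20.6)

# Counts below each threshold. sum() of a logical vector counts the TRUEs
# and returns 0 when nothing qualifies, and it needs no prior sorting.
n_walk  <- sum(dist_km < 2)
n_cycle <- sum(dist_km < 5)

# Share of postcodes in each band. right = FALSE makes the bands
# [0, 2), [2, 5) and [5, Inf), matching the strict "<" used above.
bands <- cut(dist_km, breaks = c(0, 2, 5, Inf), right = FALSE,
             labels = c("walk (<2 km)", "cycle (2-5 km)", "other (>=5 km)"))
band_pct <- round(100 * prop.table(table(bands)))
print(band_pct)
```

On a sorted vector `max(which(x < 2))` and `sum(x < 2)` give the same number whenever at least one value is below the threshold; the `sum()` form simply avoids the `-Inf` and warning that `max(which(...))` produces when none is.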

As we can see, 1868 postcodes are within walking distance of the St James’ Centre and 7044 within cycling distance. If we

- Ignore the other shopping centres around Edinburgh,
- Assume that population density is consistent between postcodes,
- Take the complete EH postcode locations as the catchment area,

then 8 % of total customers are within walking distance (2 km), a further 22 % are within cycling distance (2-5 km) and the remaining 70 % would need to travel by other means. I wonder if there will be 330 cycle parking spaces at the St James’ Centre? That would be in proportion to the number of car parking spaces. The comparison assumes that everyone beyond cycling distance needs to drive, although much of the population will be near a bus stop.
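The post does not show the working for the 330 figure. One reading that reproduces it, offered here as an assumption, is 22 % of the car parking spaces left after setting aside the 300 for residents. The variable names are made up for the example.

```r
car_spaces_total    <- 1800  # from the post
car_spaces_resident <- 300   # from the post
share_cycle         <- 0.22  # postcodes 2-5 km away, from the post

# Assumption: resident spaces are excluded from the customer total.
car_spaces_customer <- car_spaces_total - car_spaces_resident

# Cycle spaces in proportion to the customer car spaces.
cycle_spaces <- round(share_cycle * car_spaces_customer)
print(cycle_spaces)
```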

Additional work:

- Create an allocation network for all Edinburgh shopping centres with car parks. What is St James’ catchment from that?
- Use the census data to compare car ownership with the St James’ allocation network. Car ownership is probably quite low in the city centre!
- Consider travel time, not just distance.
- How many of those outside walking/cycling distance are near a bus stop?
