How Scenic is the HS2 Route?

October 29, 2012
By

(This article was first published on Alex Singleton » R, and kindly contributed to R-bloggers)

It is fairly clear from the duration between this and my last post that various other things have been getting in the way of updates. Anyway, I shall try and post a few updates on news and things I have been working on recently in the coming weeks before getting back to regular posting!

Back in January I had a student working on a dissertation about the High Speed 2 railway. This got me thinking about what sort of data could be used to characterise the route. As it transpired there wasn’t a publicly available Shapefile of the route at the time, however, an ex-colleague (Daryl Lloyd) who by chance now works for the Department for Transport, had almost in unison realised the same thing; and indeed, on the day I had contacted him was negotiating with HS2 Ltd to release the file. This is now available to download from here.

One unusual dataset that I thought would provide interesting context is the My Society project ScenicOrNot. This application enables users to rate the level of “scenic”[ness] of a series of random georeferenced photographs taken from the Geograph project. The raw scores are available to download here. For each picture lat / lon, multiple votes were concatenated in single line. As such, the records were split up, so one each vote appeared as a single line in the exported CSV. This was done using the following R code.

#Read in Scenic Data from http://scenic.mysociety.org/
Scenic <- read.delim2("http://scenic.mysociety.org/votes.tsv", header = TRUE, sep = "\t", quote="\"")
AllVotes <- NULL
list <- for(x in 1:nrow(Scenic)) {
row <- Scenic[x,]
Lat <- row$Lat
Lon <- row$Lon
ID <- row$ID
Votes <- as.data.frame(strsplit(as.character(row$Votes),",")) # Gets the votes as a dataframe list
Votes$Lat <- Lat #Add Lat
Votes$Lon <- Lon #Add Lon
Votes$ID <- ID # Add ID
names(Votes)[1] <- "Votes" #Rename Votes list
AllVotes <- rbind(AllVotes,Votes)
rm(Votes,ID,Lat,Lon,row)
print(x)
}
AllVotes_test <- AllVotes
AllVotes_test$Lat <- as.numeric(AllVotes_test$Lat)
AllVotes_test$Lon <- as.numeric(AllVotes_test$Lon)
write.csv(AllVotes_test, file = "scenic_final_out.csv", row.names = FALSE)

The resulting CSV can be downloaded here. This relates to an extract from January 22nd 2012.

These data were then converted into OSGB and imported into a PostGIS database. A point in polygon operation was used to create average scores for a 5km grid over England. The shapefile with average votes can be downloaded here.

Created using QGIS, the following maps show the output of these analyses…




When we overlay the HS2 route onto these data we can see that this passes through areas with varying degrees of “scenic”ness.




Although these data are interesting in themselves, there is obvious utility if this sort of information was combined with other indicators such as population density and characteristics. The assumption being that all other things being equal, then people may object to disruption in those areas which they consider more “scenic”… perhaps something for further work!

To leave a comment for the author, please follow the link and comment on his blog: Alex Singleton » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.