Text Mining the NZ Road Network with R

October 2, 2015
(This article was first published on some real numbers, and kindly contributed to R-bloggers)

What are the most common words in New Zealand road names? Are there any common themes?

Thankfully, New Zealand’s 73,906 current road names have been made available through the LINZ Data Service. To answer the questions above, we can use R’s tm package to conduct basic text mining.

The process is simple*. The text is stripped of punctuation, extra whitespace, and redundant or uninteresting words before being fed into wordcloud(). The 60 most common words are then displayed, each sized in proportion to how often it occurs.
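The cleaning and counting steps described above can be sketched with tm roughly as follows. This is a minimal illustration, not the code used for the post: the `road_names` sample and the `street_types` list are made up for the example, and the real input would be the road-name column of the LINZ export.

```r
library(tm)

# Hypothetical sample standing in for the LINZ road-name column
road_names <- c("Queen Street", "King's Road", "Kowhai Avenue",
                "Queen  Victoria Drive", "Rimu Street", "Kowhai Place")

# Hypothetical list of uninteresting street-type words to drop
street_types <- c("street", "road", "avenue", "drive", "place")

# Treat each road name as its own document
corpus <- VCorpus(VectorSource(road_names))

# Lower-case, then strip punctuation, unwanted words and extra whitespace
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, c(stopwords("english"), street_types))
corpus <- tm_map(corpus, stripWhitespace)

# Count how often each remaining term occurs across all road names.
# slam::row_sums() works on the sparse matrix directly, which avoids
# building a dense terms-by-documents matrix for a large input.
tdm <- TermDocumentMatrix(corpus)
word_freq <- sort(slam::row_sums(tdm), decreasing = TRUE)

# The most common words (at most 60)
head(word_freq, 60)
```

Note that TermDocumentMatrix() drops terms shorter than three characters by default, which is harmless here since the shortest words of interest (such as Tui or Oak) have three letters.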

Figure: The 60 most common words found in New Zealand road names
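A cloud of this kind could be drawn along the following lines. The frequencies here are invented purely for illustration, and the argument values (`scale`, `random.order`, the seed) are assumptions rather than the settings used for the figure.

```r
library(wordcloud)

# Hypothetical term frequencies; the real ones would come from the
# cleaned road names
word_freq <- c(queen = 12, king = 9, kowhai = 7, rimu = 5, tui = 3)

# Keep at most the 60 most frequent words
top_words <- head(sort(word_freq, decreasing = TRUE), 60)

# Word placement is randomised, so fix the seed for a repeatable layout
set.seed(42)

# Word size is scaled by frequency; random.order = FALSE places the
# most frequent words first, near the centre
wordcloud(words = names(top_words), freq = top_words,
          max.words = 60, min.freq = 1,
          random.order = FALSE, scale = c(4, 0.5))
```

Setting min.freq explicitly matters for small inputs, since wordcloud() otherwise omits words occurring fewer than three times.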

Can we see any common themes? Yes, namely:

1. Royalty and famous Britons: George, King, Victoria, Queen, Elizabeth, Albert, Nelson.

Queen Victoria, Reigned 1837 – 1901. Source: wikimedia.org

2. Early New Zealanders: Campbell, Russell, Grey, Scott.

Sir George Grey, 3rd Governor of New Zealand, In office 1845 – 1854. Source: wikimedia.org

3. Native trees: Kowhai, Totara, Rata, Rimu, Matai, Kauri, Miro.

A pair of Kauri trees. Source: wikimedia.org

4. Not-so-native trees: Pine, Oak.

5. Native birds: Tui, Huia, Kiwi.

The now-extinct Huia (male and female). Source: wikimedia.org

*This blog post from deltaDNA served as a guide.

References:
https://deltadna.com/blog/text-mining-in-r-for-term-frequency/
https://cran.r-project.org/web/packages/tm/index.html
https://cran.r-project.org/web/packages/wordcloud/index.html
Landonline: Road Name. Source: LINZ/Full Landonline Dataset: https://data.linz.govt.nz/table/2024-landonline-road-name/
ASP: Street Type. Source: LINZ/Electoral: https://data.linz.govt.nz/table/1210-asp-street-type/
