# Geographic data to service the needs of a remote employee – part2

**Mango Solutions » R Blog**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

*Ava Yang, Mango Solutions*

**Recap**

In part 1 of this post I set out to find a flat to rent based on three simple criteria:

- Café density
- Tube station density
- Monthly rent

So far I have made use of the baidumap and REmap packages to create a nice visualisation of available flats and coffee shops in Shanghai.

**Calculation and scoring**

Now let’s do some basic math and programming. Three measures ( derived from original variables to quantify my preferences.

For density of café and tube station, the closer the better; the more the better. Geographic distances were calculated by function distm from package geosphere.

library(dplyr) library(geosphere) library(knitr) library(baidumap) load('data/ziroom.rds') # raw data load("data/sh_cafe.rds") load("data/sh_station.rds") # 1. Generate names to represent flats # 2. Extract longitude and lattitude sh_ziroom <- ziroom %>% mutate(name=paste("Room", rownames(ziroom), sep="_")) %>% mutate(lon=getCoordinate(flat, city="上海", formatted = T)[, 'longtitude']) %>% mutate(lat=getCoordinate(flat, city="上海", formatted = T)[, 'latitude']) %>% na.omit() %>% select(c(lon, lat, name, price_promotion, flat)) # distance matrices: between cafe and flat, between station and flat dist_cafe_flat <- distm(sh_cafe[,c("lon", "lat")], sh_ziroom[,c("lon", "lat")]) %>% as.data.frame() dist_station_flat <- distm(sh_station[,c("lon", "lat")], sh_ziroom[,c("lon", "lat")]) %>% as.data.frame()

As an upper limit I’m willing to walk as far as 750 metres (about 0.5 mile) from a café. Thus, cafe_{idx} and station_{idx} were then given by

For this job I wrote a small custom function called calIdx.

# Function to calculate cafe_idx and station_idx calIdx <- function(tmpcol) { tmpcol <- tmpcol[which(tmpcol < 750)] return(sum(1/log(tmpcol))) }

Rent is a negative indicator, and so rent_{idx} could be obtained from

The weighted score was calculated by

# 1. cafeIdx = 1/log(dis1) + 1/log(dis2) +...+ 1/log(disN) # 2. stationIdx = 1/log(dis1) + 1/log(dis2) +...+ 1/log(disN) # 3. rentIdx = 1/log(price_promotion) # 4. score = 0.3*cafeIdx + 0.2*stationIdx + 0.5*rentIdx sh_ziroom_top10 <- sh_ziroom %>% mutate(cafeIdx = sapply(dist_cafe_flat, calIdx)) %>% mutate(stationIdx = sapply(dist_station_flat, calIdx)) %>% filter(price_promotion <= 4000) %>% mutate(rentIdx = 1/log(as.numeric(price_promotion))) %>% mutate(score = 0.4*cafeIdx + 0.2*stationIdx + 0.4*rentIdx) %>% arrange(desc(score)) %>% slice(1:10)

**Summary**

kable(sh_ziroom_top10[, c("name", "score", "cafeIdx", "stationIdx", "rentIdx")], align="c") name score cafeIdx stationIdx rentIdx Room_34 0.6480966 1.3262957 0.3380904 0.1249006 Room_35 0.6470510 1.3262957 0.3380904 0.1222865 Room_80 0.6054141 1.2216344 0.3378458 0.1229781 Room_79 0.6048128 1.2216344 0.3378458 0.1214746 Room_22 0.5729428 1.1430015 0.3349634 0.1218737 Room_24 0.5729428 1.1430015 0.3349634 0.1218737 Room_45 0.5292036 0.9617378 0.4709076 0.1258173 Room_46 0.5284566 0.9617378 0.4709076 0.1239499 Room_59 0.4334636 0.8012006 0.3237803 0.1205684 Room_57 0.3836545 0.6721137 0.3302977 0.1218737

Done! See above for the top 10 room candidates. The mechanism I used is not difficult and makes my life so much easier. Moving to a new area which fulfils all my social needs is no longer such a big challenge!

**leave a comment**for the author, please follow the link and comment on their blog:

**Mango Solutions » R Blog**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.