Automatic Notice When Vacancy Available

February 26, 2013

(This article was first published on Category: R | Huidong Tian's Blog, and kindly contributed to R-bloggers)

Today, I stumbled upon a webpage and found several job positions that I am qualified for; unfortunately, all of them had expired. How many chances do we lose this way?! So I decided to do something to limit this kind of loss, using our smart R, of course!

The idea is simple: check the job vacancy webpages regularly and, if new positions are found, open the webpages and/or send a notice to my email.

Let’s take the vacancy page of the Department of Biosciences, UiO, as an example. The webpage lists only the positions that have not expired. For this kind of webpage, we can use the following code:

Download Webpage

# Folder where the two snapshots of the webpage are kept;
workspace <- "C:/Users"
file_outdate <- paste(workspace, "outdate.html", sep = "/")
file_updated <- paste(workspace, "updated.html", sep = "/")
URL <- "http://www.mn.uio.no/ibv/english/about/vacancies/"

# On the first run there is no old snapshot yet, so download both;
if (file.exists(file_outdate)) {
  download.file(URL, file_updated)
} else {
  download.file(URL, file_outdate)
  download.file(URL, file_updated)
}

html_outdate <- readLines(file_outdate)
html_updated <- readLines(file_updated)

Extract Position Titles

Items <- function(str = html_outdate) {
  # Regular expression;
  ptn <- "item-title.+?>(.+?)</a>"
  HTML_Date <- grep(ptn, str, value = TRUE)
  # First time to use sapply by setting FUN as "[", cool!
  sapply(regmatches(HTML_Date, regexec(ptn, HTML_Date)), "[", 2)
}

# New position available or not;
boo <- any(!Items(str = html_updated) %in% Items(str = html_outdate))

# Remove the html file out of date;
file.remove(file_outdate)
file.rename(file_updated, file_outdate)
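To see what the pattern captures, here is a minimal sketch that applies the same regular expression to a few made-up HTML lines. The sample markup, the job titles, and the helper name `extract_titles` are assumptions for illustration only; the real page may be structured differently. It also uses `setdiff()` to list the new titles themselves, rather than the `any(!... %in% ...)` test above, which only says whether there are any.

```r
# Made-up HTML lines standing in for two snapshots of a vacancy page
# (assumed markup, not copied from the real site).
old_page <- c(
  '<div class="listing">',
  '<a class="item-title" href="/jobs/1">PhD Research Fellow in Ecology</a>',
  '<a class="item-title" href="/jobs/2">Postdoctoral Fellow in Genetics</a>'
)
new_page <- c(
  old_page,
  '<a class="item-title" href="/jobs/3">Researcher in Bioinformatics</a>'
)

# Same pattern as in Items(): lazily skip to the end of the opening tag,
# then capture the link text up to the closing </a>.
ptn <- "item-title.+?>(.+?)</a>"

# Hypothetical helper that mirrors Items() but takes the lines as an argument.
extract_titles <- function(html) {
  hits <- grep(ptn, html, value = TRUE)       # keep only the matching lines
  m <- regmatches(hits, regexec(ptn, hits))   # per line: full match + capture
  vapply(m, `[`, character(1), 2)             # second element = captured title
}

old_titles <- extract_titles(old_page)
new_titles <- extract_titles(new_page)

# Titles present in the new snapshot but not in the old one.
added <- setdiff(new_titles, old_titles)
print(added)
```

With this sample data, `added` holds the single title that appears only in the newer snapshot, which could be put into the body of the notification email instead of the bare URL.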

Display and Send Email

if (boo) {
  browseURL(URL)
  library(mail) # Need to install this package first;
  sendmail("you@gmail.com", subject = "Vacancy", message = URL)
}

The difficult part is assembling the regular expression, and I have written a tutorial on that topic. The last step is to run the above code in batch mode.
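Batch mode usually means saving the code to a file and letting the operating system's scheduler (cron, Windows Task Scheduler) call `Rscript` on it at a fixed interval. As a rough alternative that stays inside R, the sketch below keeps one session open and repeats a check in a loop. The helper `run_regularly`, its arguments, and the stand-in `demo_check` are hypothetical names introduced here, not part of the code above.

```r
# Hypothetical helper: call `check` repeatedly, waiting `interval_sec`
# seconds between calls, for at most `max_runs` calls.
run_regularly <- function(check, interval_sec = 3600, max_runs = Inf) {
  runs <- 0
  while (runs < max_runs) {
    # A failed check (e.g. no network) is reported but does not stop the loop.
    tryCatch(
      check(),
      error = function(e) message("Check failed: ", conditionMessage(e))
    )
    runs <- runs + 1
    if (runs < max_runs) Sys.sleep(interval_sec)
  }
  invisible(runs)
}

# Stand-in for the vacancy check; in practice this function would run
# the download, comparison, and notification steps.
counter <- 0
demo_check <- function() {
  counter <<- counter + 1
  if (counter == 2) stop("simulated download error")
  message("Check number ", counter, " done")
}

# Three checks with no pause, just to show the loop surviving an error.
n <- run_regularly(demo_check, interval_sec = 0, max_runs = 3)
```

For real use, `interval_sec` would be something like a few hours. A scheduler-driven `Rscript` call is probably the more robust choice, since it does not depend on one R session staying alive.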
