Is it a job offer for a Data Scientist?

January 10, 2017

(This article was first published on » English, and kindly contributed to R-bloggers)

Konrad Więcko and Krzysztof Słomczyński (with tiny help from my side) have created a system that tracks which skills are currently in demand in job offers for data scientists in Poland: which skills, how frequently they appear, and how the demand changes over time.

A full description of how this was done is available in two versions: static and with Shiny.

Here: The Shiny application for browsing skill sets.

Here: The R package that gives access to the live data.

Full version
A data science track (MSc level) will very soon be offered at the MiNI department of Warsaw University of Technology, and we (i.e. the program committee) are spending a lot of time setting up the program. How cool would it be to take into account the current (and forecasted) demand for data science skills on the market? The project carried out by Konrad Więcko and Krzysztof Słomczyński deals with exactly this problem.

So, the data is scraped from one of the most popular (in Poland) websites with job offers.
Only offers of interest to data scientists were used for further analyses.
How were these offers identified? In Poland the job title ‘Data Scientist’ is (still) not that common (except on LinkedIn). There are different translations, and there are also many different job titles that may be of interest to a data scientist.
So, Konrad and Krzysztof have developed a machine learning algorithm that scores how well a given offer matches a ‘data scientist profile’ (whatever that means), based on the content of the job offer description. Then, based on these predictions, various statistics can be calculated, such as the number of offers, locations, demand for skills etc.
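
To make the idea of scoring offers and then aggregating statistics more concrete, here is a minimal Python sketch. It is not the algorithm used in the project (that one is described in the linked write-up). It swaps in a tiny naive Bayes text classifier with add-one smoothing, and the training texts, the `SKILLS` list, the function names and the 0.5 threshold are all hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical labelled examples: 1 = matches a 'data scientist profile', 0 = does not.
TRAINING = [
    ("statistical modelling in r and python, machine learning", 1),
    ("build predictive models, sql, data visualization", 1),
    ("sales representative, customer contact, driving licence", 0),
    ("accountant, invoices, tax reporting", 0),
]

# Hypothetical list of skills to count in the matching offers.
SKILLS = ["python", "sql", "r", "excel"]


def tokenize(text):
    """Lowercase the text and extract simple word tokens."""
    return re.findall(r"[a-z][a-z+#]*", text.lower())


def train(examples):
    """Collect per-class word counts and document counts."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def score(text, model):
    """Return the naive Bayes estimate of P(label = 1 | text)."""
    word_counts, doc_counts, vocab = model
    n_docs = sum(doc_counts.values())
    log_post = {}
    for label in (0, 1):
        total = sum(word_counts[label].values())
        lp = math.log(doc_counts[label] / n_docs)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                lp += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
        log_post[label] = lp
    # Normalise in a numerically stable way.
    top = max(log_post.values())
    weights = {k: math.exp(v - top) for k, v in log_post.items()}
    return weights[1] / (weights[0] + weights[1])


def skill_demand(offers, model, threshold=0.5):
    """Count how many offers scored at or above the threshold mention each skill."""
    counts = Counter()
    for text in offers:
        if score(text, model) >= threshold:
            tokens = set(tokenize(text))
            counts.update(skill for skill in SKILLS if skill in tokens)
    return counts


model = train(TRAINING)
offers = [
    "data analyst: python, sql and machine learning models",
    "accountant for tax reporting and invoices, excel",
    "statistical modelling in r, data visualization",
]
demand = skill_demand(offers, model)
```

The same two-step pattern (score every offer first, then compute statistics only over the offers that pass the threshold) extends to counts per location or per month, given those fields in the scraped data.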

Here: Description of the dataset used for the model training.

Here: Description of the machine learning part of the project.

Trends for selected skill sets.
