How successful can an R meetup be? meet(R) in Tricity! – RSelenium and Big Data processing

February 6, 2017
(This article was first published on http://r-addict.com, and kindly contributed to R-bloggers)

On Thursday (12.01.2017) we had the chance to attend the first TriCity R Users Group (Pomerania, Poland) meeting. The meetup was unexpectedly successful! Its success can be measured by the time attendees spent on ardent comments and questions after each of the 2 great presentations. After each 20-25 minute presentation we observed a lively 30-minute discussion! It is amazing that the questions lasted longer than the presentations. Is it thanks to the climate? Is it due to the nature of the Pomeranian community? Or is it perhaps due to the excellent organization? In this post I summarize the meeting, describe the presentations and reveal the organizers’ identities.

Organizers

The TriCity R Users Group organizers, Anna Rybinska (University of Gdansk), Agnieszka Borsuk (Medical University of Gdansk) and Emilia Daghir-Wojtkowiak (Medical University of Gdansk), met during the European R Users Meeting in Poznan in 2016 (Emilia knew Agnieszka earlier), where they decided to co-organize R meetings in TriCity. They kept an excellent attitude throughout the whole meetup. Their idea to let every attendee introduce himself/herself did wonders for the meeting’s friendly atmosphere. Attendees had a direct opportunity to get to know other people interested in analytics, from both business and academia.

The organizers also took care to announce the group’s next meeting, Shiny apps and more…. The flight from my home city to that meeting costs about $2 and takes less than an hour – maybe you will also attend in the era of cheap flights?

Speakers

The meeting would not have been so great without the speakers! Ania, Agnieszka and Emilia invited 2 interesting and energetic orators: Michal Maj and Krzysztof Slomczynski.

Michal presented his experiences and his view on the challenges he faces as a Data Scientist at the biggest Polish news portal, wp.pl, where he works on a solution for recommending advertising campaigns. Listeners got a sense of the daily challenges associated with processing big data with Apache Spark or Apache Kafka. It turned out that those massive datasets are, at the end of the day, analyzed in R and visualized in Shiny. A real data science dream: R and petabytes of data! Questions about the statistical learning algorithms applied at wp.pl lasted veeery long, as Michal described them with mathematical precision.

The second presentation was given by Krzysztof. He talked about his journey as a web-scraper. Over the last year he has worked on 3 different web scraping projects. During one of them he created a system that tracks which skills are currently in demand among job offers for data scientists in Poland (Is it a job offer for a Data Scientist?). Krzysztof not only gives the impression of being an expert on internet data collection, he truly IS a skilled young web-scraping expert. He presented the RSelenium package, which he compared to rvest. You could sense Krzysztof’s strong emphasis on listing DOs and DONTs for web harvesting. Perfect! An expert’s DOs and DONTs are what we attend R meetups for. In addition, listeners had a chance to understand Docker and its usage for Selenium Server configuration. I must admit that he knows a lot for a young Data Analyst.

If you are interested in RSelenium, you might also check out: Controlling Expenses on Ali Express with RSelenium

I hope the next TriCity tigeR meetup will be as successful as this one, and that attendees will ask even more questions and give even more feedback in their comments.

Materials

The presentations were in Polish. They are available on trigeR’s GitHub repository and on the speakers’ websites here and here
