ELK+R Stack

January 24, 2019

(This article was first published on R – Paolo Eusebi, and kindly contributed to R-bloggers)

Elasticsearch is a search engine based on the Lucene library. It provides a distributed full-text search engine with an HTTP web interface and schema-free JSON documents.

Elasticsearch has become a major player in document search in the NoSQL space and is under very active development (six major versions in a few years and a rapidly growing community).

I installed Elasticsearch 6.5.4 and Kibana (matching version 6.5.4) on my Mac, which runs R 3.5.

First impression: it took me a few hours to get everything installed properly on my machine. Working with these technologies requires some basic hacking skills.

To install both Elasticsearch and Kibana I followed the instructions on the Elastic website.

I won’t spend much time here on installation issues, since they depend heavily on the operating system and on personal skills. The documentation and online forums will help if you run into problems.

After installation, Kibana was alive and kicking at http://localhost:5601/app/kibana

So the time has come to feed Elasticsearch some data.

I immediately thought of the NYC flights dataset available in the nycflights13 package, which contains on-time data for all flights that departed NYC (i.e. JFK, LGA or EWR) in 2013.

library(nycflights13)
data(flights)
flights

I installed the elastic package from CRAN.

The connect command establishes the connection with my local Elasticsearch instance:

library(elastic)
connect()

Then I sent the data frame to Elasticsearch with the docs_bulk command:

docs_bulk(flights, index = "flights_nyc_2013_idx")

The index argument provides the name of the index to use; it is required for data.frame input (and optional for file inputs).
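
The calls above rely on the connection that connect() set up globally in the version of the elastic package I used. As far as I know, from elastic 1.0.0 onward connect() returns a connection object that has to be passed as the first argument to the other functions. Here is a minimal sketch under that assumption. The helper index_data_frame is hypothetical (it is not part of the package), and the host and port in the usage lines are simply the Elasticsearch defaults.

```r
library(elastic)
library(nycflights13)

# Hypothetical helper, not part of the elastic package.
# Assumes elastic >= 1.0.0, where the connection object is the first argument.
index_data_frame <- function(conn, df, index_name) {
  # Fail early if the node cannot be reached
  ping(conn)

  # Send the data frame through the bulk API
  invisible(docs_bulk(conn, df, index = index_name))

  # Newly indexed documents become searchable after a refresh
  # (about one second with default settings), so wait briefly
  Sys.sleep(2)

  # Return the document count reported by Elasticsearch for this index
  count(conn, index = index_name)
}

# Usage against a local node on the default port:
# conn <- connect(host = "127.0.0.1", port = 9200)
# index_data_frame(conn, flights, "flights_nyc_2013_idx")
```

If the returned count matches nrow(flights), the whole data frame made it into the index.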

Opening Kibana, everything was in place and ready to play with.

[Screenshot, 2019-01-24 at 16.09.31]

Then I started building a Kibana dashboard to get useful insights from the data. Not surprisingly, June, July and December were the months with the greatest risk of delayed arrivals. Visualizations and dashboards can then be embedded in websites through an iframe.

[Screenshot, 2019-01-24 at 15.50.16]
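
The monthly pattern can also be cross-checked directly in R, without going through Elasticsearch. The sketch below uses only base R on the same flights data frame. The 15-minute cutoff for calling an arrival "delayed" is my assumption for illustration; the dashboard may use a different definition.

```r
library(nycflights13)

# Keep only flights with a recorded arrival delay
# (flights with no recorded arrival have NA in arr_delay)
arrived <- flights[!is.na(flights$arr_delay), ]

# Assumption: "delayed" means arriving more than 15 minutes late
late_threshold <- 15

# Average arrival delay (minutes) and share of late arrivals per month
mean_delay <- tapply(arrived$arr_delay, arrived$month, mean)
share_late <- tapply(arrived$arr_delay > late_threshold, arrived$month, mean)

monthly <- data.frame(
  month      = month.abb[as.integer(names(mean_delay))],
  mean_delay = round(as.numeric(mean_delay), 1),
  share_late = round(as.numeric(share_late), 3)
)

# Months sorted from worst to best average arrival delay
monthly[order(-monthly$mean_delay), ]
```

The months at the top of the sorted table should line up with those that stand out in the Kibana dashboard.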
