New package "SparkRext" – SparkR extension for closer to dplyr

July 11, 2015

(This article was first published on HOXO-M - anonymous data analyst group in Japan - , and kindly contributed to R-bloggers)

Apache Spark is one of the hottest products in data science.
Spark 1.4.0 has formally adopted SparkR package which enables to handle Spark DataFrames on R.

SparkR is very useful and powerful.
One of the reasons is that SparkR DataFrames present an API similar to dplyr.
We launced our new package “SparkRext” to redefine the functions of SparkR to enable NSE(Non stanard Evaluation) inputs. As a result, the functions will be able to be used in the same way as dplyr.

If you want to know about SparkRext package in detail, please check our blog post here.

To leave a comment for the author, please follow the link and comment on their blog: HOXO-M - anonymous data analyst group in Japan - . offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)