Advent of 2021, Day 6 – Setting up IDE

This article was first published on R – TomazTsql and kindly contributed to R-bloggers.

Let’s look at the IDEs that can be used to run Spark.

Remember that Spark can be used from several languages (Scala, Java, R, and Python), and each one comes with its own IDEs and its own installation steps.

Jupyter Notebooks

Start Jupyter Notebook, create a new notebook, and connect to your local Spark installation.

For testing purposes, you can add code like:

from pyspark.sql import SparkSession
spark = SparkSession.builder.master("spark://tomazs-MacBook-Air.local:7077").getOrCreate()

Then start working with Spark code.
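The master URL decides where the notebook's code runs. Here is a minimal sketch of choosing between local mode and a standalone master. The helper names `master_url` and `get_session` and the app name `AdventDay6` are hypothetical and not part of the original post. `local[*]` runs Spark in-process on all available cores, and 7077 is the default port of a standalone master.

```python
from typing import Optional


def master_url(host: Optional[str] = None, port: int = 7077) -> str:
    """Return a Spark master URL (hypothetical helper).

    With no host, fall back to local mode using all available cores.
    Otherwise build a standalone-cluster URL of the form spark://host:port.
    """
    if host is None:
        return "local[*]"
    return f"spark://{host}:{port}"


def get_session(host: Optional[str] = None, app_name: str = "AdventDay6"):
    """Create (or reuse) a SparkSession; requires pyspark to be installed."""
    # Imported here so master_url() stays usable even without pyspark.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master(master_url(host))
        .appName(app_name)
        .getOrCreate()
    )
```

In a notebook cell, `spark = get_session()` would start a local session, while `get_session("tomazs-MacBook-Air.local")` would target the standalone master used in the snippet above.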

Python

In Python, you can open PyCharm or Spyder and start working with Python code:

import findspark
findspark.init("/opt/spark")
from pyspark import SparkContext

sc = SparkContext(appName="SampleLambda")
x = sc.parallelize([1, 2, 3, 4])
res = x.filter(lambda x: (x % 2 == 0))
print(res.collect())
sc.stop()
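The filter step above can also be written with a named predicate, which makes the logic easy to check without a running cluster. This is a minimal sketch: the names `is_even`, `evens_local`, and `evens_spark` are hypothetical and not part of the original post, and `evens_spark` assumes an already created SparkContext such as `sc`.

```python
from typing import Iterable, List


def is_even(n: int) -> bool:
    """Predicate shared by the plain-Python and the Spark version."""
    return n % 2 == 0


def evens_local(values: Iterable[int]) -> List[int]:
    """Plain-Python reference: keep only the even numbers."""
    return [v for v in values if is_even(v)]


def evens_spark(sc, values: Iterable[int]) -> List[int]:
    """Same logic on Spark; sc is an existing SparkContext (assumption)."""
    # parallelize() distributes the data, filter() is lazy,
    # and collect() triggers the job and returns a Python list.
    return sc.parallelize(list(values)).filter(is_even).collect()


print(evens_local([1, 2, 3, 4]))  # prints [2, 4]
```

Comparing `evens_spark(sc, data)` with `evens_local(data)` on a small input is a quick way to confirm that the IDE, `findspark`, and the Spark installation are wired together correctly.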

R

Open RStudio, install the sparklyr package, create a Spark connection, and run a simple R script:

# install
devtools::install_github("rstudio/sparklyr")
library(sparklyr)

# install local version
spark_install(version = "2.2.0")

# Connect to a local Spark master
sc <- spark_connect(master = "local")

iris_tbl <- copy_to(sc, iris)
iris_tbl

spark_disconnect(sc)

There you go. This part was fairly short but crucial for coding.

Tomorrow we will start exploring Spark code. 🙂

The complete set of code, documents, notebooks, and all other materials will be available in the GitHub repository: https://github.com/tomaztk/Spark-for-data-engineers

Happy Spark Advent of 2021! 🙂
