Visualizing a dataset to apply machine learning: exercises

September 8, 2017

(This article was first published on R-exercises, and kindly contributed to R-bloggers)


Dear reader,

If you are new to machine learning, this tutorial is what you need to get started with this exciting part of the data science world.

This post includes a full machine learning project that will guide you step by step to create a “template,” which you can use later on other datasets.

Before proceeding, please follow our short tutorial.

Look at the examples given and try to understand the logic behind them. Then try to solve the exercises below using R, without looking at the answers. Finally, check your answers against the solutions.

Exercise 1

Create a variable “x” and assign to it the input attributes of the “iris” dataset. HINT: Use columns 1 to 4.

Exercise 2

Create a variable “y” and assign to it the output attribute of the “iris” dataset. HINT: Use column 5.
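
If you get stuck on Exercises 1 and 2, the split into inputs and output can be sketched as below. This is one possible approach using base R indexing, not necessarily the author's own solution.

```r
# Load the built-in iris data (150 rows, 5 columns).
data(iris)

# Columns 1 to 4 hold the numeric measurements (the input attributes).
x <- iris[, 1:4]

# Column 5 holds the species label (the output attribute), stored as a factor.
y <- iris[, 5]

# Quick checks on what was extracted.
str(x)
levels(y)
```

Indexing a single column with iris[, 5] drops the data frame structure and returns a plain factor, which is the form the plotting functions in the later exercises expect for a class label.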

Exercise 3

Create a whisker plot (boxplot) for the variable in the first column of the “iris” dataset. HINT: Use boxplot().

Exercise 4

Now create a whisker plot for each one of the four input variables of the “iris” dataset in one image. HINT: Use par().
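
As a hint for Exercises 3 and 4, here is a minimal sketch of how boxplot() and par() can work together. The loop and the old_par variable are illustrative choices, assumed here rather than taken from the original solutions.

```r
data(iris)
x <- iris[, 1:4]

# A single boxplot for the first input column.
boxplot(x[, 1], main = names(x)[1])

# Split the plotting device into 1 row and 4 columns.
# par() returns the previous settings, so they can be restored afterwards.
old_par <- par(mfrow = c(1, 4))
for (i in seq_along(x)) {
  boxplot(x[, i], main = names(x)[i])
}
par(old_par)  # restore the previous layout
```

Restoring the old settings keeps later plots from being squeezed into the same four-panel layout.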

Learn more about machine learning in the online course Beginner to Advanced Guide on Machine Learning with R Tool. In this course you will learn how to:

  • Create a machine learning algorithm from a beginner point of view
  • Quickly dive into more advanced methods at an accessible pace, with more explanations
  • And much more

This course shows a complete workflow from start to finish. It is a great introduction and a useful fallback once you have some experience.

Exercise 5

Create a barplot to break down your output attribute. HINT: Use plot().
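
The hint for Exercise 5 works because plot() dispatches on the class of its argument: given a factor, it draws a bar chart of the level counts. A short sketch, with an illustrative title:

```r
data(iris)
y <- iris[, 5]

# For a factor, plot() draws one bar per level, with height equal to the count.
plot(y, main = "Class distribution")

# The same counts as a table, for reference.
class_counts <- table(y)
print(class_counts)
```

In iris each of the three species has 50 rows, so the bars are of equal height. On other datasets this plot is a quick way to spot class imbalance.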

Exercise 6

Create a scatterplot matrix of the “iris” dataset using the “x” and “y” variables. HINT: Use featurePlot().

Exercise 7

Create a scatterplot matrix with ellipses around each class. HINT: Use plot="ellipse".
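
For Exercises 6 and 7, a possible sketch using featurePlot() from the caret package follows. It assumes that caret is installed, and that the ellipse package is also available, since caret relies on it for the "ellipse" plot type. The variable names pairs_plot and ellipse_plot are illustrative.

```r
library(caret)  # provides featurePlot(); assumed to be installed

data(iris)
x <- iris[, 1:4]
y <- iris[, 5]

# Scatterplot matrix of the four inputs, with points coloured by class.
pairs_plot <- featurePlot(x = x, y = y, plot = "pairs")
print(pairs_plot)

# The same matrix with one ellipse drawn per class
# (requires the 'ellipse' package).
ellipse_plot <- featurePlot(x = x, y = y, plot = "ellipse")
print(ellipse_plot)
```

featurePlot() returns a lattice object, so calling print() explicitly makes sure the plot is drawn when the code runs inside a script or a function.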

Exercise 8

Create box and whisker plots of each input variable again, but this time broken down into separate plots for each class. HINT: Use plot="box".
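
A hedged sketch for Exercise 8, again assuming the caret package is installed. The layout argument is an optional extra, passed through to lattice, that puts the four panels in one row. It is not required by the exercise.

```r
library(caret)  # provides featurePlot(); assumed to be installed

data(iris)
x <- iris[, 1:4]
y <- iris[, 5]

# One panel per input variable, one box per class within each panel.
box_plot <- featurePlot(x = x, y = y, plot = "box", layout = c(4, 1))
print(box_plot)
```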

Exercise 9

Create a list named “scales” with two elements, “x” and “y” (one per axis), and set relation to “free” for both of them. HINT: Use list().

Exercise 10

Create a density plot matrix for each attribute by class value. HINT: Use featurePlot().
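
Exercises 9 and 10 fit together: the “scales” list is passed on to the lattice plot that featurePlot() builds, so that each panel gets its own axis ranges. A minimal sketch, assuming caret is installed (density_plot is an illustrative name):

```r
library(caret)  # provides featurePlot(); assumed to be installed

data(iris)
x <- iris[, 1:4]
y <- iris[, 5]

# Nested list: one entry per axis, each asking lattice for free
# (per-panel) axis limits instead of limits shared across panels.
scales <- list(x = list(relation = "free"),
               y = list(relation = "free"))

# One density panel per input variable, one curve per class.
density_plot <- featurePlot(x = x, y = y, plot = "density", scales = scales)
print(density_plot)
```

Free scales matter here because the four measurements have different ranges. With shared axes, the narrower distributions would be compressed into a small part of their panels.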
