Recursive Partitioning and Regression Trees Exercises

December 13, 2016

(This article was first published on R-exercises, and kindly contributed to R-bloggers)


[For these exercises, we will use the rpart package. These are beginner-level exercises. Please refer to the help pages of the rpart package as needed.]

Answers to the exercises are available here.

Exercise 1

Consider the kyphosis data frame (type help('kyphosis') for more details), which contains:
- Kyphosis: a factor with levels absent and present, indicating whether a kyphosis (a type of deformation) was present after the operation.
- Age: the age in months.
- Number: the number of vertebrae involved.
- Start: the number of the first (topmost) vertebra operated on.

1) Build a tree to classify Kyphosis from Age, Number and Start.
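
As a starting point, here is a minimal sketch of how a classification tree can be fitted with rpart. The object name fit_kyphosis is only illustrative, and method = "class" is spelled out explicitly even though rpart would normally infer it from a factor response.

```r
library(rpart)

# Fit a classification tree; the response Kyphosis is a factor,
# so method = "class" is the appropriate splitting method.
fit_kyphosis <- rpart(Kyphosis ~ Age + Number + Start,
                      data = kyphosis, method = "class")

# Printing the object lists every node with its split, size and predicted class.
print(fit_kyphosis)
```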

Exercise 2

Consider the tree built in exercise 1.
1) Which variables are used to explain kyphosis presence?
2) How many observations does each terminal node contain?
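
One possible way to read both answers off the fitted object is to inspect its frame component, which has one row per node. This is a sketch rather than the official solution; the names fit, used_vars and leaf_sizes are illustrative.

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis, method = "class")

# fit$frame has one row per node; terminal nodes are marked "<leaf>" in var.
frame <- fit$frame
is_leaf <- frame$var == "<leaf>"

# Variables that appear in at least one split.
used_vars <- unique(as.character(frame$var[!is_leaf]))

# Number of observations in each terminal node, named by node number.
leaf_sizes <- frame$n[is_leaf]
names(leaf_sizes) <- rownames(frame)[is_leaf]

print(used_vars)
print(leaf_sizes)
```

Calling printcp(fit) or summary(fit) reports similar information in a more verbose form.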

Exercise 3

Consider the kyphosis data frame.
1) Build a tree using the first 60 observations of kyphosis.
2) Predict the kyphosis presence for the other 21 observations.
3) What is the misclassification rate (prediction error)?
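
The three steps can be sketched as follows, assuming the split is taken by row position (rows 1 to 60 for training, rows 61 to 81 for testing). The names train, test, pred and error_rate are illustrative.

```r
library(rpart)

# Split by row position: first 60 rows to fit, remaining 21 rows to evaluate.
train <- kyphosis[1:60, ]
test  <- kyphosis[61:81, ]

fit_train <- rpart(Kyphosis ~ Age + Number + Start,
                   data = train, method = "class")

# type = "class" returns the predicted factor level for each new observation.
pred <- predict(fit_train, newdata = test, type = "class")

# Cross-tabulate observed against predicted classes.
confusion <- table(observed = test$Kyphosis, predicted = pred)
print(confusion)

# Misclassification rate: share of test observations predicted incorrectly.
error_rate <- mean(pred != test$Kyphosis)
print(error_rate)
```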

Exercise 4

Consider the iris data frame (type help('iris') for more details).
1) Build a tree to classify Species from the other variables.
2) Plot the tree and add node information.
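
A minimal sketch of fitting and plotting the tree is shown here. The margin, use.n and cex values are cosmetic choices made for this example, not requirements of the exercise.

```r
library(rpart)

# Species ~ . uses every other column of iris as a predictor.
fit_iris <- rpart(Species ~ ., data = iris, method = "class")

# plot() draws the branches only; text() adds split rules and node labels.
plot(fit_iris, margin = 0.1)
# use.n = TRUE also prints the class counts in each terminal node.
text(fit_iris, use.n = TRUE, cex = 0.8)
```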

Exercise 5

Consider the tree built in exercise 4.
Prune the tree using the median complexity parameter (cp) associated with the tree.
Plot the pruned and the original trees in the same window.
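
One way to approach this, sketched under the assumption that "median cp" means the median of the CP column of the fitted tree's cptable, is shown here. The helper draw_tree is hypothetical and only guards against plotting a tree that has been pruned back to the root.

```r
library(rpart)

fit_iris <- rpart(Species ~ ., data = iris, method = "class")

# The cptable lists the complexity parameter for each nested subtree.
median_cp <- median(fit_iris$cptable[, "CP"])
pruned <- prune(fit_iris, cp = median_cp)

# Hypothetical helper: plot.rpart() fails on a root-only tree, so check first.
draw_tree <- function(tree, title) {
  if (nrow(tree$frame) > 1) {
    plot(tree, margin = 0.1, main = title)
    text(tree, use.n = TRUE, cex = 0.8)
  } else {
    message(title, ": the tree is only a root, nothing to plot")
  }
}

# Two panels side by side in the same graphics window.
old_par <- par(mfrow = c(1, 2))
draw_tree(fit_iris, "Original")
draw_tree(pruned, "Pruned")
par(old_par)
```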

Exercise 6

Consider the tree built in exercise 4.
1) In which terminal node is each observation of iris classified?
2) Which species has flowers with Petal.Length greater than 2.45 and Petal.Width less than 1.75?
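
A possible route to both answers uses the where component of the fitted object, which stores the frame row of the terminal node each training observation falls into. The flower in new_flower is a made-up example that merely satisfies the two conditions of question 2; its sepal values are arbitrary.

```r
library(rpart)

fit_iris <- rpart(Species ~ ., data = iris, method = "class")

# fit$where indexes rows of fit$frame; the row names are the node numbers.
leaf_id <- rownames(fit_iris$frame)[fit_iris$where]

# Cross-tabulate terminal node against the true species.
print(table(node = leaf_id, species = iris$Species))

# Hypothetical flower with Petal.Length > 2.45 and Petal.Width < 1.75.
new_flower <- data.frame(Sepal.Length = 6, Sepal.Width = 3,
                         Petal.Length = 5, Petal.Width = 1.5)
print(predict(fit_iris, newdata = new_flower, type = "class"))
```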

Exercise 7

Consider the car90 data frame (type help('car90') for more details).
1) Build a tree to predict Price from the other variables.
2) Plot the tree and add node information.
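
Because Price is numeric, this is a regression tree. A minimal sketch follows, with method = "anova" written out for clarity (rpart would normally choose it for a numeric response). Rows with a missing response are dropped by rpart's default NA handling.

```r
library(rpart)

# Regression tree: each node predicts the mean Price of its observations.
fit_car <- rpart(Price ~ ., data = car90, method = "anova")

plot(fit_car, margin = 0.1)
# For regression trees, text() labels each terminal node with its mean response.
text(fit_car, use.n = TRUE, cex = 0.7)
```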

Exercise 8

Consider the tree built in exercise 7.
1) Which variables are used to explain the price?
2) Which terminal nodes have a mean Price less than mean(car90$Price)?
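
For a regression tree, the yval column of the frame holds the mean response in each node, so the comparison can be sketched as below. The na.rm = TRUE argument is an assumption added in case Price contains missing values, in which case a plain mean() would return NA.

```r
library(rpart)

fit_car <- rpart(Price ~ ., data = car90, method = "anova")

frame <- fit_car$frame
is_leaf <- frame$var == "<leaf>"

# Variables that appear in at least one split.
print(unique(as.character(frame$var[!is_leaf])))

# Terminal nodes with their size (n) and mean Price (yval).
leaves <- frame[is_leaf, c("n", "yval")]

# Overall mean, ignoring missing prices (assumption, see the note above).
overall <- mean(car90$Price, na.rm = TRUE)

# Keep the terminal nodes whose mean Price is below the overall mean.
cheap <- leaves[leaves$yval < overall, ]
print(cheap)
```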

Exercise 9

Consider the car.test.frame data frame (type help('car.test.frame') for more details).
1) Build a tree to explain Mileage using the other variables.
2) Snip the tree at node number 2.
3) Plot both trees together.
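
The rpart package provides snip.rpart() for removing the subtree below chosen nodes. The sketch below passes the node number through the toss argument so that no interactive clicking is needed; the object names are illustrative.

```r
library(rpart)

fit_mileage <- rpart(Mileage ~ ., data = car.test.frame)

# toss = 2 removes everything below node 2, turning it into a terminal node.
snipped <- snip.rpart(fit_mileage, toss = 2)

# Draw the original and the snipped tree side by side.
old_par <- par(mfrow = c(1, 2))
plot(fit_mileage, margin = 0.1, main = "Original")
text(fit_mileage, cex = 0.7)
plot(snipped, margin = 0.1, main = "Snipped at node 2")
text(snipped, cex = 0.7)
par(old_par)
```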

Exercise 10

Consider the tree built in exercise 9.
What is the depth of the tree (with the root node counted as depth 0)?
Set the maximum depth of the final tree to 2.
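
rpart numbers nodes so that the children of node k are 2k and 2k + 1, which means the depth of a node can be recovered as floor(log2(node number)). The sketch below relies on that numbering and uses a hypothetical helper, node_depth; the depth limit itself is set through rpart.control(maxdepth = 2).

```r
library(rpart)

fit_mileage <- rpart(Mileage ~ ., data = car.test.frame)

# Hypothetical helper: depth of every node, with the root (node 1) at depth 0.
node_depth <- function(tree) {
  floor(log2(as.numeric(rownames(tree$frame))))
}

# Depth of the tree is the depth of its deepest node.
print(max(node_depth(fit_mileage)))

# Refit with the depth of any node capped at 2.
fit_shallow <- rpart(Mileage ~ ., data = car.test.frame,
                     control = rpart.control(maxdepth = 2))
print(max(node_depth(fit_shallow)))
```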
