Recursive Partitioning and Regression Trees Exercises

[This article was first published on R-exercises, and kindly contributed to R-bloggers.]

[For these exercises, we will use the rpart package. This is a beginner-level exercise set. Please refer to the rpart package help.]

Answers to the exercises are available here.

Exercise 1

Consider the kyphosis data frame (type help("kyphosis") for more details), which contains:
- Kyphosis: a factor with levels absent and present, indicating whether a kyphosis (a type of deformation) was present after the operation.
- Age: age in months.
- Number: the number of vertebrae involved.
- Start: the number of the first (topmost) vertebra operated on.

1) Build a tree to classify Kyphosis from Age, Number and Start.
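
As a starting point, a classification tree can be fitted with rpart() and a formula. The sketch below is one possible approach, not the official answer; the object name fit_kyphosis is only illustrative.

```r
library(rpart)

# kyphosis ships with rpart; method = "class" requests a classification tree
fit_kyphosis <- rpart(Kyphosis ~ Age + Number + Start,
                      data = kyphosis, method = "class")

# Printing the fitted object lists every node with its split, n and class
print(fit_kyphosis)
```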

Exercise 2

Consider the tree built in exercise 1.
1) Which variables are used to explain kyphosis presence?
2) How many observations does each terminal node contain?
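
Both questions can be read off the frame component of the fitted object, where each row is a node and terminal nodes are marked "<leaf>". A minimal sketch, assuming the same model as in exercise 1 (the names split_vars and leaves are illustrative):

```r
library(rpart)

fit_kyphosis <- rpart(Kyphosis ~ Age + Number + Start,
                      data = kyphosis, method = "class")

# Variables that appear in at least one split (leaf rows are excluded)
split_vars <- setdiff(unique(as.character(fit_kyphosis$frame$var)), "<leaf>")
print(split_vars)

# Observation counts of the terminal nodes; row names are the node numbers
leaves <- fit_kyphosis$frame[fit_kyphosis$frame$var == "<leaf>", "n", drop = FALSE]
print(leaves)
```

printcp(fit_kyphosis) reports the variables used in tree construction as well.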

Exercise 3

Consider the kyphosis data frame.
1) Build a tree using the first 60 observations of kyphosis.
2) Predict the kyphosis presence for the other 21 observations.
3) What is the misclassification rate (prediction error)?
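
The train/predict/evaluate steps above could be sketched as follows. The split by row position follows the exercise text; the names train, test, pred and error_rate are illustrative.

```r
library(rpart)

# First 60 rows for fitting, remaining 21 rows for evaluation
train <- kyphosis[1:60, ]
test  <- kyphosis[61:81, ]

fit_train <- rpart(Kyphosis ~ Age + Number + Start,
                   data = train, method = "class")

# type = "class" returns the predicted factor level rather than probabilities
pred <- predict(fit_train, newdata = test, type = "class")

# Confusion table and the share of wrongly classified test observations
print(table(observed = test$Kyphosis, predicted = pred))
error_rate <- mean(pred != test$Kyphosis)
print(error_rate)
```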

Exercise 4

Consider the iris data frame (type help("iris") for more details).
1) Build a tree to classify Species from the other variables.
2) Plot the tree and add node information.
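
In rpart, plotting is a two-step process: plot() draws the branches and text() adds the labels. A minimal sketch, with fit_iris as an illustrative name and a margin value chosen only so that labels are not clipped:

```r
library(rpart)

# "Species ~ ." uses every other column of iris as a candidate predictor
fit_iris <- rpart(Species ~ ., data = iris, method = "class")

# Draw the tree skeleton, then label splits and leaves;
# use.n = TRUE adds the per-class observation counts to each leaf
plot(fit_iris, margin = 0.1)
text(fit_iris, use.n = TRUE)
```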

Exercise 5

Consider the tree built in exercise 4.
Prune the tree using the median complexity parameter (cp) associated with the tree.
Plot the pruned and the original tree in the same window.
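
One way to read "median cp" is the median of the CP column of the fitted tree's cptable; that interpretation is an assumption here. Under it, pruning and side-by-side plotting might look like this:

```r
library(rpart)

fit_iris <- rpart(Species ~ ., data = iris, method = "class")

# Median of the complexity parameters recorded for this tree (assumed reading)
cp_median <- median(fit_iris$cptable[, "CP"])
pruned_iris <- prune(fit_iris, cp = cp_median)

# Two panels in one graphics window: original on the left, pruned on the right
op <- par(mfrow = c(1, 2))
plot(fit_iris, margin = 0.1, main = "Original")
text(fit_iris, use.n = TRUE)
plot(pruned_iris, margin = 0.1, main = "Pruned")
text(pruned_iris, use.n = TRUE)
par(op)  # restore the previous graphics settings
```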

Exercise 6

Consider the tree built in exercise 4.
1) In which terminal node is each observation of iris classified?
2) Which species has flowers with Petal.Length greater than 2.45 and Petal.Width less than 1.75?
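
The where component of an rpart object stores, for every observation, the row of frame holding its terminal node, which can be mapped to node numbers. The sketch below shows that lookup and then queries the tree with a made-up flower that satisfies the conditions in question 2; the measurements in new_flower are hypothetical values chosen for illustration.

```r
library(rpart)

fit_iris <- rpart(Species ~ ., data = iris, method = "class")

# fit$where indexes rows of fit$frame; the row names are the node numbers
leaf_node <- as.integer(row.names(fit_iris$frame))[fit_iris$where]
print(table(node = leaf_node, species = iris$Species))

# Hypothetical flower with Petal.Length > 2.45 and Petal.Width < 1.75
new_flower <- data.frame(Sepal.Length = 6, Sepal.Width = 3,
                         Petal.Length = 4, Petal.Width = 1.3)
print(predict(fit_iris, newdata = new_flower, type = "class"))
```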

Exercise 7

Consider the car90 data frame (type help("car90") for more details).
1) Build a tree to predict Price from the other variables.
2) Plot the tree and add node information.

Exercise 8

Consider the tree built in exercise 7.
1) Which variables are used to explain the price?
2) Which terminal nodes have a mean Price lower than mean(car90$Price)?
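
For a regression tree, the yval column of frame holds the mean response in each node, so it can be compared with the overall mean directly. A sketch under the assumption that the tree from exercise 7 is fitted with method = "anova" on all other columns; na.rm = TRUE is used as a precaution in case Price contains missing values.

```r
library(rpart)

# Regression tree: Price is numeric, so method = "anova" applies
fit_price <- rpart(Price ~ ., data = car90, method = "anova")

# Variables that appear in at least one split
used_vars <- setdiff(unique(as.character(fit_price$frame$var)), "<leaf>")
print(used_vars)

# Terminal nodes whose fitted mean Price is below the overall mean
overall_mean <- mean(car90$Price, na.rm = TRUE)
leaves <- fit_price$frame[fit_price$frame$var == "<leaf>", c("n", "yval")]
cheap_leaves <- leaves[leaves$yval < overall_mean, ]
print(cheap_leaves)
```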

Exercise 9

Consider the car.test.frame data frame (type help("car.test.frame") for more details).
1) Build a tree to explain Mileage using the other variables.
2) Snip the tree at node number 2.
3) Plot both trees together.
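
snip.rpart() removes the subtree below the node numbers passed in toss, turning those nodes into leaves. A minimal sketch of the three steps, with illustrative object names:

```r
library(rpart)

# Mileage is numeric, so rpart fits a regression tree by default
fit_mileage <- rpart(Mileage ~ ., data = car.test.frame)

# Cut off everything below node 2; supplying toss avoids interactive selection
snipped <- snip.rpart(fit_mileage, toss = 2)

# Original and snipped trees side by side
op <- par(mfrow = c(1, 2))
plot(fit_mileage, margin = 0.1, main = "Original")
text(fit_mileage, use.n = TRUE)
plot(snipped, margin = 0.1, main = "Snipped at node 2")
text(snipped, use.n = TRUE)
par(op)
```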

Exercise 10

Consider the tree built in exercise 9.
What is the depth of the tree (with the root node counted as depth 0)?
Set the maximum depth of the final tree to 2.
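
Because rpart numbers the children of node k as 2k and 2k + 1, the depth of a node can be derived from its number as floor(log2(k)). The sketch below uses that relation through a small helper, node_depth, which is a hypothetical name and not part of rpart, and then refits with rpart.control(maxdepth = 2).

```r
library(rpart)

fit_mileage <- rpart(Mileage ~ ., data = car.test.frame)

# Hypothetical helper: depth of every node, with the root (node 1) at depth 0
node_depth <- function(fit) {
  floor(log2(as.integer(row.names(fit$frame))))
}
print(max(node_depth(fit_mileage)))

# Refit with the depth of any node limited to 2
fit_shallow <- rpart(Mileage ~ ., data = car.test.frame,
                     control = rpart.control(maxdepth = 2))
print(max(node_depth(fit_shallow)))
```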
