Generating Nice Looking Tree Diagrams in R

[This article was first published on Mario Segal - Professional Site » R, and kindly contributed to R-bloggers.]

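This function generates nice-looking tree diagrams (see the sample below) from tree objects, that is, models fitted with the tree package. It draws the tree with ggplot2 via ggdendro, appends the split value to each internal node label, and adds the observation count (N) to each terminal node label.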

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    <http://www.gnu.org/licenses/>.
    Developed by Mario Segal

plotTree <- function(x) {
  # x: a fitted tree object from package tree

  library(ggdendro)  # dendro_data(), segment(), label(), theme_dendro()
  library(tree)
  library(ggplot2)
  library(plyr)      # round_any()

  # get tree data in a form suitable for ggdendro
  tree_data <- dendro_data(x)

  # get labels and splits and modify labels to include splits,
  # also add Ns for terminal nodes and apply formatting for splits
  splits <- x$frame$splits[, 1][as.numeric(row.names(tree_data$labels))]
  splits <- prettyNum(splits, big.mark = ",", scientific = FALSE, digits = 2, format = "f")
  ns <- x$frame$n[as.numeric(row.names(tree_data$leaf_labels))]

  # combine each node label with its split value on a second line
  tree_data$labels$label <- paste(as.character(tree_data$labels$label), splits, sep = "\n")
  # terminal node text, e.g. "setosa (N=50)"; the original label column is kept for coloring
  tree_data$leaf_labels$labels <- paste0(as.character(tree_data$leaf_labels$label), " (N=", ns, ")")

  # pad the y axis by 25% of the data range on each side, rounded outward to multiples of 10
  y_range <- max(tree_data$segments$y) - min(tree_data$segments$y)
  my_limits <- c(min(tree_data$segments$y) - 0.25 * y_range,
                 max(tree_data$segments$y) + 0.25 * y_range)
  my_limits[1] <- round_any(my_limits[1], 10, f = floor)
  my_limits[2] <- round_any(my_limits[2], 10, f = ceiling)

  # create the ggplot chart (parts of code copied from ggdendro documentation)
  plot_x <- ggplot(segment(tree_data)) +
    geom_segment(aes(x = x, y = y, xend = xend, yend = yend), colour = "blue", alpha = 0.5) +
    theme_dendro()
  plot_x <- plot_x +
    geom_text(data = label(tree_data), aes(x = x, y = y, label = label), vjust = -0.5, size = 3)
  plot_x <- plot_x +
    geom_text(data = tree_data$leaf_labels, aes(x = x, y = y, label = labels, color = label),
              vjust = 0.5, hjust = 1, size = 3, angle = 90)
  plot_x <- plot_x + scale_y_continuous(limits = my_limits) + theme(legend.position = "none")

  # colors for terminal node labels are optional but very nice; they can be custom defined as below
  # (one color per distinct terminal label, up to six here),
  # or if the next line is omitted they will be the ggplot default colors
  plot_x <- plot_x + scale_color_manual(values = c("#FFB300", "#007856", "#C3E76F", "#86499D", "#003359", "#AFAAA3"))

  print(plot_x)
  # return invisibly so the plot is not drawn a second time when the result is not assigned
  invisible(plot_x)
}
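
The y-axis limits in plotTree() are the range of the segment heights, padded by 25% on each side and rounded outward to multiples of 10 with plyr's round_any(). The sketch below reproduces that arithmetic in base R so it can be tried on its own. The helper name pad_limits and its pad and step arguments are made up for this illustration and are not part of the original function.

```r
# Hypothetical helper (not in the original post): pad a numeric range by a
# fraction of its span, then round the lower bound down and the upper bound
# up to the nearest multiple of `step`.
pad_limits <- function(y, pad = 0.25, step = 10) {
  span  <- max(y) - min(y)
  lower <- min(y) - pad * span
  upper <- max(y) + pad * span
  # base R equivalent of round_any(lower, step, f = floor) and
  # round_any(upper, step, f = ceiling)
  c(floor(lower / step) * step, ceiling(upper / step) * step)
}

pad_limits(c(0, 40))  # -10 50
pad_limits(c(3, 27))  # -10 40
```

The extra room at the bottom matters because the terminal node labels are drawn rotated by 90 degrees and right-justified, so they extend below the lowest segment.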

Sample output using a tree developed with the iris dataset
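
A minimal sketch of how a plot like this one might be produced. The post does not say which formula was used for the sample, so the model Species ~ . is an assumption, and the names iris_tree, tree_data and p are illustrative only. The call to plotTree() is guarded so the snippet also runs in a session where the function has not been defined yet.

```r
library(tree)      # tree()
library(ggdendro)  # dendro_data()

# Fit a classification tree for species on the four iris measurements
# (assumed formula; the original post does not state it)
iris_tree <- tree(Species ~ ., data = iris)

# Look at the pieces plotTree() works with: line segments,
# split labels for internal nodes and labels for terminal nodes
tree_data <- dendro_data(iris_tree)
names(tree_data)
head(tree_data$leaf_labels)

# Draw the diagram if plotTree() from this post is available in the session
if (exists("plotTree")) {
  p <- plotTree(iris_tree)
}
```

Because the function returns the ggplot object, further layers such as a title can be added to p afterwards with the usual + syntax.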

[Figure: Ex1]

