Change formatting for branch names in rpart plot

Time:09-24

When using rpart to create and plot trees, a number of functions can alter the final appearance, but there appears to be nothing built in that allows formatting the branch names. Below is an example of (A) what happens normally and (B) what happens when trying to alter the names using split.fun, along with the code that produces the plot.

library(rpart)       # rpart()
library(rpart.plot)  # rpart.plot()

test <- list()
test$tree <- rpart(Species ~ ., data = iris)
par(mfrow = c(1,2))
rpart.plot(test$tree, type=5, extra=2)
rpart.plot(test$tree, type=5, extra=2, split.fun = function(x, labs, digits, varlen, faclen){
  labs <- gsub(".", " ", labs)
  labs
})

Two trees, both wrong

What I am after is for the Petal.Length and Petal.Width to instead be displayed as Petal Length and Petal Width. Is there any code that can achieve this seemingly simple task?

CodePudding user response:

To get what you want I am offering a hack. Not pretty, but it does the job.

If you look at the tree structure, those labels come from test$tree$frame$var. So you can simply change those in the tree.

par(mfrow = c(1,2))
rpart.plot(test$tree, type=5, extra=2)
test$tree$frame$var = sub("\\.", " ", test$tree$frame$var)
rpart.plot(test$tree, type=5, extra=2)

Tree plot with and without periods in the variable names
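
A side note on the gsub call in the question, since it is easy to trip over: an unescaped "." is a regex wildcard in gsub, so that pattern matches every character rather than only periods. The answer's sub call escapes it as "\\.". Below is a minimal base-R sketch of the difference; the labs vector here is just an illustrative stand-in for the variable names, not what split.fun actually receives.

```r
# Illustrative stand-in for the variable names (an assumption, not rpart output).
labs <- c("Petal.Length", "Petal.Width")

# Unescaped "." matches any character, so every character is replaced by a space.
blanked <- gsub(".", " ", labs)

# Escape the dot, or pass fixed = TRUE, to match a literal period.
escaped <- gsub("\\.", " ", labs)
literal <- gsub(".", " ", labs, fixed = TRUE)

print(blanked)
print(escaped)
print(literal)
```

Note that sub replaces only the first period in each name, which is enough for the iris columns; gsub would be the choice for names with several periods.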
