I fitted a random forest model using the caret and ranger packages and obtained the importance of each variable with the varImp function.
imp1 <- varImp(RF, scale = FALSE)
imp1
plot(imp1, top = 10)
> imp1
ranger variable importance
Overall
precipitation2008_2018 2.1755
LST2008_2018 1.8931
Tmax2008_2018 1.5757
elevation2008_2018 1.4642
NDVI2008_2018 1.0231
settlement2008_2018 1.0011
road2008_2018 0.7081
slope2008_2018 0.7047
aspect2008_2018 0.4721
TWI2008_2018 0.3114
Because the variable importances do not add up to 1, I used the following code to normalize them. But when I tried to plot the normalized result, it no longer plotted correctly.
imp2 <- varImp(RF, scale = TRUE)[["importance"]]
imp2$Overall <- imp2$Overall / sum(imp2$Overall)
imp2
plot(imp2, top = 10)
> imp2
Overall
Tmax2008_2018 0.15389492
precipitation2008_2018 0.22691543
NDVI2008_2018 0.08663987
LST2008_2018 0.19253375
slope2008_2018 0.04787391
elevation2008_2018 0.14032420
aspect2008_2018 0.01956714
settlement2008_2018 0.08395931
road2008_2018 0.04829147
TWI2008_2018 0.00000000
> plot(imp2, top=10)
Warning messages:
1: In plot.window(xlim, ylim, log, ...) : "top" is not a graph parameter
2: In axis(side = side, at = at, labels = labels, ...) : "top" is not a graph parameter
3: In title(xlab = xlab, ylab = ylab, ...) : "top" is not a graph parameter
4: In plot.xy(xy.coords(x, y), type = type, ...) : "top" is not a figure parameter
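A side note on the numbers above, offered as a sketch rather than a definitive account: with scale = TRUE, varImp rescales the raw importances to the range 0 to 100 before returning them, so the least important variable (here TWI2008_2018) becomes exactly 0, and the shares computed afterwards differ from the shares of the raw values. The snippet below uses the raw values from dput(imp1); the min-max formula is an assumption about the rescaling, and the names raw, scaled, share_scaled and share_raw are purely illustrative.

```r
# Raw (unscaled) importances, copied from dput(imp1)
raw <- c(
  Tmax2008_2018          = 1.57565252079602,
  precipitation2008_2018 = 2.17553558745997,
  NDVI2008_2018          = 1.0231341483824,
  LST2008_2018           = 1.89308085981913,
  slope2008_2018         = 0.704661369152856,
  elevation2008_2018     = 1.46416539340723,
  aspect2008_2018        = 0.472113627124259,
  settlement2008_2018    = 1.00111266231049,
  road2008_2018          = 0.708091766750006,
  TWI2008_2018           = 0.31136434852237
)

# Assumed rescaling for scale = TRUE: min-max to the range 0..100
scaled <- (raw - min(raw)) / (max(raw) - min(raw)) * 100

# Shares of the rescaled values (compare with imp2) and of the raw values
share_scaled <- scaled / sum(scaled)
share_raw    <- raw / sum(raw)

round(cbind(share_scaled, share_raw), 4)
```

If the goal is for each variable's share to reflect its raw importance, dividing the scale = FALSE values by their sum is probably the more natural choice, since no variable is forced to zero.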
I then checked the structure of both objects and found that imp2 is no longer the same kind of object as imp1:
str(imp1)
> str(imp1)
List of 3
$ importance:'data.frame': 10 obs. of 1 variable:
..$ Overall: num [1:10] 1.576 2.176 1.023 1.893 0.705 ...
$ model : chr "ranger"
$ calledFrom: chr "varImp"
- attr(*, "class")= chr "varImp.train"
str(imp2)
> str(imp2)
'data.frame': 10 obs. of 1 variable:
$ Overall: num 0.1539 0.2269 0.0866 0.1925 0.0479 ...
> dput(imp1)
structure(list(importance = structure(list(Overall = c(1.57565252079602,
2.17553558745997, 1.0231341483824, 1.89308085981913, 0.704661369152856,
1.46416539340723, 0.472113627124259, 1.00111266231049, 0.708091766750006,
0.31136434852237)), class = "data.frame", row.names = c("Tmax2008_2018",
"precipitation2008_2018", "NDVI2008_2018", "LST2008_2018", "slope2008_2018",
"elevation2008_2018", "aspect2008_2018", "settlement2008_2018",
"road2008_2018", "TWI2008_2018")), model = "ranger", calledFrom = "varImp"), class = "varImp.train")
> dput(imp2)
structure(list(Overall = c(0.153894924595083, 0.226915428412731,
0.0866398674611716, 0.192533750275507, 0.0478739077536024, 0.140324202793608,
0.0195671356037942, 0.0839593117043193, 0.048291471400184, 0)), row.names = c("Tmax2008_2018",
"precipitation2008_2018", "NDVI2008_2018", "LST2008_2018", "slope2008_2018",
"elevation2008_2018", "aspect2008_2018", "settlement2008_2018",
"road2008_2018", "TWI2008_2018"), class = "data.frame")
I want to draw a variable importance plot from imp2, similar to the one for imp1 above. How should I do it? Thanks.
CodePudding user response:
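You can make the modification inside imp1 itself, so that the object keeps its varImp.train class. imp2 is a plain data.frame, so plot() no longer uses caret's plot method for varImp.train objects and falls back to the base R method for data frames, which does not know the top argument. That is why it doesn't plot as before. You can use the following code: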
imp1 <- structure(list(importance = structure(list(Overall = c(1.57565252079602,
2.17553558745997, 1.0231341483824, 1.89308085981913, 0.704661369152856,
1.46416539340723, 0.472113627124259, 1.00111266231049, 0.708091766750006,
0.31136434852237)), class = "data.frame", row.names = c("Tmax2008_2018",
"precipitation2008_2018", "NDVI2008_2018", "LST2008_2018", "slope2008_2018",
"elevation2008_2018", "aspect2008_2018", "settlement2008_2018",
"road2008_2018", "TWI2008_2018")), model = "ranger", calledFrom = "varImp"), class = "varImp.train")
imp1$importance$Overall <- imp1$importance$Overall/sum(imp1$importance$Overall)
plot(imp1)
Output: (variable importance plot; image not included)
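If you would rather plot the imp2 data frame directly, base graphics can draw a similar chart without relying on the varImp.train class. This is a minimal sketch, not caret's own plotting code: dotchart() is a base R function, while top_n, ord and keep are hypothetical names introduced here, with top_n standing in for caret's top argument.

```r
# Normalized importances, copied from dput(imp2)
imp2 <- structure(list(Overall = c(0.153894924595083, 0.226915428412731,
0.0866398674611716, 0.192533750275507, 0.0478739077536024, 0.140324202793608,
0.0195671356037942, 0.0839593117043193, 0.048291471400184, 0)), row.names = c("Tmax2008_2018",
"precipitation2008_2018", "NDVI2008_2018", "LST2008_2018", "slope2008_2018",
"elevation2008_2018", "aspect2008_2018", "settlement2008_2018",
"road2008_2018", "TWI2008_2018"), class = "data.frame")

top_n <- 10                    # hypothetical stand-in for caret's `top`

# dotchart() draws the first element at the bottom, so sort ascending
# and keep the last top_n entries to get the largest value at the top
ord  <- order(imp2$Overall)
keep <- tail(ord, top_n)

dotchart(imp2$Overall[keep],
         labels = rownames(imp2)[keep],
         xlab = "Importance (share of total)",
         pch = 19)
```

Lowering top_n shows only the most important variables, much as top = 10 does for a varImp.train object.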