Plot about Machine Learning Variable Importance

Time:08-13

I fitted a random forest using the caret and ranger packages and obtained the importance of each variable with the varImp function.

imp1<- varImp(RF,scale=FALSE)
imp1
plot(imp1, top = 10)

> imp1
ranger variable importance

                       Overall
precipitation2008_2018  2.1755
LST2008_2018            1.8931
Tmax2008_2018           1.5757
elevation2008_2018      1.4642
NDVI2008_2018           1.0231
settlement2008_2018     1.0011
road2008_2018           0.7081
slope2008_2018          0.7047
aspect2008_2018         0.4721
TWI2008_2018            0.3114

[Figure: variable importance plot for imp1 (top 10 variables)]

Because the importance values do not add up to 1, I rescaled them with the following code. But when I tried to plot the result, plot() no longer produced the importance plot.

imp2 <- varImp(RF,scale=TRUE)[['importance']]
imp2$Overall <- imp2$Overall/sum(imp2$Overall)
imp2 
plot(imp2 ,top=10)
> imp2
                          Overall
Tmax2008_2018          0.15389492
precipitation2008_2018 0.22691543
NDVI2008_2018          0.08663987
LST2008_2018           0.19253375
slope2008_2018         0.04787391
elevation2008_2018     0.14032420
aspect2008_2018        0.01956714
settlement2008_2018    0.08395931
road2008_2018          0.04829147
TWI2008_2018           0.00000000

> plot(imp2, top=10)
Warning messages:
1: In plot.window(xlim, ylim, log, ...) : "top" is not a graphical parameter
2: In axis(side = side, at = at, labels = labels, ...) : "top" is not a graphical parameter
3: In title(xlab = xlab, ylab = ylab, ...) : "top" is not a graphical parameter
4: In plot.xy(xy.coords(x, y), type = type, ...) : "top" is not a graphical parameter

I found that the class of the object has changed: imp1 is a varImp.train object, while imp2 is a plain data.frame.

str(imp1)
> str(imp1)
List of 3
 $ importance:'data.frame': 10 obs. of  1 variable:
  ..$ Overall: num [1:10] 1.576 2.176 1.023 1.893 0.705 ...
 $ model     : chr "ranger"
 $ calledFrom: chr "varImp"
 - attr(*, "class")= chr "varImp.train"

str(imp2)
> str(imp2)
'data.frame':   10 obs. of  1 variable:
 $ Overall: num  0.1539 0.2269 0.0866 0.1925 0.0479 ...

> dput(imp1)
structure(list(importance = structure(list(Overall = c(1.57565252079602, 
2.17553558745997, 1.0231341483824, 1.89308085981913, 0.704661369152856, 
1.46416539340723, 0.472113627124259, 1.00111266231049, 0.708091766750006, 
0.31136434852237)), class = "data.frame", row.names = c("Tmax2008_2018", 
"precipitation2008_2018", "NDVI2008_2018", "LST2008_2018", "slope2008_2018", 
"elevation2008_2018", "aspect2008_2018", "settlement2008_2018", 
"road2008_2018", "TWI2008_2018")), model = "ranger", calledFrom = "varImp"), class = "varImp.train")


> dput(imp2)
structure(list(Overall = c(0.153894924595083, 0.226915428412731, 
0.0866398674611716, 0.192533750275507, 0.0478739077536024, 0.140324202793608, 
0.0195671356037942, 0.0839593117043193, 0.048291471400184, 0)), row.names = c("Tmax2008_2018", 
"precipitation2008_2018", "NDVI2008_2018", "LST2008_2018", "slope2008_2018", 
"elevation2008_2018", "aspect2008_2018", "settlement2008_2018", 
"road2008_2018", "TWI2008_2018"), class = "data.frame")

I want to draw a variable importance plot from imp2, similar to the one for imp1 above. How can I do this? Thanks.

CodePudding user response:

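Rescale the values inside imp1 itself, so that the object keeps its varImp.train class. imp2 is a plain data.frame, so plot() dispatches to the data.frame method instead of caret's method for varImp.train objects; that method does not know the top argument, which is why you get the warnings. You can use the following code: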

imp1 <- structure(list(
  importance = structure(
    list(Overall = c(1.57565252079602, 2.17553558745997, 1.0231341483824,
                     1.89308085981913, 0.704661369152856, 1.46416539340723,
                     0.472113627124259, 1.00111266231049, 0.708091766750006,
                     0.31136434852237)),
    class = "data.frame",
    row.names = c("Tmax2008_2018", "precipitation2008_2018", "NDVI2008_2018",
                  "LST2008_2018", "slope2008_2018", "elevation2008_2018",
                  "aspect2008_2018", "settlement2008_2018", "road2008_2018",
                  "TWI2008_2018")),
  model = "ranger", calledFrom = "varImp"),
  class = "varImp.train")

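# Rescale in place so the importances sum to 1; imp1 keeps its varImp.train class
imp1$importance$Overall <- imp1$importance$Overall/sum(imp1$importance$Overall)
# caret must be loaded so that plot() finds the varImp.train method
plot(imp1)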
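One difference from the question is worth noting: the code above divides the raw (scale = FALSE) importances by their sum, whereas imp2 in the question started from scale = TRUE. As far as I can tell, scale = TRUE applies a min-max rescaling to the 0-100 range, which pins the least important variable (TWI2008_2018) at exactly 0. The sketch below uses only the numbers from the post and an assumed min-max formula; it is an illustration of the arithmetic, not a call into caret.

```r
# Raw importances copied from dput(imp1) in the question
raw <- c(Tmax2008_2018 = 1.57565252079602,
         precipitation2008_2018 = 2.17553558745997,
         NDVI2008_2018 = 1.0231341483824,
         LST2008_2018 = 1.89308085981913,
         slope2008_2018 = 0.704661369152856,
         elevation2008_2018 = 1.46416539340723,
         aspect2008_2018 = 0.472113627124259,
         settlement2008_2018 = 1.00111266231049,
         road2008_2018 = 0.708091766750006,
         TWI2008_2018 = 0.31136434852237)

# Shares of the raw importances (what the answer's code computes)
share_raw <- raw / sum(raw)

# Assumed equivalent of scale = TRUE: min-max rescaling to 0-100,
# followed by the same division by the sum (what imp2 in the question holds)
scaled <- (raw - min(raw)) / (max(raw) - min(raw)) * 100
share_scaled <- scaled / sum(scaled)

# Side-by-side comparison of the two normalisations
print(round(cbind(share_raw, share_scaled), 4))
```

Both columns sum to 1, but only share_raw keeps the original ratios between the variables, so choose the one that matches what you want to report.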

Output:

[Figure: variable importance plot for the rescaled imp1]
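If you would rather keep imp2 as a plain data frame, base R can draw a comparable chart without the varImp.train class. This is a minimal sketch, assuming graphics::dotchart() is an acceptable stand-in for caret's plot; top_n and top_imp are illustrative names, not part of caret.

```r
# imp2 rebuilt from the dput() output in the question
imp2 <- data.frame(
  Overall = c(0.153894924595083, 0.226915428412731, 0.0866398674611716,
              0.192533750275507, 0.0478739077536024, 0.140324202793608,
              0.0195671356037942, 0.0839593117043193, 0.048291471400184, 0),
  row.names = c("Tmax2008_2018", "precipitation2008_2018", "NDVI2008_2018",
                "LST2008_2018", "slope2008_2018", "elevation2008_2018",
                "aspect2008_2018", "settlement2008_2018", "road2008_2018",
                "TWI2008_2018"))

top_n <- 10  # hypothetical counterpart of the top argument

# Sort by importance (largest first) and keep the top_n rows;
# drop = FALSE keeps the one-column data frame and its row names
ord <- order(imp2$Overall, decreasing = TRUE)
top_imp <- head(imp2[ord, , drop = FALSE], top_n)

# dotchart() places the first element at the bottom,
# so reverse the order to put the most important variable at the top
dotchart(rev(top_imp$Overall),
         labels = rev(rownames(top_imp)),
         xlab = "Importance (share of total)")
```

The trade-off is that you style the chart yourself, but nothing depends on the internal structure of the caret object.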
