Error: Missing data in columns: pop when running random forest regression using the ranger package

Time:01-21

I am trying to fit a random forest (RF) regression with the ranger package in R, but the predict function fails with this error: Error: Missing data in columns: pop (pop is my independent variable; in the reproducible example below the corresponding column is covar).

For reference, with the randomForest package I can pass the argument na.action = na.omit to exclude the NA values, but I can't find a way to do this in ranger.

library(terra)
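# Use the first two layers of terra's example raster as response and predictor
s <- rast(system.file("ex/logo.tif", package="terra"))[[1:2]]
names(s) <- c("ntl", "covar")
# Set rows 10 to 20 to NA to reproduce the missing-data problem
s[10:20, ] <- NA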
 
library(ranger)
m <- ranger(ntl~., data=as.data.frame(s, na.rm=TRUE), mtry=1)
p <- predict(s, m)
#Error: Missing data in columns: covar.
#In addition: Warning message:
#In lapply(r, as.numeric) : NAs introduced by coercion

CodePudding user response:

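Pass na.rm=TRUE to predict(). Cells that have an NA in any layer of the raster are then skipped instead of being handed to the model, and they come out as NA in the prediction raster.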

library(terra)
#terra 1.6.53
s <- rast(system.file("ex/logo.tif", package="terra"))[[1:2]]
names(s) <- c("ntl", "covar")
s[10:20, ] <- NA

library(ranger)
m <- ranger(ntl~., data=as.data.frame(s, na.rm=TRUE), mtry=1)
p <- predict(s, m, na.rm=TRUE)

p
#class       : SpatRaster 
#dimensions  : 77, 101, 1  (nrow, ncol, nlyr)
#resolution  : 1, 1  (x, y)
#extent      : 0, 101, 0, 77  (xmin, xmax, ymin, ymax)
#coord. ref. : Cartesian (Meter) 
#source(s)   : memory
#name        :  prediction 
#min value   :   0.2525767 
#max value   : 254.6400884 
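
If you want to see what na.rm=TRUE amounts to, or your terra version behaves differently, the same masking can be done by hand. The following is only a sketch under the assumptions of the example above: the response layer is called ntl, the object names predictors, ok, pred and p2 are made up for illustration, and the whole raster fits in memory.

```r
library(terra)
library(ranger)

# Same toy data as above
s <- rast(system.file("ex/logo.tif", package = "terra"))[[1:2]]
names(s) <- c("ntl", "covar")
s[10:20, ] <- NA

m <- ranger(ntl ~ ., data = as.data.frame(s, na.rm = TRUE), mtry = 1)

# All cell values as a data frame, one row per cell, NA rows included
v <- values(s, dataframe = TRUE)

# Rows where every predictor column is present ("ntl" is the response)
predictors <- setdiff(names(v), "ntl")
ok <- complete.cases(v[, predictors, drop = FALSE])

# Predict only for complete rows; all other cells stay NA
pred <- rep(NA_real_, nrow(v))
pred[ok] <- predict(m, data = v[ok, predictors, drop = FALSE])$predictions

# Write the predictions into an empty single-layer raster with the same geometry
p2 <- rast(s, nlyrs = 1)
values(p2) <- pred
names(p2) <- "prediction"
```

Here p2 should have NA in rows 10 to 20 and predicted values elsewhere. Calling ranger's own predict() on a plain data frame also avoids any dependence on how terra unpacks the list that ranger returns.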