Re-express categorical field values using R

Time:11-05

I have a dataset with a column called education, which holds several category names. I want to replace those names with numeric values. After doing that, the new column in the dataset shows NA.

Here is my attempt:

library(plyr)                 #Load plyr package 

edu.num <- revalue(x = bank_train$education,replace = 
                     c("illiterate" = 0,
                       "basic.4y" = 4,
                       "basic.6y" = 6,
                       "basic.9y" = 9,
                       "high.school" = 12,
                       "professional.course" = 12,
                       "university.degree" = 16,
                       "unknown" = NA))
bank_train$education_numeric <- as.numeric(levels(edu.num))[edu.num]



CodePudding user response:

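The revalue function does not return a factor here but a character vector: it returns the same type it is given, and education appears to be a character column. So levels(edu.num) returns NULL, since the levels function only applies to factors. as.numeric(NULL) is an empty numeric vector, and indexing it with edu.num gives NA for every row.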
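
A small illustration of that behaviour, as a sketch: the vector below is a made-up stand-in for edu.num, assuming education is stored as character rather than factor.

```r
# Hypothetical stand-in for edu.num (character, not factor)
edu.num <- c("4", "12", "16", NA)

# Character vectors have no levels attribute, so this is NULL
levels(edu.num)

# as.numeric(NULL) is an empty numeric vector
as.numeric(levels(edu.num))

# Indexing an empty vector by character names yields NA for every element
broken <- as.numeric(levels(edu.num))[edu.num]
broken

# Converting the character vector directly keeps the values
fixed <- as.numeric(edu.num)
fixed
```

If education were a factor instead, revalue would return a factor and the original as.numeric(levels(edu.num))[edu.num] idiom would be the appropriate one.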

So you only need to change the last line of the code:

library(plyr)                 # Load plyr package

edu.num <- revalue(x = bank_train$education,replace = 
                 c("illiterate" = 0,
                   "basic.4y" = 4,
                   "basic.6y" = 6,
                   "basic.9y" = 9,
                   "high.school" = 12,
                   "professional.course" = 12,
                   "university.degree" = 16,
                   "unknown" = NA))
bank_train$education_numeric <- as.numeric(edu.num)
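
As an alternative sketch that avoids plyr, the same mapping can be done in base R by indexing a named numeric vector. The sample data and the names education, edu_years and education_numeric are made up for illustration; only the mapping itself comes from the question.

```r
# Hypothetical sample standing in for bank_train$education
education <- c("basic.4y", "university.degree", "unknown",
               "high.school", "illiterate")

# Named numeric lookup table (same mapping as in the question)
edu_years <- c("illiterate" = 0,
               "basic.4y" = 4,
               "basic.6y" = 6,
               "basic.9y" = 9,
               "high.school" = 12,
               "professional.course" = 12,
               "university.degree" = 16,
               "unknown" = NA)

# Look up each value by name; as.character() guards against a factor column,
# and unname() drops the category names from the result
education_numeric <- unname(edu_years[as.character(education)])
education_numeric
```

Because the lookup table is numeric from the start, the result needs no as.numeric conversion, and the approach behaves the same whether the column is a factor or a character vector.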