A proper function to convert some factors in a dataset into numeric variables

I have this dataset in which every column is a factor:

questions = data.frame(sex = c(rep("M",30),rep("F",30)),
                       do_you_like_playing_football = rep(c("yes","no","dk"),20),
                       do_you_like_playing_basketball = rep(c("yes","no","dk"),20))

library(dplyr)  # provides %>% and mutate_if()
questions = questions %>% mutate_if(is.character, factor)

I want to convert the second and the third variables into numeric variables (yes=1, no=5, dk=8) while preserving the dataset format.

What is a proper way to do that? Thanks.

CodePudding user response:

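You could use a named lookup vector:

cols = c("do_you_like_playing_football", "do_you_like_playing_basketball")
lut  = c(yes=1, no=5, dk=8)

# Convert to character first: indexing with a factor uses its integer level
# codes (dk=1, no=2, yes=3), not its labels, and would return the wrong values.
questions[cols] = lapply(questions[cols], \(x) unname(lut[as.character(x)]))

Or, as an alternative to the line above, more concisely with magrittr's assignment pipe:

library(magrittr)
questions[cols] %<>% lapply(\(x) unname(lut[as.character(x)]))

Result

head(questions)
#   sex do_you_like_playing_football do_you_like_playing_basketball
# 1   M                            1                              1
# 2   M                            5                              5
# 3   M                            8                              8
# 4   M                            1                              1
# 5   M                            5                              5
# 6   M                            8                              8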
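
One caveat worth checking with any lookup-vector approach: R treats a factor used as an index as its integer level codes, not its labels. The sketch below is a minimal base R illustration, not the only way to do this. It rebuilds the example data with stringsAsFactors = TRUE (an assumption made here to avoid the dplyr dependency), shows the difference between indexing by factor and by label, and then checks the converted columns. The names original, wrong, right and f are introduced only for this illustration.

```r
# Rebuild the example data; stringsAsFactors = TRUE makes every column a factor.
questions <- data.frame(
  sex = c(rep("M", 30), rep("F", 30)),
  do_you_like_playing_football = rep(c("yes", "no", "dk"), 20),
  do_you_like_playing_basketball = rep(c("yes", "no", "dk"), 20),
  stringsAsFactors = TRUE
)

cols <- c("do_you_like_playing_football", "do_you_like_playing_basketball")
lut  <- c(yes = 1, no = 5, dk = 8)

# Keep the original labels so the result can be checked afterwards.
original <- lapply(questions[cols], as.character)

# Pitfall: a factor index is used as integer codes. Levels sort as dk, no, yes,
# so "yes" has code 3 and picks the third element of lut (the value for dk).
f <- factor(c("yes", "no", "dk"))
wrong <- unname(lut[f])                 # indexes by code
right <- unname(lut[as.character(f)])   # indexes by label

# Convert the selected columns by label; the other columns are left untouched.
questions[cols] <- lapply(
  questions[cols],
  function(x) unname(lut[as.character(x)])
)

# Check each answer against its intended numeric code.
for (col in cols) {
  stopifnot(
    all(questions[[col]][original[[col]] == "yes"] == 1),
    all(questions[[col]][original[[col]] == "no"]  == 5),
    all(questions[[col]][original[[col]] == "dk"]  == 8)
  )
}
```

Here wrong comes out as 8 5 1 and right as 1 5 8 for the inputs yes, no, dk, which is why the as.character() step matters. An answer that is missing from lut would become NA rather than raise an error, so a quick anyNA() check on the converted columns may be worthwhile on real data.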