I have the following data:
df_1 <- data.frame(var = c("A_new_1", "B_new_2", "A_old_1", "B_old_2"),
code = 001, desc = c('applied', 'not applied', 'applied','applied'))
> df_1
var code desc
1 A_new_1 1 applied
2 B_new_2 1 not applied
3 A_old_1 1 applied
4 B_old_2 1 applied
I would like to replace A_new_1 with 1_new_A, and so on. I want to use dplyr and replace all the values at once. This is what I tried:
p <- list(c('1_new_A'), c('2_new_B'), c('1_old_A'), c('2_old_B'))
ptr <- list(c('A_new_1'), c('B_new_2'), c('A_old_1'), c('B_old_2'))
df_1 %>% mutate(var = (sapply(1:4, function(i){ gsub(ptr[[i]], p[[i]], var)})))
This is what I get:
var.1 var.2 var.3 var.4 code desc
1 1_new_A A_new_1 A_new_1 A_new_1 1 applied
2 B_new_2 B_new_2 B_new_2 B_new_2 1 not applied
3 A_old_1 A_old_1 A_old_1 A_old_1 1 applied
4 B_old_2 B_old_2 B_old_2 2_old_B 1 applied
My questions:
- Why is sapply returning a matrix?
- How do I correct the above solution so that it returns a single character vector (var) in the input data frame?
CodePudding user response:
This might be simpler. First fix your example:
p <- c('1_new_A', '2_new_B', '1_old_A', '2_old_B')
ptr <- c('A_new_1', 'B_new_2', 'A_old_1', 'B_old_2') # Fixing typos
df_1$var <- p[match(df_1$var, ptr)]
df_1
# var code desc
# 1 1_new_A 1 applied
# 2 2_new_B 1 not applied
# 3 1_old_A 1 applied
# 4 2_old_B 1 applied
Or in dplyr:
df_1 %>% mutate(var=p[match(var, ptr)])
# var code desc
# 1 1_new_A 1 applied
# 2 2_new_B 1 not applied
# 3 1_old_A 1 applied
# 4 2_old_B 1 applied
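One thing to keep in mind with the match() lookup: any value that has no entry in ptr comes back as NA. The sketch below uses a made-up value, "C_new_3", that is not in the original data, to show this and one possible way to keep unmatched values unchanged.

```r
p   <- c("1_new_A", "2_new_B", "1_old_A", "2_old_B")
ptr <- c("A_new_1", "B_new_2", "A_old_1", "B_old_2")

# "C_new_3" is a hypothetical value with no entry in ptr
x <- c("A_new_1", "C_new_3", "B_old_2")

idx <- match(x, ptr)   # position of each value in ptr, NA when absent
replaced <- p[idx]     # unmatched values become NA

# Fall back to the original value wherever there was no match
kept <- ifelse(is.na(idx), x, p[idx])
```

If every value in the column is guaranteed to appear in ptr, as in the question, the fallback is not needed.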
CodePudding user response:
A better option using dplyr is probably the recode() function. For example:
translate <- setNames(unlist(p), unlist(ptr))
df_1 %>% mutate(var = recode(var, !!!translate ))
Basically, you just need to create a named vector for the translations (old values as names, new values as values) and then splice it into recode() with !!!.
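To make the direction of the mapping explicit, here is a small base R sketch of what translate contains, using the p and ptr lists from the question. It only inspects the named vector and does not call recode() itself.

```r
p   <- list(c("1_new_A"), c("2_new_B"), c("1_old_A"), c("2_old_B"))
ptr <- list(c("A_new_1"), c("B_new_2"), c("A_old_1"), c("B_old_2"))

# setNames(values, names): the new strings are the values,
# the old strings are the names used for lookup
translate <- setNames(unlist(p), unlist(ptr))

names(translate)            # the old values
translate[["A_new_1"]]      # the new value for "A_new_1"

# The same named vector also works as a plain lookup table in base R
old <- c("A_new_1", "B_new_2", "A_old_1", "B_old_2")
new <- unname(translate[old])
```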
sapply() is returning a matrix because you are looping over the values 1:4, so one result is returned for each number from 1 to 4. Inside mutate(), var is the entire column, so each call to gsub() returns a vector of four values rather than a single value, and sapply() combines those four vectors into a matrix.
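The behaviour can be reproduced outside mutate(). The following is a minimal base R sketch, assuming the data from the question, that shows the 4 x 4 matrix and one possible way to keep the gsub() approach while ending up with a single vector: apply the substitutions one after another to the same vector. The name new_var is introduced here only for illustration.

```r
df_1 <- data.frame(var  = c("A_new_1", "B_new_2", "A_old_1", "B_old_2"),
                   code = 1,
                   desc = c("applied", "not applied", "applied", "applied"))
p   <- c("1_new_A", "2_new_B", "1_old_A", "2_old_B")
ptr <- c("A_new_1", "B_new_2", "A_old_1", "B_old_2")

# Each gsub() call returns a length-4 vector, so sapply() binds the
# four results together as the columns of a 4 x 4 character matrix
m <- sapply(1:4, function(i) gsub(ptr[i], p[i], df_1$var))
dim(m)

# Applying each substitution in turn to the same vector keeps one vector
new_var <- df_1$var
for (i in seq_along(ptr)) {
  new_var <- gsub(ptr[i], p[i], new_var, fixed = TRUE)
}
df_1$var <- new_var
```

This sequential version is safe here because no replacement string matches a later pattern; for exact whole-value replacements, the match() or recode() approaches above are simpler.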