sapply returns a matrix instead of character vector

Time:12-02

I have the following data:

df_1 <- data.frame(var =  c("A_new_1",  "B_new_2",  "A_old_1",  "B_old_2"), 
code = 001, desc = c('applied', 'not applied', 'applied','applied'))
> df_1
      var code        desc
1 A_new_1    1     applied
2 B_new_2    1 not applied
3 A_old_1    1     applied
4 B_old_2    1     applied

I would like to replace A_new_1 with 1_new_A, and so on. I want to use dplyr and replace all values at once. This is what I tried:

p <- list(c('1_new_A'), c('2_new_B'), c('1_old_A'), c('2_old_B'))

ptr <- list(c('A_new_1'), c('B_new_2'), c('A_old_1'), c('B_old_2'))

df_1 %>% mutate(var  = (sapply(1:4, function(i){ gsub(ptr[[i]], p[[i]], var)})))

This is what I get:

    var.1   var.2   var.3   var.4 code        desc
1 1_new_A A_new_1 A_new_1 A_new_1    1     applied
2 B_new_2 B_new_2 B_new_2 B_new_2    1 not applied
3 A_old_1 A_old_1 A_old_1 A_old_1    1     applied
4 B_old_2 B_old_2 B_old_2 2_old_B    1     applied

My questions:

  1. Why is sapply returning a matrix?
  2. How do I correct the above solution so that it returns a single character vector (var) in the input data frame?

CodePudding user response:

This might be simpler. First fix your example:

p <- c('1_new_A', '2_new_B', '1_old_A', '2_old_B')
ptr <- c('A_new_1', 'B_new_2', 'A_old_1', 'B_old_2') # Fixing typos
df_1$var <- p[match(df_1$var, ptr)]
df_1
#       var code        desc
# 1 1_new_A    1     applied
# 2 2_new_B    1 not applied
# 3 1_old_A    1     applied
# 4 2_old_B    1     applied

Or in dplyr:

df_1 %>% mutate(var=p[match(var, ptr)])
#       var code        desc
# 1 1_new_A    1     applied
# 2 2_new_B    1 not applied
# 3 1_old_A    1     applied
# 4 2_old_B    1     applied
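
One caveat with the match() lookup: any value of var that does not appear in ptr becomes NA. A minimal sketch of a fallback that keeps unmatched values unchanged, where x and its extra value "C_new_3" are hypothetical and not part of the question's data:

```r
p   <- c('1_new_A', '2_new_B', '1_old_A', '2_old_B')
ptr <- c('A_new_1', 'B_new_2', 'A_old_1', 'B_old_2')

# Hypothetical input containing one value that is not in ptr
x <- c("A_new_1", "C_new_3", "B_old_2")

idx <- match(x, ptr)                   # NA where x has no entry in ptr
out <- ifelse(is.na(idx), x, p[idx])   # keep the original value when unmatched
out
```
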

CodePudding user response:

A better option using dplyr is probably the recode() function. For example:

translate <- setNames(unlist(p), unlist(ptr))
df_1 %>% mutate(var  = recode(var, !!!translate ))

Basically, you just need to create a named vector of translations (old values as the names, new values as the elements) and then splice it into recode() with !!!.
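
To see what the translation object looks like, here is a small base R sketch using the p and ptr lists from the question. Indexing the named vector by the old values is shown only as an illustration of the lookup idea, it is not what recode() does internally:

```r
p   <- list(c('1_new_A'), c('2_new_B'), c('1_old_A'), c('2_old_B'))
ptr <- list(c('A_new_1'), c('B_new_2'), c('A_old_1'), c('B_old_2'))

# Names are the old values, elements are the new values
translate <- setNames(unlist(p), unlist(ptr))

var <- c("A_new_1", "B_new_2", "A_old_1", "B_old_2")

# Look up each old value by name; unname() drops the old values again
new_var <- unname(translate[var])
new_var
```
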

sapply() returns a matrix because you are looping over the values 1:4, so one result is returned for each number from 1 to 4. Inside mutate(), var is the entire column, so each call to gsub() returns a vector of four values rather than a single value. sapply() then simplifies those four length-4 vectors into a 4 x 4 matrix.
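
The shape can be checked outside of mutate(). The sketch below reproduces the matrix on a plain character vector and then shows one possible way to keep the gsub() approach: apply the substitutions one after another to the same vector with Reduce(). This is an illustration only, and fixed = TRUE is an assumption that the patterns are literal strings rather than regular expressions:

```r
var <- c("A_new_1", "B_new_2", "A_old_1", "B_old_2")
p   <- c("1_new_A", "2_new_B", "1_old_A", "2_old_B")
ptr <- c("A_new_1", "B_new_2", "A_old_1", "B_old_2")

# Each iteration returns a length-4 vector, so sapply() builds a 4 x 4 matrix
m <- sapply(1:4, function(i) gsub(ptr[i], p[i], var))
dim(m)

# Chain the replacements: the result of one gsub() feeds the next,
# so a single character vector comes out at the end
fixed_var <- Reduce(
  function(x, i) gsub(ptr[i], p[i], x, fixed = TRUE),
  seq_along(ptr),
  var
)
fixed_var
```

Chaining works here because no replacement value matches a later pattern; if that could happen, the match() or recode() approaches above are safer.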
