`str_replace_all` numeric values in column according to named vector

Time:04-21

I want to use a named vector to map numeric values of a data frame column.

Consider the following example:

library(tidyverse)  # for %>%, add_row(), mutate() and str_replace_all()
df <- data.frame(year = seq(2000, 2004, 1), value = sample(11:15, replace = TRUE)) %>%
    add_row(year = 2005, value = 1)

df
#   year value
# 1 2000    12
# 2 2001    15
# 3 2002    11
# 4 2003    12
# 5 2004    14
# 6 2005     1

I now want to replace the values according to a named vector like this one:

repl_vec <- c("1"="apple", "11"="radish", "12"="tomato", "13"="cucumber", "14"="eggplant", "15"="carrot")

which I attempt with:

df %>% mutate(val_alph = str_replace_all(value, repl_vec))

However, this gives:

  #   year value     val_alph
  # 1 2000    11   appleapple
  # 2 2001    13       apple3
  # 3 2002    15       apple5
  # 4 2003    12       apple2
  # 5 2004    14       apple4
  # 6 2005     1        apple

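This happens because str_replace_all replaces every partial match rather than only whole-string matches: the pattern "1" is applied first and rewrites each 1 inside the two-digit values. In the real data, the names of the named vector are also numbers (one and two digits).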
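
The output above is consistent with the patterns being applied one at a time, each to the result of the previous step. The sketch below reproduces that behaviour with base R's gsub instead of stringr itself. The helper name replace_sequentially is made up for illustration, and the loop only approximates what str_replace_all does; it is not its actual implementation.

```r
# Hypothetical helper: apply each pattern in turn, like a chain of gsub() calls.
replace_sequentially <- function(x, repl) {
  x <- as.character(x)
  for (pattern in names(repl)) {
    x <- gsub(pattern, repl[[pattern]], x)
  }
  x
}

repl_vec <- c("1" = "apple", "11" = "radish", "12" = "tomato",
              "13" = "cucumber", "14" = "eggplant", "15" = "carrot")

# "1" is tried first and consumes every 1, so "11" and "13" never get to match.
partial <- replace_sequentially(c(11, 13, 1), repl_vec)
print(partial)
```

Reordering the vector so the two-digit names come first would probably hide the problem for this data, but it stays fragile; matching the whole value is the more robust fix.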

I expect the output to be like this:

  #   year value     val_alph
  # 1 2000    11     radish
  # 2 2001    13   cucumber
  # 3 2002    15     carrot
  # 4 2003    12     tomato
  # 5 2004    14   eggplant
  # 6 2005     1      apple

Does someone have a clever way of achieving this?

CodePudding user response:

I would use base R's match instead of string matching here, since you are looking for exact whole string matches.

df %>%
 mutate(value = repl_vec[match(value, names(repl_vec))])
#>   year    value
#> 1 2000   radish
#> 2 2001   carrot
#> 3 2002   carrot
#> 4 2003 cucumber
#> 5 2004 eggplant
#> 6 2005    apple

Created on 2022-04-20 by the reprex package (v2.0.1)
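
A closely related option, sketched here under the assumption that the lookup should be an exact match on the whole value, is to index the named vector by name directly. The value vector below is copied from the question's expected output plus nothing else; unname() is only there to drop the names that the lookup carries along.

```r
repl_vec <- c("1" = "apple", "11" = "radish", "12" = "tomato",
              "13" = "cucumber", "14" = "eggplant", "15" = "carrot")
value <- c(11, 13, 15, 12, 14, 1)   # example values taken from the question

# Character indexing looks names up exactly, so "1" can never match inside "11".
by_name  <- unname(repl_vec[as.character(value)])
by_match <- unname(repl_vec[match(value, names(repl_vec))])

# A value without an entry becomes NA instead of being partially replaced.
missing_value <- unname(repl_vec[as.character(99)])

print(by_name)
```

Both lookups should return the same result; inside mutate() either expression can be used on the value column in the same way.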

CodePudding user response:

Is this what you want to do?

set.seed(1234)
df <- data.frame(year = seq(2000, 2004, 1), value = sample(11:15, replace = TRUE)) %>%
  add_row(year = 2005, value = 1)

repl_vec <- c("1"="one", "11"="eleven", "12"="twelve", "13"="thirteen", "14"="fourteen", "15"="fifteen")
names(repl_vec) <- paste0("\\b", names(repl_vec), "\\b")

df %>%
  mutate(val_alph = str_replace_all(value, repl_vec))

which gives:

  year value val_alph
1 2000    14 fourteen
2 2001    12   twelve
3 2002    15  fifteen
4 2003    14 fourteen
5 2004    11   eleven
6 2005     1      one
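
As a variation on the word-boundary idea, the patterns can be anchored with ^ and $ so that each one only matches when it spans the entire string. This is a minimal sketch that assumes stringr is installed; the name anchored and the extra value 99 are introduced here purely for illustration.

```r
library(stringr)

repl_vec <- c("1" = "apple", "11" = "radish", "12" = "tomato",
              "13" = "cucumber", "14" = "eggplant", "15" = "carrot")

# Anchor every pattern so it only matches when it spans the whole string.
anchored <- setNames(repl_vec, paste0("^", names(repl_vec), "$"))

value <- c(11, 13, 15, 12, 14, 1, 99)   # 99 has no entry in repl_vec
val_alph <- str_replace_all(as.character(value), anchored)
print(val_alph)
```

One difference from the match approach is worth keeping in mind: a value with no entry in the vector is left unchanged (as a string) rather than turned into NA.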
