How do you mutate a new column based on an existing character column when you need more than 2 outco

Time:10-29

So, basically, from a table such as this, with a column containing six different character values:

Subject  Name 
         <chr> 
   1      a
   2      b
   3      c
   4      d
   5      e
   6      f
   7      b 
  etc.   etc. 

*7 out of 1000 rows*

I want to use mutate or a similar function to create a new character column based on these six groups in "Name", so that the new table looks like this:

Subject  Name    New column
         <chr>      <chr>
   1      a          Hi
   2      b          Hello
   3      c          Sup
   4      d          Yo
   5      e          Hullo
   6      f          Yosha
   7      b          Hello
  etc.   etc.        etc. 
*7 out of 1000 rows*

I have tried using the if function like this:

mutate("New column" = if(Name %in% "a") {
"Hi"
}  
else if(Name %in% "b"){
"Hello"
}  
else if(Name %in% "c") {
"Sup"
}     
else if(Name %in% "d") {
"Yo"
}    
else if(Name %in% "e") {
"Hullo"
}  
else if(name %in% "f") {
"Yosha"
})  

But I can't quite get it to work. Some help would be really appreciated.

CodePudding user response:

I think the easiest way to do this if you only have a small number of substitutions is:

library(dplyr)  # provides mutate() and %>%

lookup <- c(a = "Hi", b = "Hello", c = "Sup", d = "Yo", e = "Hullo", f = "Yosha")

# df is defined in the "Data" section below
df %>% mutate(New_Column = lookup[Name])

#>   Subject Name New_Column
#> 1       1    a         Hi
#> 2       2    b      Hello
#> 3       3    c        Sup
#> 4       4    d         Yo
#> 5       5    e      Hullo
#> 6       6    f      Yosha
#> 7       7    b      Hello

Data (taken from question)

df <- structure(list(Subject = 1:7, Name = c("a", "b", "c", "d", "e", 
                     "f", "b")), class = "data.frame", row.names = c(NA, -7L))

df
#>   Subject Name
#> 1       1    a
#> 2       2    b
#> 3       3    c
#> 4       4    d
#> 5       5    e
#> 6       6    f
#> 7       7    b
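
As a side note, the named-vector lookup used in this answer can be sketched in base R on its own, which may help show what happens when a value has no entry. The vector `name` and the value "z" are hypothetical and only there for illustration; this is a minimal sketch, not a replacement for the answer's code.

```r
# Named character vector used as a lookup table (values from the answer above).
lookup <- c(a = "Hi", b = "Hello", c = "Sup", d = "Yo", e = "Hullo", f = "Yosha")

# Hypothetical input: "z" is deliberately absent from the lookup.
name <- c("a", "b", "z", "f")

# Indexing by name is vectorised, so no if/else chain is needed.
# unname() drops the names that the result would otherwise inherit.
greeting <- unname(lookup[name])

greeting
```

Here the unmatched "z" comes back as NA rather than raising an error. If the column is a factor rather than character, it is probably safer to index with as.character(Name), since indexing by a factor uses its integer codes.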

CodePudding user response:

You can try case_when() from dplyr. The last row, starting with TRUE, is a catch-all that assigns a missing value to new_col when name is not one of a-f.

library(tidyverse)

dat <- data.frame(
  subject = 1:6,
  name = letters[1:6]
)


dat |> 
  mutate(new_col = case_when(
    name == "a" ~ "Hi",
    name == "b" ~ "Hello",
    name == "c" ~ "Sup",
    name == "d" ~ "Yo",
    name == "e" ~ "Hullo",
    name == "f" ~ "Yosha",
    TRUE ~ NA_character_
  ))

#   subject name new_col
#        1    a      Hi
#        2    b   Hello
#        3    c     Sup
#        4    d      Yo
#        5    e   Hullo
#        6    f   Yosha
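
To see the catch-all row in action, here is a small sketch with a value outside a-f. The data frame `dat2` and the value "z" are made up for illustration, and only two of the six conditions are repeated to keep it short.

```r
library(dplyr)

# Hypothetical data: "z" is not one of the handled names.
dat2 <- data.frame(subject = 1:3, name = c("a", "f", "z"))

res <- dat2 |>
  mutate(new_col = case_when(
    name == "a" ~ "Hi",
    name == "f" ~ "Yosha",
    TRUE ~ NA_character_  # catch-all: anything not matched above becomes NA
  ))

res$new_col
```

The third element of new_col should be NA, because "z" matches none of the earlier conditions and falls through to the TRUE row.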
Tags: r