Home > Mobile >  Create a new variable based in conditions in R
Create a new variable based in conditions in R

Time:02-17

I am trying to create (mutate) a new variable in a database but I am having some problems. My idea is to create a variable that replicates the result of another variable in the database conditioned on the value of a previous variable.

Let me show an example:

Database
var1var2var3    var4
1   45  AH67    A456
1   78  GH98    D788
5   46  GD94    M747
5   98  GF21    G589
10  47  GD09    I989
10  54  KG32    U456
15  48  FS89    C191
15  66  GF23    Y198

My idea is:

  • If var1=1, new=var2
  • If var1=5, new=var3
  • If var1=10, new=NA.
  • If var1=15, new=var4

Expected result:

var1var2var3    var4    new
1   45  AH67    A456    45
1   78  GH98    D788    78
5   46  GD94    M747    GD94
5   98  GF21    G589    GD94
10  47  GD09    I989    na
10  54  KG32    U456    na
15  48  FS89    C191    C191
15  66  GF23    Y198    Y198

I tried this but it didn’t work out.

data<-mutate(data,
             ifelse(var1==1){New=var2},
             ifelse(var1==5){New=var3},
             ifelse(var1==10){New=NA},
             ifelse(var1==15){New=var4})

Thanks in advance,

CodePudding user response:

The syntax of mutate is mutate(.data, new_col = something). Also, since you have multiple condition based on var1, you'll need to have a nested ifelse. Therefore, the correct way of doing it is:

library(dplyr)

df %>% mutate(new = ifelse(var1 == 1, 
                           var2, 
                           ifelse(var1 == 5, 
                                  var3, 
                                  ifelse(var1 == 10, 
                                         NA, 
                                         ifelse(var1 == 15, 
                                                var4, 
                                                NA)))))

Or use dplyr::case_when, which is way clearer in the syntax:

df %>% mutate(new = case_when(var1 == 1 ~ as.character(var2),
                              var1 == 5 ~ var3,
                              var1 == 15 ~ var4))

Output

The output of the above two methods are the same.

  var1 var2 var3 var4  new
1    1   45 AH67 A456   45
2    1   78 GH98 D788   78
3    5   46 GD94 M747 GD94
4    5   98 GF21 G589 GF21
5   10   47 GD09 I989 <NA>
6   10   54 KG32 U456 <NA>
7   15   48 FS89 C191 C191
8   15   66 GF23 Y198 Y198

CodePudding user response:

library(dplyr)
data %>%  
  mutate(new = case_when(var1 == 1 ~ as.character(var2),
                         var1 == 5 ~ var3,
                         var1 == 10 ~ as.character(NA),
                         var1 == 15 ~ var4))

Output

# A tibble: 8 x 5
   var1  var2 var3  var4  new  
  <int> <int> <chr> <chr> <chr>
1     1    45 AH67  A456  45   
2     1    78 GH98  D788  78   
3     5    46 GD94  M747  GD94 
4     5    98 GF21  G589  GF21 
5    10    47 GD09  I989  NA   
6    10    54 KG32  U456  NA   
7    15    48 FS89  C191  C191 
8    15    66 GF23  Y198  Y198 
  • Related