Ifelse condition across multiple variables using paste0 function to call up variables

Time:12-01

I want to use an ifelse condition across multiple variables using paste0("Var",c(1,3,5)) to call up variables. Here is some data.

set.seed(123)
df <- data.frame(Var1 = sample(1:5,10,replace = T),
                 Var2 = sample(1:5,10,replace = T),
                 Var3 = sample(1:5,10,replace = T),
                 Var4 = sample(1:5,10,replace = T),
                 Var5 = sample(1:5,10,replace = T))
 df
   Var1 Var2 Var3 Var4 Var5
1     3    5    2    1    5
2     3    3    1    1    5
3     2    3    3    2    4
4     2    1    4    3    5
5     3    4    1    4    2
6     5    1    3    5    1
7     4    1    5    5    1
8     1    5    4    3    3
9     2    3    2    1    1
10    3    2    5    2    5

As an example, I'm interested in Var1, Var3 and Var5. Using ifelse, if the value is 4 or 5, the new variable gets the value 1, otherwise 0. I'm using this code to get the names of the variables I'm interested in.

paste0( "Var", c(1,3,5)  ) 
[1] "Var1" "Var3" "Var5"

I tried both of the following and know they don't work, but is it possible to write code similar to this?

newvar <- ifelse( paste0("Var", c(1,3,5)) %in% c(4,5) , 1, 0) 
newvar <- ifelse( df[ , paste0( "Var", c(1:3)  )  ] %in% c(4,5) , 1, 0) 

Any help greatly appreciated. Thanks

EDITED: Apologies if I wasn't clear. I want the ifelse across multiple variables to create a single variable. "newvar" is a single variable; I didn't want to create a separate ifelse result for each variable (newvar1, newvar3, newvar5). If any of those variables contains a 4 or 5, newvar is 1, otherwise 0. Thanks
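To illustrate why the two attempts above cannot work, here is a small sketch. The data frame is typed in from the printed table so it does not depend on the random number generator. The names attempt1 and attempt2 are made up for this illustration. The first attempt compares the column names (character strings) with 4 and 5. The second hands a whole data frame to %in%, which appears to treat it as a list of columns, so both return one value per column rather than one per row.

```r
# Data copied from the printed table in the question
df <- data.frame(
  Var1 = c(3, 3, 2, 2, 3, 5, 4, 1, 2, 3),
  Var2 = c(5, 3, 3, 1, 4, 1, 1, 5, 3, 2),
  Var3 = c(2, 1, 3, 4, 1, 3, 5, 4, 2, 5),
  Var4 = c(1, 1, 2, 3, 4, 5, 5, 3, 1, 2),
  Var5 = c(5, 5, 4, 5, 2, 1, 1, 3, 1, 5)
)

v <- paste0("Var", c(1, 3, 5))

# Attempt 1: tests the strings "Var1", "Var3", "Var5" against 4 and 5,
# so the columns of df are never looked at
attempt1 <- ifelse(v %in% c(4, 5), 1, 0)

# Attempt 2: %in% receives a data frame and compares whole columns,
# giving one result per column instead of one per row
attempt2 <- ifelse(df[, v] %in% c(4, 5), 1, 0)

length(attempt1)  # 3, not nrow(df)
length(attempt2)  # 3, not nrow(df)
```

Either way the result has three elements instead of ten, which is why the test has to be applied column by column and then combined row by row.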

CodePudding user response:

Use sapply:

v <- paste0("Var", c(1, 3, 5)) # or v <- c(1, 3, 5)
cbind(df, new = +sapply(df[v], `%in%`, 4:5)) # unary + turns TRUE/FALSE into 1/0

giving:

   Var1 Var2 Var3 Var4 Var5 new.Var1 new.Var3 new.Var5
1     3    5    2    1    5        0        0        1
2     3    3    1    1    5        0        0        1
3     2    3    3    2    4        0        0        1
4     2    1    4    3    5        0        1        1
5     3    4    1    4    2        0        0        0
6     5    1    3    5    1        1        0        0
7     4    1    5    5    1        1        1        0
8     1    5    4    3    3        0        1        0
9     2    3    2    1    1        0        0        0
10    3    2    5    2    5        0        1        1

Added

Regarding the comment below this answer, try any of these. They all use v as defined in the next line, except for the last solution.

v <- paste0("Var", c(1, 3, 5)) # or v <- c(1, 3, 5)
transform(df, newvar = do.call("pmax", lapply(df[v], `%in%`, 4:5)))

transform(df, newvar = apply(df[v] == 4 | df[v] == 5, 1, max))

transform(df, newvar = apply(df[v], 1, function(x) +any(x %in% 4:5)))

transform(df, newvar = pmax(Var1 %in% 4:5, Var3 %in% 4:5, Var5 %in% 4:5))

any of which give:

   Var1 Var2 Var3 Var4 Var5 newvar
1     3    5    2    1    5      1
2     3    3    1    1    5      1
3     2    3    3    2    4      1
4     2    1    4    3    5      1
5     3    4    1    4    2      0
6     5    1    3    5    1      1
7     4    1    5    5    1      1
8     1    5    4    3    3      1
9     2    3    2    1    1      0
10    3    2    5    2    5      1

CodePudding user response:

Here's a solution based on the tidyverse.

library(tidyverse)

df %>% 
  mutate(
    across(
      c(Var1, Var3, Var5), 
      ~ifelse(.x %in% c(4, 5), 1, 0), 
      .names="new{.col}"
    )
  )
   Var1 Var2 Var3 Var4 Var5 newVar1 newVar3 newVar5
1     3    5    2    1    5       0       0       1
2     3    3    1    1    5       0       0       1
3     2    3    3    2    4       0       0       1
4     2    1    4    3    5       0       1       1
5     3    4    1    4    2       0       0       0
6     5    1    3    5    1       1       0       0
7     4    1    5    5    1       1       1       0
8     1    5    4    3    3       0       1       0
9     2    3    2    1    1       0       0       0
10    3    2    5    2    5       0       1       1

The across function runs the function defined in its second argument on each of the columns defined by its first argument. The .names argument is optional and provides names for the new columns (as opposed to overwriting the originals). {.col} is a placeholder for the name of the current column.
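The across() call above creates one new column per variable. For the single newvar asked for in the edited question, one possible sketch uses if_any(), which combines a per-column test into one logical value per row. This assumes dplyr 1.0.4 or later, where if_any() is available. The data frame is typed in from the printed table, and v is the paste0() vector from the question.

```r
library(dplyr)

# Data copied from the printed table in the question
df <- data.frame(
  Var1 = c(3, 3, 2, 2, 3, 5, 4, 1, 2, 3),
  Var2 = c(5, 3, 3, 1, 4, 1, 1, 5, 3, 2),
  Var3 = c(2, 1, 3, 4, 1, 3, 5, 4, 2, 5),
  Var4 = c(1, 1, 2, 3, 4, 5, 5, 3, 1, 2),
  Var5 = c(5, 5, 4, 5, 2, 1, 1, 3, 1, 5)
)

v <- paste0("Var", c(1, 3, 5))

# all_of(v) selects the columns named in the character vector v;
# if_any() is TRUE for a row when the test holds in at least one of them
result <- df %>%
  mutate(newvar = as.integer(if_any(all_of(v), ~ .x %in% c(4, 5))))

result
```

Using all_of(v) keeps the paste0() approach from the question, so the set of columns can be changed without rewriting the mutate() call.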

CodePudding user response:

Here's a base R solution close to what OP tried:

df$newVar <- sapply(df[,c(1,3,5)], function(x) ifelse(x == 4|x == 5, 1, 0))

or, even closer:

df$newVar <- sapply(df[,c(1,3,5)], function(x) ifelse(x %in% c(4,5), 1, 0))
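Note that the sapply() calls above return a matrix with one column per selected variable, so df$newVar ends up holding three indicator columns. To collapse them into the single variable described in the edited question, a minimal sketch could sum the indicators per row. The data frame is typed in from the printed table, and the name hits is made up for this illustration.

```r
# Data copied from the printed table in the question
df <- data.frame(
  Var1 = c(3, 3, 2, 2, 3, 5, 4, 1, 2, 3),
  Var2 = c(5, 3, 3, 1, 4, 1, 1, 5, 3, 2),
  Var3 = c(2, 1, 3, 4, 1, 3, 5, 4, 2, 5),
  Var4 = c(1, 1, 2, 3, 4, 5, 5, 3, 1, 2),
  Var5 = c(5, 5, 4, 5, 2, 1, 1, 3, 1, 5)
)

v <- paste0("Var", c(1, 3, 5))

# 10 x 3 logical matrix: TRUE where the value is 4 or 5
hits <- sapply(df[v], function(x) x %in% c(4, 5))

# A row gets 1 if at least one of the selected variables is 4 or 5
df$newvar <- ifelse(rowSums(hits) > 0, 1, 0)

df
```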

CodePudding user response:

If you want to use both paste0() and ifelse(), here is an example:

x <- paste0("Var", c(1,3,5))
# one indicator column per selected variable
newvar <- matrix(NA, nrow(df), length(x))

for (j in seq_along(x)) {
  newvar[, j] <- ifelse(df[, x[j]] %in% c(4,5), 1, 0)
}

# newvar is 1 if any of the selected variables is 4 or 5
final.df <- data.frame(df,
                       newvar = ifelse(apply(newvar, 1, sum) > 0, 1, 0))
final.df

   Var1 Var2 Var3 Var4 Var5 newvar
1     3    5    2    1    5      1
2     3    3    1    1    5      1
3     2    3    3    2    4      1
4     2    1    4    3    5      1
5     3    4    1    4    2      0
6     5    1    3    5    1      1
7     4    1    5    5    1      1
8     1    5    4    3    3      1
9     2    3    2    1    1      0
10    3    2    5    2    5      1
  •  Tags:  
  • r