Case when by group in Mutate in R

Time:11-11

I am new to R, so I need help with this transformation. I have two tables:

x1 <- c(7, 4, 4, 9, 2, 5, 8)
x2 <- c(5, 2, 8, 9, 1, 3, 2)
x3 <- c(6, 2, 3, 4, 2, 2, 7)
objid <- c(1, 2, 3, 4, 4, 2, 4)
data_1 <- data.frame(objid, x1, x2, x3)

and a second table:

x1_r <- c(1.54,0.23, 1.32, 11.66)
x2_r <- c(1.14,1.23, 9.32, 1.26)
x3_r <- c(1.58,0.23, 7.32, 7.66)
objid <- c(1, 2, 3, 4)
data_2 <- data.frame(objid, x1_r, x2_r, x3_r)

What I am trying to do in R is a CASE WHEN comparison matched by the group "objid", something like this:

CASE WHEN [x1] <=[x1_r] THEN 1 ELSE 0
CASE WHEN [x2] <=[x2_r] THEN 1 ELSE 0
CASE WHEN [x3] <=[x3_r] THEN 1 ELSE 0

And generate new columns with the results, one row per row of data_1:

objid x1 x2 x3 x1_r_fin x2_r_fin x3_r_fin
    1  7  5  6        0        0        0
    2  4  2  2        0        0        0
    3  4  8  3        0        0        1
    4  9  9  4        1        0        1
    4  2  1  2        1        1        1
    2  5  3  2        0        0        0
    4  8  2  7        1        0        1

With mutate I have applied this method:

df %>% mutate_at(vars(-matches("objid")), list(Dif = ~ . - x1))

That does a simple subtraction and generates new columns with new names in df. I want to do the same for the comparison described above, but have no clue how. Or is there a better and more efficient method? Thanks for your help!

CodePudding user response:

Here is an approach using dplyr case_when:

library(tidyverse)

data_1 %>%
  inner_join(data_2, by = "objid") %>%  # attach each objid's reference values
  mutate(
    x1_r_fin = case_when(x1 <= x1_r ~ 1, TRUE ~ 0),  # 1 if at or below the reference
    x2_r_fin = case_when(x2 <= x2_r ~ 1, TRUE ~ 0),
    x3_r_fin = case_when(x3 <= x3_r ~ 1, TRUE ~ 0)
  ) %>%
  select(-c(x1_r, x2_r, x3_r))  # drop the reference columns again
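Since the question asks for something in the spirit of mutate_at, here is a minimal sketch that handles all three columns in one call with across(). It assumes the reference columns always follow the "<name>_r" pattern and that a reasonably recent dplyr (1.0 or later) is installed; the name result is just for illustration.

```r
library(dplyr)

# Sample data from the question
data_1 <- data.frame(
  objid = c(1, 2, 3, 4, 4, 2, 4),
  x1 = c(7, 4, 4, 9, 2, 5, 8),
  x2 = c(5, 2, 8, 9, 1, 3, 2),
  x3 = c(6, 2, 3, 4, 2, 2, 7)
)
data_2 <- data.frame(
  objid = c(1, 2, 3, 4),
  x1_r = c(1.54, 0.23, 1.32, 11.66),
  x2_r = c(1.14, 1.23, 9.32, 1.26),
  x3_r = c(1.58, 0.23, 7.32, 7.66)
)

result <- data_1 %>%
  inner_join(data_2, by = "objid") %>%
  mutate(across(
    c(x1, x2, x3),
    # compare each column with its "<name>_r" partner (assumed naming pattern)
    ~ as.integer(.x <= .data[[paste0(cur_column(), "_r")]]),
    .names = "{.col}_r_fin"
  )) %>%
  select(-c(x1_r, x2_r, x3_r))  # drop the reference columns

print(result)
```

With the sample data, the objid 3 row gets x2_r_fin = 1 (because 8 <= 9.32), which differs from the 0 shown in the question's expected table.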

For a single yes/no condition, ifelse() is also quite readable:

data_1 <- merge(data_1, data_2, by = "objid")  # note: merge() sorts the rows by objid
data_1$x1_r_fin <- ifelse(data_1$x1 <= data_1$x1_r, 1, 0)
data_1$x2_r_fin <- ifelse(data_1$x2 <= data_1$x2_r, 1, 0)
data_1$x3_r_fin <- ifelse(data_1$x3 <= data_1$x3_r, 1, 0)
data_1$x1_r <- NULL
data_1$x2_r <- NULL
data_1$x3_r <- NULL
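If the list of columns grows, the repeated lines can be avoided in base R too. A rough sketch with Map(), assuming every compared column has a partner named "<name>_r"; the names merged, vars, ref_cols and flag_cols are made up for this example.

```r
# Sample data from the question
data_1 <- data.frame(
  objid = c(1, 2, 3, 4, 4, 2, 4),
  x1 = c(7, 4, 4, 9, 2, 5, 8),
  x2 = c(5, 2, 8, 9, 1, 3, 2),
  x3 = c(6, 2, 3, 4, 2, 2, 7)
)
data_2 <- data.frame(
  objid = c(1, 2, 3, 4),
  x1_r = c(1.54, 0.23, 1.32, 11.66),
  x2_r = c(1.14, 1.23, 9.32, 1.26),
  x3_r = c(1.58, 0.23, 7.32, 7.66)
)

merged <- merge(data_1, data_2, by = "objid")  # rows come back sorted by objid

vars      <- c("x1", "x2", "x3")
ref_cols  <- paste0(vars, "_r")      # assumed naming pattern of the reference columns
flag_cols <- paste0(vars, "_r_fin")

# Compare each column with its reference column, pair by pair
merged[flag_cols] <- unname(Map(
  function(value, limit) as.integer(value <= limit),
  merged[vars],
  merged[ref_cols]
))

merged[ref_cols] <- NULL  # drop the reference columns
print(merged)
```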

You can also use ifelse() inside mutate(), like so:

... %>% mutate(x1_r_fin = ifelse(x1 <= x1_r, 1, 0)) %>% ...

CodePudding user response:

You can use the ifelse() function from base R. The two datasets have different numbers of rows, which is why I merged them first.

new_data = merge(data_1, data_2, all = TRUE)

Then, using ifelse(), you can create new variables:

new_data$new_variable1 = ifelse(new_data$x1 <= new_data$x1_r, 1, 0)

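You can also return a string from ifelse(). Note that mixing a string with the number 0 turns the result into a character vector, so the 0 becomes "0":

new_data$new_variable2 = ifelse(new_data$x1 <= new_data$x1_r, "group2", 0)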
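To see what happens to the types when ifelse() mixes a string and a number, here is a tiny illustration. The two-element vectors are made-up values, not the full data from the question.

```r
# Made-up two-row example: the first comparison is FALSE, the second is TRUE
x1   <- c(7, 2)
x1_r <- c(1.54, 11.66)

# Mixing a string and a number: the result is coerced to character
mixed <- ifelse(x1 <= x1_r, "group2", 0)
print(mixed)         # the numeric 0 shows up as the string "0"
print(class(mixed))

# Keeping both branches the same type avoids the silent coercion
labels <- ifelse(x1 <= x1_r, "group2", "other")
print(labels)
```

If you need a numeric flag and a label, it is probably cleaner to keep them in two separate columns.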

Then, if you want to drop the reference columns (positions 5 to 7) from new_data, with dplyr loaded:

new_data = new_data %>% select(-5, -6, -7)

Or, as @NovaEthos mentioned in the comments, you can use the case_when() function:

new_data$new_variable1 = case_when(new_data$x1 <= new_data$x1_r ~ "group 1")

However, rows that match no condition become NA, so you should handle those NAs afterwards.
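One way to avoid the NAs in the first place is to give case_when() a catch-all branch. A small sketch with made-up values; the label "other" is only a placeholder.

```r
library(dplyr)

# Made-up three-row example: only the middle comparison is TRUE
x1   <- c(7, 2, 5)
x1_r <- c(1.54, 11.66, 0.23)

# Without a fallback, unmatched rows become NA
without_default <- case_when(x1 <= x1_r ~ "group 1")
print(without_default)

# TRUE ~ ... acts as the catch-all branch, so no NA is produced
with_default <- case_when(
  x1 <= x1_r ~ "group 1",
  TRUE       ~ "other"   # placeholder label for everything else
)
print(with_default)
```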
