Check if a value exists in a group and flag it; if it does not exist, flag another value

Time:11-04

I have an example df:

df <- data.frame(
  group = c("a", "a", "a", "a", "b", "b", "c", "c", "c", "c", "d", "d", "d"),
  col1 = c(-36,10,-5,1, 0, 5,10, 5, 20, 2, -1, 1, 2 )
)
   group col1
1      a  -36
2      a   10
3      a   -5
4      a    1
5      b    0
6      b    5
7      c   10
8      c    5
9      c   20
10     c    2
11     d   -1
12     d    1
13     d    2

I want to derive a flag column such that, within each 'group', if col1 contains a value of 1, that row gets flag = Y. If the group has no value of 1 in col1, the row with the highest col1 value gets flag = Y instead.

I tried the logic below, but I don't know how to stop the second condition from being applied in a group once the first condition has been met:

library(dplyr)

df <- df %>%
  group_by(group) %>%
  mutate(flag = case_when(
    col1 == 1 ~ "Y",
    col1 == max(col1) ~ "Y",
    TRUE ~ ""
  ))

Expected output:

   group col1 flag
1      a  -36     
2      a   10     
3      a   -5     
4      a    1    Y
5      b    0     
6      b    5    Y
7      c   10     
8      c    5     
9      c   20    Y
10     c    2     
11     d   -1     
12     d    1    Y
13     d    2     

CodePudding user response:

One possible solution:

library(dplyr)

df %>%
  group_by(group) %>%
  mutate(flag = if(any(col1==1)) ifelse(col1==1, "Y", "") 
                else ifelse(col1==max(col1), "Y", ""))

# A tibble: 13 x 3
# Groups:   group [4]
   group  col1 flag 
   <chr> <dbl> <chr>
 1 a       -36 ""   
 2 a        10 ""   
 3 a        -5 ""   
 4 a         1 "Y"  
 5 b         0 ""   
 6 b         5 "Y"  
 7 c        10 ""   
 8 c         5 ""   
 9 c        20 "Y"  
10 c         2 ""   
11 d        -1 ""   
12 d         1 "Y"  
13 d         2 ""
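
For comparison, the same per-group rule can be sketched in base R without dplyr. This is only an illustration under the assumption that the flag should follow exactly the if/else logic above; the intermediate name `is_flag` is made up for this example. `ave()` applies the function within each group and returns the results in the original row order.

```r
# Example data from the question
df <- data.frame(
  group = c("a", "a", "a", "a", "b", "b", "c", "c", "c", "c", "d", "d", "d"),
  col1 = c(-36, 10, -5, 1, 0, 5, 10, 5, 20, 2, -1, 1, 2)
)

# Per group: if any value equals 1, mark those rows;
# otherwise mark the rows holding the group maximum.
# ave() stores the logical result as numeric 0/1 because col1 is numeric.
is_flag <- ave(df$col1, df$group, FUN = function(x) {
  if (any(x == 1)) x == 1 else x == max(x)
})

# Convert the 0/1 marker into the "Y" / "" flag used in the question
df$flag <- ifelse(is_flag == 1, "Y", "")

print(df)
```

On the example data this flags rows 4, 6, 9 and 12, matching the expected output in the question.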

CodePudding user response:

You can try the code below

df %>%
  group_by(group) %>%
  mutate(flag = +((1:n()) == which.max(2 * (col1 == 1) + (col1 == max(col1))))) %>%
  ungroup()

which gives

# A tibble: 13 × 3
   group  col1  flag
   <chr> <dbl> <int>
 1 a       -36     0
 2 a        10     0
 3 a        -5     0
 4 a         1     1
 5 b         0     0
 6 b         5     1
 7 c        10     0
 8 c         5     0
 9 c        20     1
10 c         2     0
11 d        -1     0
12 d         1     1
13 d         2     0
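
One difference between the two answers only shows up with ties, which the example data does not contain. The sketch below uses made-up data (`ties`, with hypothetical groups "e" and "f") and base R's `ave()` to mimic both rules: the if/else rule marks every matching row, while the `which.max()` rule marks only the first row with the best score.

```r
# Hypothetical data with ties: group "e" has two 1s, group "f" has two maxima
ties <- data.frame(
  group = c("e", "e", "e", "f", "f", "f"),
  col1 = c(1, 3, 1, 7, 2, 7)
)

# Rule of the first answer: every row equal to 1 (or to the max) is marked
all_matches <- ave(ties$col1, ties$group, FUN = function(x) {
  if (any(x == 1)) x == 1 else x == max(x)
})

# Rule of the second answer: score 2 for a 1, plus 1 for the max,
# then mark only the first row with the highest score
first_match <- ave(ties$col1, ties$group, FUN = function(x) {
  seq_along(x) == which.max(2 * (x == 1) + (x == max(x)))
})

print(cbind(ties, all_matches, first_match))
```

Here `all_matches` is 1, 0, 1, 1, 0, 1 and `first_match` is 1, 0, 0, 1, 0, 0, so which answer fits depends on whether at most one row per group should be flagged.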