I have an example df:
df <- data.frame(
group = c("a", "a", "a", "a", "b", "b", "c", "c", "c", "c", "d", "d", "d"),
col1 = c(-36,10,-5,1, 0, 5,10, 5, 20, 2, -1, 1, 2 )
)
group col1
1 a -36
2 a 10
3 a -5
4 a 1
5 b 0
6 b 5
7 c 10
8 c 5
9 c 20
10 c 2
11 d -1
12 d 1
13 d 2
I want to derive a flag column such that, within each group, any row where col1 equals 1 gets flag = "Y". If a group has no row where col1 equals 1, the row with the highest col1 value should get flag = "Y" instead.
I tried the logic below, but I don't know how to stop the second condition from applying in a group once the first condition has been met:
library(dplyr)
df <- df %>%
  group_by(group) %>%
  mutate(flag = case_when(
    col1 == 1 ~ "Y",
    col1 == max(col1) ~ "Y",
    TRUE ~ "")
  )
Expected output:
group col1 flag
1 a -36
2 a 10
3 a -5
4 a 1 Y
5 b 0
6 b 5 Y
7 c 10
8 c 5
9 c 20 Y
10 c 2
11 d -1
12 d 1 Y
13 d 2
CodePudding user response:
One possible solution:
library(dplyr)
df %>%
  group_by(group) %>%
  # Check once per group whether any row equals 1, then flag accordingly
  mutate(flag = if (any(col1 == 1)) ifelse(col1 == 1, "Y", "")
                else ifelse(col1 == max(col1), "Y", ""))
# A tibble: 13 x 3
# Groups: group [4]
group col1 flag
<chr> <dbl> <chr>
1 a -36 ""
2 a 10 ""
3 a -5 ""
4 a 1 "Y"
5 b 0 ""
6 b 5 "Y"
7 c 10 ""
8 c 5 ""
9 c 20 "Y"
10 c 2 ""
11 d -1 ""
12 d 1 "Y"
13 d 2 ""
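For comparison with the `case_when` attempt in the question, here is a minimal sketch that keeps `case_when` and guards the second condition with a group-level `!any(col1 == 1)` check. It assumes dplyr is installed and rebuilds the example data so it runs on its own; the name `result` is only illustrative.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  group = c("a", "a", "a", "a", "b", "b", "c", "c", "c", "c", "d", "d", "d"),
  col1 = c(-36, 10, -5, 1, 0, 5, 10, 5, 20, 2, -1, 1, 2)
)

result <- df %>%
  group_by(group) %>%
  mutate(flag = case_when(
    # Rows equal to 1 are always flagged
    col1 == 1 ~ "Y",
    # The group maximum is flagged only when the group has no row equal to 1
    !any(col1 == 1) & col1 == max(col1) ~ "Y",
    TRUE ~ ""
  )) %>%
  ungroup()

print(result)
```

Because `any(col1 == 1)` is evaluated per group, it returns a single TRUE or FALSE that is recycled across the rows of that group, so the maximum in groups a and d is not flagged.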
CodePudding user response:
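You can try the code below:
library(dplyr)
df %>%
  group_by(group) %>%
  # Score each row (2 if col1 == 1, plus 1 if it is the group maximum),
  # flag the row with the highest score, and use unary + to convert TRUE/FALSE to 1/0
  mutate(flag = +((1:n()) == which.max(2 * (col1 == 1) + (col1 == max(col1))))) %>%
  ungroup()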
which gives
# A tibble: 13 × 3
group col1 flag
<chr> <dbl> <int>
1 a -36 0
2 a 10 0
3 a -5 0
4 a 1 1
5 b 0 0
6 b 5 1
7 c 10 0
8 c 5 0
9 c 20 1
10 c 2 0
11 d -1 0
12 d 1 1
13 d 2 0
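To see why this scoring approach works, here is a small base R sketch that computes the score by hand for group a. A row equal to 1 scores 2 and the group maximum scores 1, so a row equal to 1 always outranks the maximum. The variable names `score`, `tied` and `tied_score` are only illustrative.

```r
# Values of col1 for group "a" in the question
col1 <- c(-36, 10, -5, 1)

# 2 points for a value of 1, plus 1 point for being the group maximum
score <- 2 * (col1 == 1) + (col1 == max(col1))
print(score)            # 0 1 0 2
print(which.max(score)) # 4, the row where col1 == 1

# which.max() returns only the first position of the highest score,
# so with repeated values a single row per group is flagged
tied <- c(1, 3, 1)
tied_score <- 2 * (tied == 1) + (tied == max(tied))
print(which.max(tied_score)) # 1, the first of the two rows equal to 1
```

This is a difference from the `if`/`ifelse` approach, which flags every row that equals 1 (or every row tied for the maximum); which behaviour is preferable depends on whether a group should have exactly one flagged row.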