Home > database >  How to group by then create a new subset in which the values are conditionally based on the each gro
How to group by then create a new subset in which the values are conditionally based on the each gro

Time:03-11

I have a simple data frame like this

df <- data.frame(x=c(1,1,3,3,2,2,2,1),
                 y=c('a','b','a','b','e','a','d','c'))

enter image description here

I want to group by x, create a new data frame of 2 columns: 'x' and 'test'. The value of 'test' will be based on conditions:

  • If in each group, if the first row has y == 'a' and then if 'c' appears in the list of values of y, then 'test' = 1 else 0

  • If in each group, if the first row has y == 'e' and then if 'd' appears in the list of values of y, then 'test' = 1 else 0

So the expected outcome would be as below

enter image description here

Thank you very much.

CodePudding user response:

df %>%
  group_by(x) %>%
  summarise(test = (first(y) == "a" && any(y == "c") || (first(y) == "e" && any(y == "d"))) * 1L)

CodePudding user response:

library(dplyr)
library(stringr)

df |> 
  group_by(x) |> 
  mutate(test = (row_number() == 1 & y == "a" & sum(str_detect(y, "c"))) | 
           (row_number() == 1 & y == "e" & sum(str_detect(y, "d")))) |> 
  summarize(test = sum(test))
  • Related