Summarize values by group, conditioned on two other columns

From the data below, I would like to create a new data frame with one row per 'cicle' (my identifier) and two further columns: one holding the sum of 'N' per 'cicle', and another holding the sum of 'N' over only those rows where 'classifier' is equal to 'valid'.

library(tibble)

set.seed(1)
test <- tibble(cicle = c(rep("1", 10), rep("2", 10), rep("3", 10)),
               classifier = rep(c("c1", "c2", "c3"), 10),
               valid = rep(c("c3", "c2", "c4"), 10),
               N = rnorm(30))

How can I do this?

Thank you

CodePudding user response：

Group by 'cicle', then compute two sums within each group: the sum of all 'N' values, and the sum of the subset of 'N' where 'classifier' equals 'valid'.

library(dplyr)
test %>% 
    group_by(cicle) %>%
    summarise(sum1 = sum(N), sum2 = sum(N[classifier == valid]), .groups = 'drop')
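
To see what the two sums contain, the same pipeline can be run on a small table that is easy to check by hand. This is only a sketch: the 'toy' data frame and its values are made up for illustration and are not part of the question's data.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data (not from the question), small enough to verify by hand
toy <- tibble(
  cicle      = c("1", "1", "1", "2", "2"),
  classifier = c("c1", "c2", "c3", "c2", "c1"),
  valid      = c("c3", "c2", "c4", "c2", "c1"),
  N          = c(1, 2, 3, 4, 5)
)

result <- toy %>%
  group_by(cicle) %>%
  summarise(
    sum1 = sum(N),                       # sum of all N within the cicle
    sum2 = sum(N[classifier == valid]),  # sum of N only where the labels match
    .groups = "drop"
  )

result
```

In cicle "1" only the second row has matching labels, so sum1 is 6 and sum2 is 2. In cicle "2" both rows match, so sum1 and sum2 are both 9.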