From the data below, I would like to create a new data frame with the cicle column (my identifier) and two more columns: one with the sum of my N variable by cicle, and another with the sum of N over the rows where the classifier column is equal to the valid column.
library(dplyr)  # provides tibble() and the pipe
set.seed(1)
test <- tibble(cicle = c(rep("1", 10), rep("2", 10), rep("3", 10)),
               classifier = rep(c("c1", "c2", "c3"), 10),
               valid = rep(c("c3", "c2", "c4"), 10),
               N = rnorm(30))
How can I do this?
Thank you
CodePudding user response:
After grouping by 'cicle', get the sum of 'N' for the first column, and the sum of a subset of 'N', i.e. where the 'classifier' values are the same as 'valid', for the second.
library(dplyr)
test %>%
  group_by(cicle) %>%
  summarise(sum1 = sum(N),
            sum2 = sum(N[classifier == valid]),
            .groups = 'drop')
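To check the logic by hand, here is a minimal sketch of the same grouped summary on a small made-up tibble. The name `toy` and its values are hypothetical and chosen only so the sums are easy to verify; they are not from the question.

```r
library(dplyr)

# Hypothetical toy data, small enough to verify by hand
toy <- tibble(
  cicle      = c("1", "1", "2", "2"),
  classifier = c("c1", "c2", "c1", "c3"),
  valid      = c("c1", "c3", "c2", "c3"),
  N          = c(1, 2, 3, 4)
)

result <- toy %>%
  group_by(cicle) %>%
  summarise(
    sum1 = sum(N),                       # total of N per cicle
    sum2 = sum(N[classifier == valid]),  # total of N where classifier matches valid
    .groups = "drop"
  )

print(result)
# cicle "1": sum1 = 3, sum2 = 1
# cicle "2": sum1 = 7, sum2 = 4
```

If classifier or valid can contain NA, the comparison returns NA for those rows and sum2 becomes NA for that group. Wrapping the condition as N[which(classifier == valid)] is one way to skip such rows, assuming that is the behaviour you want.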