sum select rows based on condition tidyverse

Time:09-27

I can't find an answer to this exact question. Using tidyverse, I would like to sum the values of rows that contain a value >= 1 into a new row. I have:

df <- tibble(dif = c(0, 1, 2, 3), n = c(2, 4, 6, 8))

    dif     n
  <dbl> <dbl>
1     0     2
2     1     4
3     2     6
4     3     8

In this case, rows 2:4 contain a value greater than 0. I would like to sum the n values of rows 2:4 into a new row, preferably retaining the dif value, giving an output like this.

    dif     n
  <dbl> <dbl>
1     0     2
2     1    18

CodePudding user response:

You can group_by whether dif >= 1. The unary + coerces the logical vector (TRUE, FALSE) to 1 and 0, so the grouping column keeps numeric labels.

library(dplyr)
df %>% 
  group_by(dif = +(dif >= 1)) %>% 
  summarise(n = sum(n))

output

    dif     n
1     0     2
2     1    18
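
If you would rather keep an actual dif value from the data than a coerced 0/1 flag, one possible variant is to group on a temporary logical column and take the smallest dif in each group as its label. This is only a sketch: the column name collapse and the choice of min() are my assumptions, not something the question specifies. It also collapses all rows with dif < 1 into a single row if there is more than one of them.

```r
library(dplyr)

df <- tibble(dif = c(0, 1, 2, 3), n = c(2, 4, 6, 8))

result <- df %>%
  # collapse (hypothetical helper column): TRUE for rows with dif >= 1
  group_by(collapse = dif >= 1) %>%
  # label each group with its smallest dif and sum n within the group
  summarise(dif = min(dif), n = sum(n)) %>%
  # drop the helper column once the groups are summarised
  select(-collapse)

result
```

For the example data this gives the same two rows as above, with dif taken from the data itself.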

CodePudding user response:

We can also use a hack with if_any(), in an answer similar to @Maël's:

library(dplyr)
df %>% 
  group_by(dif = if_any(dif)) %>% 
  summarise(n = sum(n))