Filtering every positive value for every negative in R

I have a dataset with financial data. Sometimes a product gets refunded, which results in a row with a negative count for that product (so the money gets returned). I want to remove each of these refund rows from the dataset, together with one matching positive row. Example:

library(tidyverse)
set.seed(1)

df <- tibble(
  count = sample(c(-1, 1), 80, replace = TRUE, prob = c(.2, .8)),
  id = rep(1:4, 20)
)

df %>% 
  group_by(id) %>%
  summarize(total = sum(count))

# A tibble: 4 x 2
     id total
  <int> <dbl>
1     1    10
2     2    14
3     3    16
4     4    10

id = 1 has 15 positive counts and 5 negative ones (15 - 5 = 10), so I want to keep 10 of the positive rows with id = 1. Likewise, id = 2 has 17 positive counts and 3 negative ones (17 - 3 = 14), so I want to keep 14 of the positive rows with id = 2.

In the end, this condition should be TRUE: nrow(df) == sum(df$count)
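
As a rough illustration of the arithmetic above, the number of rows to keep per id can be computed by counting positives and negatives separately. The sketch uses a small hand-made tibble (toy, hypothetical, not the sampled data from the example) so the numbers are easy to check by eye; the column names positives, negatives and to_keep are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data: id 1 has 3 sales and 1 refund, id 2 has 4 sales and 1 refund
toy <- tibble(
  id    = c(1, 1, 1, 1, 2, 2, 2, 2, 2),
  count = c(1, 1, -1, 1, 1, -1, 1, 1, 1)
)

per_id <- toy %>%
  group_by(id) %>%
  summarize(
    positives = sum(count > 0),        # number of sale rows
    negatives = sum(count < 0),        # number of refund rows
    to_keep   = positives - negatives  # positive rows that should remain
  )

per_id
```

For this toy data, to_keep is 2 for id 1 and 3 for id 2, which equals sum(count) per id because every count is either 1 or -1.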

Unfortunately, a filtering join such as anti_join() will remove all the rows. For some reason I cannot think of another option to filter the tibble.
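
A minimal sketch of why this happens, assuming the join is done on id (the exact call is not shown here, so this is a guess at the setup): anti_join() drops every row whose key appears in the second table, and since every id has at least one refund, nothing survives. The tibbles toy and refunds are hypothetical.

```r
library(dplyr)

# Hypothetical toy data in which both ids have at least one refund
toy <- tibble(
  id    = c(1, 1, 2, 2, 2),
  count = c(1, -1, 1, 1, -1)
)

# All refund rows
refunds <- filter(toy, count < 0)

# Removes every row whose id occurs in refunds, not one row per refund
remaining <- anti_join(toy, refunds, by = "id")

nrow(remaining)
```

Here nrow(remaining) is 0, because a filtering join works on key matches and cannot cancel rows one for one.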

Thanks for helping me!

CodePudding user response:

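You can use uncount() on the total column to repeat each id once per remaining unit, then set count back to 1. Note that this rebuilds the rows from the per-id totals instead of filtering the original ones, so any other columns in df are dropped.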

df %>% 
  group_by(id) %>%
  summarize(total = sum(count)) %>%
  uncount(total) %>%
  mutate(count = 1)
#> # A tibble: 50 x 2
#>       id count
#>    <int> <dbl>
#>  1     1     1
#>  2     1     1
#>  3     1     1
#>  4     1     1
#>  5     1     1
#>  6     1     1
#>  7     1     1
#>  8     1     1
#>  9     1     1
#> 10     1     1
#> # ... with 40 more rows

Created on 2022-10-21 with reprex v2.0.2
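
If the original rows need to survive, for example because df has more columns than id and count, one possible alternative is to filter within each group instead of rebuilding the rows. This is only a sketch under some assumptions: every count is 1 or -1, it does not matter which positive rows are dropped (here the last ones per id go), and the price column and the names toy, kept and n_keep are hypothetical.

```r
library(dplyr)

# Hypothetical toy data with an extra column that should be preserved
toy <- tibble(
  id    = c(1, 1, 1, 1, 2, 2, 2, 2, 2),
  count = c(1, 1, -1, 1, 1, -1, 1, 1, 1),
  price = c(10, 12, 10, 11, 20, 20, 21, 22, 23)
)

kept <- toy %>%
  group_by(id) %>%
  mutate(n_keep = sum(count)) %>%      # net number of rows to keep per id
  filter(count > 0) %>%                # drop the refund rows themselves
  filter(row_number() <= n_keep) %>%   # keep the first n_keep positive rows
  ungroup() %>%
  select(-n_keep)

# The condition from the question
nrow(kept) == sum(kept$count)
```

The two filter() calls are kept separate on purpose, so that row_number() is evaluated on the positive rows only. If an id has more refunds than sales, n_keep is zero or negative and no rows are kept for that id.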