Home > Enterprise >  How to tally occurrences of a value in a column in R?
How to tally occurrences of a value in a column in R?

Time:11-05

I have the following data:

df <- data.frame(id = c("1", "1", "1", "1", "2", "2", "2"), 
                 x = c(0, 1, 0, 1, 0, 1, 0))

id x
 1 0
 1 1
 1 0
 1 1
 2 0
 2 1
 2 0

I want to tally the occurences of 1 in x. The desired output:

id x count
 1 0     0
 1 1     1
 1 0     0
 1 1     2
 2 0     0
 2 1     1
 2 0     0

How do I do this? A dplyr solution is preferred.

CodePudding user response:

You may take cumulative sum of x values and keep the result to be 0 where value of x is 0 for each id.

library(dplyr)

df %>%
  mutate(count = replace(cumsum(x), x == 0, 0), .by = id)

#  id x count
#1  1 0     0
#2  1 1     1
#3  1 0     0
#4  1 1     2
#5  2 0     0
#6  2 1     1
#7  2 0     0

CodePudding user response:

library(tidyverse)

df <- tibble(id = c("1", "1", "1", "1", "2", "2", "2"), 
                 x = c(0, 1, 0, 1, 0, 1, 0))

df |> mutate(
  cumsum = cumsum(x),
  tally = if_else(x == 0, 0, cumsum),
  .by = id
  )
#> # A tibble: 7 × 4
#>   id        x cumsum tally
#>   <chr> <dbl>  <dbl> <dbl>
#> 1 1         0      0     0
#> 2 1         1      1     1
#> 3 1         0      1     0
#> 4 1         1      2     2
#> 5 2         0      0     0
#> 6 2         1      1     1
#> 7 2         0      1     0

Created on 2023-11-04 with reprex v2.0.2

CodePudding user response:

Base R

> transform(df, count=ave(x, id, FUN=\(x) replace(cumsum(x), x == 0, 0)))
   id x count
1:  1 0     0
2:  1 1     1
3:  1 0     0
4:  1 1     2
5:  2 0     0
6:  2 1     1
7:  2 0     0

data.table

> library(data.table)
> setDT(df)[, count := replace(cumsum(x), x == 0, 0), id]
> df
   id x count
1:  1 0     0
2:  1 1     1
3:  1 0     0
4:  1 1     2
5:  2 0     0
6:  2 1     1
7:  2 0     0

> dput(df)
structure(list(x = c(0, 1, 0, 1, 0, 1, 0), count = c(0, 1, 0, 
2, 0, 1, 0)), row.names = c(NA, -7L), class = "data.frame")

CodePudding user response:

You can try x*cumsum(x) and group by id, e.g.,

df %>%
    mutate(count = x * cumsum(x), .by = id)

which gives

  id x count
1  1 0     0
2  1 1     1
3  1 0     0
4  1 1     2
5  2 0     0
6  2 1     1
7  2 0     0
  • Related