Need help counting occurrences with respect to multiple columns with multiple conditions (including a

Time:10-17

I'm having trouble finding a parsimonious way to do the following:

I need to count how many rows satisfy all of the following at once: x1 contains "t2", x2=4, x3=0

In the following dataframe this is true for rows 8, 10 and 19. So the answer would be (t2, x2=4, x3=0) = 3 because that combination occurs three times.

     x1 x2 x3
1  t2xy  1  0
2  m1xy  3  0
3  m2xy  3  0
4  t1xy  4  1
5  m1yx  3  1
6  m2xy  3  1
7  m2yx  3  0
8  t2yx  4  0
9  t1xy  4  0
10 t2yx  4  0
11 m2yx  1  0
12 m1xy  3  0
13 m2yx  3  0
14 m2xy  1  0
15 t2yx  4  1
16 t2xy  1  1
17 m1xy  2  1
18 t1xy  2  1
19 t2xy  4  0
20 t1yx  2  1

I need to do this for each partial string match (t1, t2, m1, m2), with the results stored either in their own variables or aggregated somehow. Here is an example of all of the combinations for t1:

(t1, x2=1, x3=0) = 12
(t1, x2=1, x3=1) = 15
(t1, x2=2, x3=0) = 7
(t1, x2=2, x3=1) = 6
(t1, x2=3, x3=0) = 11
(t1, x2=3, x3=1) = 9
(t1, x2=4, x3=0) = 9
(t1, x2=4, x3=1) = 13

(These outputs are just examples and not reflective of the above dataframe)

This would also be done for the t2, m1, and m2 combinations.

Here is the code I used to create some fake data:

x1 <- sample(c("t1xy", "t2xy", "m1xy", "m2xy", "t1yx", "t2yx", "m1yx", "m2yx"), 20, replace = TRUE)
x2 <- sample(1:4, 20, replace = TRUE)
x3 <- sample(0:1, 20, replace = TRUE)

df_x <- data.frame(x1, x2, x3)
df_x

Thanks in advance!

CodePudding user response:

We could use add_count with the conditions:

library(dplyr)
library(stringr)
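# t2 flags rows where x1 contains "t2" and x2 is 4 and x3 is 0;
# n is the number of rows sharing that flag (3 rows match, 17 do not)
df_x %>% 
  add_count(t2 = str_detect(x1, "t2") & x2 == 4 & x3 == 0)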
     x1 x2 x3    t2  n
1  t2xy  1  0 FALSE 17
2  m1xy  3  0 FALSE 17
3  m2xy  3  0 FALSE 17
4  t1xy  4  1 FALSE 17
5  m1yx  3  1 FALSE 17
6  m2xy  3  1 FALSE 17
7  m2yx  3  0 FALSE 17
8  t2yx  4  0  TRUE  3
9  t1xy  4  0 FALSE 17
10 t2yx  4  0  TRUE  3
11 m2yx  1  0 FALSE 17
12 m1xy  3  0 FALSE 17
13 m2yx  3  0 FALSE 17
14 m2xy  1  0 FALSE 17
15 t2yx  4  1 FALSE 17
16 t2xy  1  1 FALSE 17
17 m1xy  2  1 FALSE 17
18 t1xy  2  1 FALSE 17
19 t2xy  4  0  TRUE  3
20 t1yx  2  1 FALSE 17
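
The question also asks for the count of every (prefix, x2, x3) combination, not only the t2 case. Here is a minimal sketch of one way to get that, assuming the prefix is always the first two characters of x1 (t1, t2, m1 or m2). The names `group`, `counts` and `n_t2_4_0` are made up for illustration, and `df_x` is rebuilt from the 20 rows printed in the question so the result can be checked by hand:

```r
library(dplyr)
library(stringr)

# Rebuild the 20-row example shown in the question (sample() would give new data)
df_x <- data.frame(
  x1 = c("t2xy", "m1xy", "m2xy", "t1xy", "m1yx", "m2xy", "m2yx", "t2yx", "t1xy", "t2yx",
         "m2yx", "m1xy", "m2yx", "m2xy", "t2yx", "t2xy", "m1xy", "t1xy", "t2xy", "t1yx"),
  x2 = c(1, 3, 3, 4, 3, 3, 3, 4, 4, 4, 1, 3, 3, 1, 4, 1, 2, 2, 4, 2),
  x3 = c(0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1)
)

# A single condition as one number: TRUE counts as 1 when summed
n_t2_4_0 <- sum(str_detect(df_x$x1, "t2") & df_x$x2 == 4 & df_x$x3 == 0)
n_t2_4_0
# 3

# Every observed combination at once: extract the two-character prefix
# (assumed pattern: "t" or "m" followed by 1 or 2), then count rows per group
counts <- df_x %>%
  mutate(group = str_extract(x1, "^[tm][12]")) %>%
  count(group, x2, x3)
counts
```

`count()` only returns combinations that actually occur in the data. If zero counts are needed too, `tidyr::complete(group, x2, x3, fill = list(n = 0))` applied to `counts` should add the missing combinations of the observed values.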