I'm having trouble finding a parsimonious way to do the following:
I need to count how many rows satisfy all of the following conditions at once: x1 contains "t2", x2 = 4, and x3 = 0.
In the following data frame this is true for rows 8, 10 and 19. So the answer would be (t2, x2=4, x3=0) = 3 because that combination occurs three times.
x1 x2 x3
1 t2xy 1 0
2 m1xy 3 0
3 m2xy 3 0
4 t1xy 4 1
5 m1yx 3 1
6 m2xy 3 1
7 m2yx 3 0
8 t2yx 4 0
9 t1xy 4 0
10 t2yx 4 0
11 m2yx 1 0
12 m1xy 3 0
13 m2yx 3 0
14 m2xy 1 0
15 t2yx 4 1
16 t2xy 1 1
17 m1xy 2 1
18 t1xy 2 1
19 t2xy 4 0
20 t1yx 2 1
I need to do this for each partial string match (t1, t2, m1, m2), with the results either stored in their own variables or aggregated somehow. Here is an example of all of the permutations for t1:
(t1, x2=1, x3=0) = 12
(t1, x2=1, x3=1) = 15
(t1, x2=2, x3=0) = 7
(t1, x2=2, x3=1) = 6
(t1, x2=3, x3=0) = 11
(t1, x2=3, x3=1) = 9
(t1, x2=4, x3=0) = 9
(t1, x2=4, x3=1) = 13
(These outputs are just examples and not reflective of the above dataframe)
This would also be done for the t2, m1, and m2 permutations.
Here is the code I used to create some fake data:
x1 <- sample(c("t1xy", "t2xy", "m1xy", "m2xy", "t1yx", "t2yx", "m1yx", "m2yx"), 20, replace = TRUE)
x2 <- sample(1:4, 20, replace = TRUE)
x3 <- sample(0:1, 20, replace = TRUE)
df_x <- data.frame(x1, x2, x3)
df_x
Thanks in advance!
CodePudding user response:
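We could use add_count with the conditions. It adds a column n holding the number of rows that share each value of the condition, so the rows where t2 is TRUE carry the count you are after (3 here):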
library(dplyr)
library(stringr)
df_x %>%
add_count(t2 = str_detect(x1, "t2") & x2==4 & x3==0)
x1 x2 x3 t2 n
1 t2xy 1 0 FALSE 17
2 m1xy 3 0 FALSE 17
3 m2xy 3 0 FALSE 17
4 t1xy 4 1 FALSE 17
5 m1yx 3 1 FALSE 17
6 m2xy 3 1 FALSE 17
7 m2yx 3 0 FALSE 17
8 t2yx 4 0 TRUE 3
9 t1xy 4 0 FALSE 17
10 t2yx 4 0 TRUE 3
11 m2yx 1 0 FALSE 17
12 m1xy 3 0 FALSE 17
13 m2yx 3 0 FALSE 17
14 m2xy 1 0 FALSE 17
15 t2yx 4 1 FALSE 17
16 t2xy 1 1 FALSE 17
17 m1xy 2 1 FALSE 17
18 t1xy 2 1 FALSE 17
19 t2xy 4 0 TRUE 3
20 t1yx 2 1 FALSE 17
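The call above covers one condition at a time. Since the question asks for every combination of t1, t2, m1, m2 with x2 and x3, here is a minimal base R sketch of one way to get them all in a single table. It assumes the code of interest is always the first two characters of x1 (true for the sample data, but an assumption about the real data), and the names prefix and counts are only illustrative. The data frame is rebuilt from the 20 rows printed in the question.

```r
# Rebuild the 20-row example data frame printed in the question.
df_x <- data.frame(
  x1 = c("t2xy", "m1xy", "m2xy", "t1xy", "m1yx", "m2xy", "m2yx", "t2yx",
         "t1xy", "t2yx", "m2yx", "m1xy", "m2yx", "m2xy", "t2yx", "t2xy",
         "m1xy", "t1xy", "t2xy", "t1yx"),
  x2 = c(1, 3, 3, 4, 3, 3, 3, 4, 4, 4, 1, 3, 3, 1, 4, 1, 2, 2, 4, 2),
  x3 = c(0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1)
)

# Assumption: the partial match (t1, t2, m1, m2) is the first two characters of x1.
prefix <- substr(df_x$x1, 1, 2)

# Cross-tabulate prefix, x2 and x3, then flatten to one row per combination.
# table() also keeps combinations that never occur, with a count of 0.
counts <- as.data.frame(
  table(prefix = prefix, x2 = df_x$x2, x3 = df_x$x3),
  responseName = "n"
)

# Look up a single combination; the columns are factors, so compare as strings.
subset(counts, prefix == "t2" & x2 == "4" & x3 == "0")

# A single condition can also be counted directly by summing a logical vector.
sum(grepl("t2", df_x$x1) & df_x$x2 == 4 & df_x$x3 == 0)
```

For this data both lookups give 3, matching rows 8, 10 and 19. If the code can appear anywhere in x1 rather than at the start, the substr step would need to be replaced with a pattern-based extraction.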