Home > Net >  How to count data frame elements grouped by multiple conditions in dplyr?
How to count data frame elements grouped by multiple conditions in dplyr?

Time:09-12

I am trying to use dplyr to count elements grouped by multiple conditions (columns) in a data frame. In the below example (dataframe output is at the top (except that I manually inserted the 2 right-most columns to explain what I am trying to do), and R code is underneath), I am trying to count the joint groupings of the Element and Group columns. My multiple condition grouping attempt is eleGrpCnt. Any recommendations for the correct way to do this in dplyr? I thought that group_by a combined (Element, Group) would work.

                                                   desired
       Element Group origOrder eleCnt eleGrpCnt  eleGrpCnt   explanation
       <chr>   <dbl>     <int>  <int>     <int>  <comment>   <comment>
     1 B           0         1      1         1          1   1st grouping of B where Group = 0
     2 R           0         2      1         1          1   1st grouping of R where Group = 0
     3 R           1         3      2         1          2   2nd grouping of R where Group = 1
     4 R           1         4      3         2          2   2nd grouping of R where Group = 1 
     5 B           0         5      2         2          1   1st grouping of B where Group = 0 
     6 X           2         6      1         1          1   1st grouping of X where Group = 2 
     7 X           2         7      2         2          1   1st grouping of X where Group = 2 
     8 X           0         8      3         1          2   2nd grouping of X where Group = 0
     9 X           0         9      4         2          2   2nd grouping of X where Group = 0
    10 X          -1        10      5         1          3   3rd grouping of X where Group = -1 
    
library(dplyr)

myData6 <- 
  data.frame(
    Element = c("B","R","R","R","B","X","X","X","X","X"),
    Group = c(0,0,1,1,0,2,2,0,0,-1)
  )

myData6 %>% 
  mutate(origOrder = row_number()) %>%
  group_by(Element) %>%
    mutate(eleCnt = row_number()) %>%
  ungroup() %>%
  group_by(Element, Group) %>%
    mutate(eleGrpCnt = row_number())%>%
  ungroup()

CodePudding user response:

If you group by element then the numbers you are looking for are simply the matches of Group against the unique values of Group:

library(dplyr)

myData6 %>% 
  mutate(origOrder = row_number()) %>%
  group_by(Element) %>%
  mutate(eleCnt = row_number()) %>%
  ungroup() %>%
  group_by(Element) %>%
  mutate(eleGrpCnt = match(Group, unique(Group)))
#> # A tibble: 10 x 5
#> # Groups:   Element [3]
#>    Element Group origOrder eleCnt eleGrpCnt
#>    <chr>   <dbl>     <int>  <int>     <dbl>
#>  1 B           0         1      1         1
#>  2 R           0         2      1         1
#>  3 R           1         3      2         2
#>  4 R           1         4      3         2
#>  5 B           0         5      2         1
#>  6 X           2         6      1         1
#>  7 X           2         7      2         1
#>  8 X           0         8      3         2
#>  9 X           0         9      4         2
#> 10 X          -1        10      5         3

Created on 2022-09-11 with reprex v2.0.2

CodePudding user response:

Here's one approach; I'm sorting by Group value but if you want to change the order to match original appearance order we could add a step.

myData6 %>% 
  mutate(origOrder = row_number()) %>%
  group_by(Element) %>%
  mutate(eleCnt = row_number()) %>%
  ungroup() %>%
  arrange(Element, Group) %>%
  group_by(Element) %>%
  mutate(eleGrpCnt = cumsum(Group != lag(Group, default = -999))) %>%
  ungroup() %>%
  arrange(origOrder)


# A tibble: 10 × 5
   Element Group origOrder eleCnt eleGrpCnt
   <chr>   <dbl>     <int>  <int>     <int>
 1 B           0         1      1         1
 2 R           0         2      1         1
 3 R           1         3      2         2
 4 R           1         4      3         2
 5 B           0         5      2         1
 6 X           2         6      1         3
 7 X           2         7      2         3
 8 X           0         8      3         2
 9 X           0         9      4         2
10 X          -1        10      5         1
  • Related