How to count groupings of elements in base R or dplyr?

Time:09-14

I am trying to count a rather complicated series of element groupings, using base R or dplyr.

Suppose we start with the data frame myDF shown below. I would like to add a column "grpCnt", which I have manually inserted below and explained in the comments to the right. Any suggestions for how to do this? I have experimented with seq() and dplyr::dense_rank(), but I imagine there is a more straightforward way to do this sort of thing.

> myDF
  Element Code  grpCnt  # grpCnt explanation
1       R  1.0     1.0  # first incidence of R so use Code value
2       R  2.1     2.1  # since there is a decimal value use Code value (2 integer means 2nd incidence of R, .1 decimal value means this Element has been previously grouped and is first in that group)
3       X  1.0     1.0  # first incidence of X so use Code value
4       X  2.1     2.1  # since there is a decimal value in Code use Code value
5       X  2.2     2.2  # since there is a decimal value in Code use Code value
6       X  3.1     3.1  # since there is a decimal value in Code use Code value
7       X  3.2     3.2  # since there is a decimal value in Code use Code value
8       X  5.0     4.0  # no decimal value in Code value; since X's above show prefix sequence of 1,2,2,3,3, continue this X with 4 (not 5)
9       B  1.0     1.0  # first incidence of B so use Code value

myDF <- data.frame(
  Element = c('R','R','X','X','X','X','X','X','B'),
  Code = c(1,2.1,1,2.1,2.2,3.1,3.2,5,1)
)

Output of myDF without above comments (code to generate this output is immediately above):

> myDF
  Element Code
1       R  1.0
2       R  2.1
3       X  1.0
4       X  2.1
5       X  2.2
6       X  3.1
7       X  3.2
8       X  5.0
9       B  1.0

CodePudding user response:

library(dplyr)
myDF %>%
  group_by(Element) %>%
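  # Keep Code for the first row of each Element or when Code has a decimal part;
  # otherwise continue from the integer part of the previous Code
  mutate(grpCnt = if_else(row_number() == 1 | Code %% 1 > 0,
    Code, floor(lag(Code)) + 1)) %>%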
  ungroup()


# A tibble: 9 × 3
  Element  Code grpCnt
  <chr>   <dbl>  <dbl>
1 R         1      1  
2 R         2.1    2.1
3 X         1      1  
4 X         2.1    2.1
5 X         2.2    2.2
6 X         3.1    3.1
7 X         3.2    3.2
8 X         5      4  
9 B         1      1 
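
One caveat worth checking against your real data: the rule above looks only at the previous row's Code, so it may not renumber the way you want if two whole-number codes follow each other after a gap. The sketch below uses a made-up Code vector for a single Element (1, 2.1, 5, 7; not from the question) and base R only, and compares the lag-based rule with a rank-by-first-appearance rule. Which result is "right" depends on what you intend, so treat this as an illustration rather than a verdict.

```r
# Hypothetical codes for one Element (not taken from the question's data)
code <- c(1, 2.1, 5, 7)

# Lag-based rule: previous Code, shifted by one position (NA for the first row)
prev <- c(NA, head(code, -1))
lagRule <- ifelse(seq_along(code) == 1 | code %% 1 > 0,
                  code, floor(prev) + 1)

# Rank-based rule: position of each integer part among the distinct
# integer parts, plus the original decimal part
intPart <- trunc(code)
matchRule <- match(intPart, unique(intPart)) + round(code %% 1, 1)

print(lagRule)    # 1.0 2.1 3.0 6.0
print(matchRule)  # 1.0 2.1 3.0 4.0
```

With the data in the question both rules give the same grpCnt, because the only whole-number code after the first row directly follows a decimal code.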

CodePudding user response:

The following also seems to work; I have run it through several scenarios without problems. Please recommend any improvements, as I'm an R neophyte.

library(dplyr)
myDF %>% group_by(Element) %>%
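   # Rank each integer part by order of first appearance within the Element
   mutate(eleGrpCnt = match(trunc(Code), unique(trunc(Code)))) %>%
   # Add back the decimal part of Code (rounded to one decimal place)
   mutate(eleGrpCnt = (eleGrpCnt + round(Code %% 1 * 10, 0) / 10))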

Output:

# Groups:   Element [3]
  Element  Code eleGrpCnt
  <chr>   <dbl>     <dbl>
1 R         1         1  
2 R         2.1       2.1
3 X         1         1  
4 X         2.1       2.1
5 X         2.2       2.2
6 X         3.1       3.1
7 X         3.2       3.2
8 X         5         4  
9 B         1         1  
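
Since the question also asks about base R, here is a minimal sketch of the same match()/unique() idea without dplyr, using ave() to apply the ranking within each Element. It assumes, as in the sample data, that Code has at most one decimal digit; the helper name grpIdx is only for illustration.

```r
myDF <- data.frame(
  Element = c('R','R','X','X','X','X','X','X','B'),
  Code = c(1,2.1,1,2.1,2.2,3.1,3.2,5,1)
)

# Within each Element, rank the integer part of Code by first appearance
grpIdx <- ave(trunc(myDF$Code), myDF$Element,
              FUN = function(x) match(x, unique(x)))

# Re-attach the decimal part (assumes one decimal digit at most)
myDF$grpCnt <- grpIdx + round(myDF$Code %% 1, 1)

print(myDF)
```

For the sample data this should reproduce the grpCnt column requested in the question, including 4.0 for the X row whose Code is 5.0.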