I am trying to count a rather complicated series of element groupings, using base R or dplyr.
Suppose we start with the data frame myDF shown below. I would like to add a column "grpCnt", as manually inserted below and explained in the comments to the right. Any suggestions for how to do this? I have fiddled around with seq() and dplyr::dense_rank(), but I imagine there is a straightforward way to do this sort of thing.
> myDF
  Element Code grpCnt   # grpCnt explanation
1       R  1.0    1.0   # first incidence of R so use Code value
2       R  2.1    2.1   # since there is a decimal value use Code value (2 integer means 2nd incidence of R, .1 decimal value means this Element has been previously grouped and is first in that group)
3       X  1.0    1.0   # first incidence of X so use Code value
4       X  2.1    2.1   # since there is a decimal value in Code use Code value
5       X  2.2    2.2   # since there is a decimal value in Code use Code value
6       X  3.1    3.1   # since there is a decimal value in Code use Code value
7       X  3.2    3.2   # since there is a decimal value in Code use Code value
8       X  5.0    4.0   # no decimal value in Code value; since X's above show prefix sequence of 1,2,2,3,3, continue this X with 4 (not 5)
9       B  1.0    1.0   # first incidence of B so use Code value
myDF <- data.frame(
  Element = c('R','R','X','X','X','X','X','X','B'),
  Code = c(1,2.1,1,2.1,2.2,3.1,3.2,5,1)
)
Output of myDF without the above comments (the code to generate this output is immediately above):
> myDF
  Element Code
1       R  1.0
2       R  2.1
3       X  1.0
4       X  2.1
5       X  2.2
6       X  3.1
7       X  3.2
8       X  5.0
9       B  1.0
Answer:
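library(dplyr)
myDF %>%
  group_by(Element) %>%
  # keep Code for the first row of each Element and for any decimal Code;
  # otherwise continue from the previous row's integer prefix
  mutate(grpCnt = if_else(row_number() == 1 | Code %% 1 > 0,
                          Code, floor(lag(Code)) + 1)) %>%
  ungroup()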
# A tibble: 9 × 3
  Element  Code grpCnt
  <chr>   <dbl>  <dbl>
1 R         1      1
2 R         2.1    2.1
3 X         1      1
4 X         2.1    2.1
5 X         2.2    2.2
6 X         3.1    3.1
7 X         3.2    3.2
8 X         5      4
9 B         1      1
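Since the question also allows base R, here is a minimal sketch without dplyr. It takes a different route from the lag-based pipeline above: within each Element it ranks the integer prefixes of Code in order of first appearance, then adds the decimal part back. It assumes Code carries at most one decimal digit (hence the round(..., 1)), and prefix_rank is a name introduced here for illustration only.

```r
myDF <- data.frame(
  Element = c('R','R','X','X','X','X','X','X','B'),
  Code    = c(1, 2.1, 1, 2.1, 2.2, 3.1, 3.2, 5, 1)
)

# Within each Element, number the distinct integer prefixes of Code
# in order of first appearance (1, 2, 2, 3, 3, 5 becomes 1, 2, 2, 3, 3, 4).
prefix_rank <- ave(trunc(myDF$Code), myDF$Element,
                   FUN = function(x) match(x, unique(x)))

# Add the decimal part back; assumes at most one decimal digit in Code.
myDF$grpCnt <- prefix_rank + round(myDF$Code %% 1, 1)
myDF
```

On the sample data this reproduces the grpCnt column from the question, including 4 (not 5) for row 8.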
Answer:
The approach below also seems to work, and it has held up in the several scenarios I ran it through. Please recommend any improvements, as I'm an R neophyte.
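library(dplyr)
myDF %>%
  group_by(Element) %>%
  # rank each integer prefix of Code by order of first appearance within Element
  mutate(eleGrpCnt = match(trunc(Code), unique(trunc(Code)))) %>%
  # add the decimal part of Code back, rounded to one decimal place
  mutate(eleGrpCnt = eleGrpCnt + round(Code %% 1 * 10, 0) / 10)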
Output:
# Groups: Element [3]
  Element  Code eleGrpCnt
  <chr>   <dbl>     <dbl>
1 R         1         1
2 R         2.1       2.1
3 X         1         1
4 X         2.1       2.1
5 X         2.2       2.2
6 X         3.1       3.1
7 X         3.2       3.2
8 X         5         4
9 B         1         1
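One thing worth checking before choosing between the two approaches in this thread: they agree on the sample data but can diverge on other inputs. The sketch below uses a hypothetical input that is not from the question (edgeDF, with whole-number codes 1, 5, 7 for a single Element) and puts both rules side by side; the column names lagBased and rankBased are made up for the comparison.

```r
library(dplyr)

# Hypothetical edge case (not from the question): whole-number codes
# following each other after the first row.
edgeDF <- data.frame(Element = c('X', 'X', 'X'), Code = c(1, 5, 7))

edgeOut <- edgeDF %>%
  group_by(Element) %>%
  mutate(
    # lag-based rule: integer part of the previous Code, plus one
    lagBased = if_else(row_number() == 1 | Code %% 1 > 0,
                       Code, floor(lag(Code)) + 1),
    # rank-based rule: rank of the integer prefix, plus the decimal part
    rankBased = match(trunc(Code), unique(trunc(Code))) +
      round(Code %% 1 * 10, 0) / 10
  ) %>%
  ungroup()

edgeOut
```

Here the lag-based rule yields 1, 2, 6 because it looks at the previous raw Code (5), while the rank-based rule yields 1, 2, 3. The question does not say which result is wanted in this situation, so that is a choice for whoever knows the data.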