Example data:
example_data <-
  data.frame(value = c(1, 3, 4, 6, 7, 8, 4, 6, 9, 0),
             group = c("Not applicable",
                       "Large group",
                       "Large group",
                       "Not applicable",
                       "Group of 1",
                       "Large group",
                       "Large group",
                       "Large group",
                       "Group of 1",
                       "Not applicable"))
I would like to use dplyr to assign group numbers, starting with 1, to groups (both "Large group" and "Group of 1"), and zeroes to "Not applicable" values.
There can be more than one "Not applicable" value in a row. "Group of 1" always contains one row. "Large group" may contain any number of rows.
Desired output:
   value          group group_number
1      1 Not applicable            0
2      3    Large group            1
3      4    Large group            1
4      6 Not applicable            0
5      7     Group of 1            2
6      8    Large group            3
7      4    Large group            3
8      6    Large group            3
9      9     Group of 1            4
10     0 Not applicable            0
I tried this solution from the answers to my previous question:
example_data %>%
  mutate(group_number = with(rle(group != "Not applicable"),
                             rep(cumsum(values) * values, lengths)))
And got:
   value          group group_number
1      1 Not applicable            0
2      3    Large group            1
3      4    Large group            1
4      6 Not applicable            0
5      7     Group of 1            2
6      8    Large group            2
7      4    Large group            2
8      6    Large group            2
9      9     Group of 1            2
10     0 Not applicable            0
I would like to get separate numbers for "Large group" and "Group of 1".
CodePudding user response:
library(dplyr)

example_data %>%
  mutate(gr = data.table::rleid(group) * (group != 'Not applicable'),  # run id, zeroed for "Not applicable"
         gr = dense_rank(gr) - 1)  # or even gr = as.numeric(factor(gr)) - 1
   value          group gr
1      1 Not applicable  0
2      3    Large group  1
3      4    Large group  1
4      6 Not applicable  0
5      7     Group of 1  2
6      8    Large group  3
7      4    Large group  3
8      6    Large group  3
9      9     Group of 1  4
10     0 Not applicable  0
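How this works: rleid() gives every run of identical consecutive group values its own id, multiplying by (group != 'Not applicable') zeroes the ids of the "Not applicable" runs, and dense_rank() - 1 squeezes the remaining ids into 0, 1, 2, ... The original rle() attempt only saw TRUE/FALSE, so adjacent "Group of 1" and "Large group" rows were merged into a single run. Note that dense_rank(gr) - 1 relies on at least one "Not applicable" row being present; without one, the first group would be numbered 0.

If you would rather avoid the data.table dependency, here is a minimal dplyr-only sketch of the same idea. The helper columns run_start and counted_start are names I made up for illustration, and it assumes the rows are already in the order you want numbered:

library(dplyr)

example_data <- data.frame(
  value = c(1, 3, 4, 6, 7, 8, 4, 6, 9, 0),
  group = c("Not applicable", "Large group", "Large group", "Not applicable",
            "Group of 1", "Large group", "Large group", "Large group",
            "Group of 1", "Not applicable")
)

result <- example_data %>%
  mutate(
    # TRUE on the first row of every run of identical `group` values
    run_start = row_number() == 1 | group != lag(group),
    # only runs that are not "Not applicable" get a new number
    counted_start = run_start & group != "Not applicable",
    # running count of counted runs, zeroed on "Not applicable" rows
    group_number = cumsum(counted_start) * (group != "Not applicable")
  ) %>%
  select(-run_start, -counted_start)

result

Like the rleid() version, this treats consecutive rows with the same label as one run, so two "Group of 1" rows directly next to each other would share a number. If that can happen in your data, you would need an extra condition such as run_start | group == "Group of 1".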