Hope you can help me out. Have a dataset df with 3 columns - personID, operationID, and a row number.
Want to make a new column(row_intend) which should count based on both personID and operationID.
In the example the fourth row shows how I would like the outcome. Grouped by record_id, it should count based on operationID.
person_id <- c("1", "1", "1", "2", "2", "2", "2", "3", "3")
operation_id <- c("60533", "60533", "60534", "50677", "50678", "50678", "50679", "78322", "78322")
row <- c("1", "2", "3", "4", "5", "6", "7", "8", "9")
row_intend <- c("1", "1", "2", "1", "2", "2", "3", "1", "1")
df <- data.frame(person_id, operation_id, row, row_intend)
df
I would like the output to be:
person_id operation_id row row_intend
1 1 60533 1 1
2 1 60533 2 1
3 1 60534 3 2
4 2 50677 4 1
5 2 50678 5 2
6 2 50678 6 2
7 2 50679 7 3
8 3 78322 8 1
9 3 78322 9 1
I tried with group_by and mutate(row=rownumber). But this does not take into account my second condition (count ALSO based on operation ID)
CodePudding user response:
We may group by 'person_id' and get the match
between the operation_id and its unique
values
library(dplyr)
df <-df %>%
group_by(person_id) %>%
mutate(row_intend2 = match(operation_id, unique(operation_id))) %>%
ungroup
-output
> df
# A tibble: 9 × 5
person_id operation_id row row_intend row_intend2
<chr> <chr> <chr> <chr> <int>
1 1 60533 1 1 1
2 1 60533 2 1 1
3 1 60534 3 2 2
4 2 50677 4 1 1
5 2 50678 5 2 2
6 2 50678 6 2 2
7 2 50679 7 3 3
8 3 78322 8 1 1
9 3 78322 9 1 1