I want to create a column called Visit_occurrance that gives a running count of how many times each Person_ID
has appeared so far in the dataset. For example,
> dput(df)
structure(list(Person_ID = c(123L, 123L, 110L, 145L, 345L, 345L,
345L, 345L, 300L, 234L, 234L, 111L, 110L)), class = "data.frame", row.names = c(NA,
-13L))
Desired output:
> dput(df)
structure(list(Person_ID = c(123L, 123L, 110L, 145L, 345L, 345L,
345L, 345L, 300L, 234L, 234L, 111L, 110L), Visit_occurrance = c(1L,
2L, 1L, 1L, 1L, 2L, 3L, 4L, 1L, 1L, 2L, 1L, 2L)), class = "data.frame", row.names = c(NA,
-13L))
CodePudding user response:
library(dplyr)
df %>%
  group_by(Person_ID) %>%
  mutate(Visit_occurrance = row_number())
   Person_ID Visit_occurrance
       <int>            <int>
 1       123                1
 2       123                2
 3       110                1
 4       145                1
 5       345                1
 6       345                2
 7       345                3
 8       345                4
 9       300                1
10       234                1
11       234                2
12       111                1
13       110                2
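If you would rather not load dplyr, the same running count can probably be obtained in base R. The sketch below is an alternative to the answer above, not part of it: it rebuilds the data frame from the question and uses ave() with seq_along to number the rows within each Person_ID while keeping the original row order.

```r
# Same data as in the question
df <- data.frame(Person_ID = c(123L, 123L, 110L, 145L, 345L, 345L,
                               345L, 345L, 300L, 234L, 234L, 111L, 110L))

# ave() splits the first argument by the grouping vector, applies FUN to
# each group, and puts the results back in the original row positions.
# seq_along() numbers the rows of each group 1, 2, 3, ...
df$Visit_occurrance <- ave(df$Person_ID, df$Person_ID, FUN = seq_along)

print(df)
```

Unlike the dplyr version, this returns a plain data.frame rather than a grouped tibble, so no ungroup() step is needed afterwards.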