I want to calculate the mean and standard deviation for the number of dates (or visits) that people have. Sample data are:
id date
1 2015-02-23
1 2015-04-24
2 2018-05-23
2 2022-12-05
2 2022-12-06
3 2021-05-21
ID1 has 2 visits (evidenced by 2 dates), ID2 has 3 visits, and ID3 has 1 visit, so the mean would be (2 + 3 + 1)/3 = 2.
Does anyone know how to calculate the mean and standard deviation? I tried to do this using the summarize function, but I can't get it to work.
CodePudding user response:
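You can use the rle function on your id column. It counts runs of consecutive equal values, so the rows need to be sorted by id (as they are here).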
- data
df <- structure(list(id = c(1L, 1L, 2L, 2L, 2L, 3L), date = c("2015-02-23",
"2015-04-24", "2018-05-23", "2022-12-05", "2022-12-06", "2021-05-21"
)), class = "data.frame", row.names = c(NA, -6L))
counts <- rle(df$id)$lengths  # visits per id: 2 3 1
mean(counts)  # 2
sd(counts)    # 1
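If the rows are not guaranteed to be grouped by id, an order-independent count may be safer. Here is a minimal sketch using base R's table(); the variable name visits is only illustrative, and the data frame is rebuilt so the snippet runs on its own:

```r
# Same sample data as in the question
df <- data.frame(
  id = c(1L, 1L, 2L, 2L, 2L, 3L),
  date = c("2015-02-23", "2015-04-24", "2018-05-23",
           "2022-12-05", "2022-12-06", "2021-05-21")
)

# table() counts rows per id regardless of row order
visits <- as.vector(table(df$id))

mean(visits)  # 2
sd(visits)    # 1 (sample standard deviation, n - 1 denominator)
```

Note that sd() returns the sample standard deviation, which may or may not be the one you want for your summary.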