R: count distinct values in tibble/df [duplicate]

Time:09-21

I have the following sample data.frame:

n <- 100
dates <- as.Date(c("2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"))

df <- data.frame(
  date = sample(dates, n, replace = TRUE),
  user = sample(LETTERS, n, replace = TRUE)
)

Each row records that a user (A-Z) made a phone call on that date. If there is no entry for a specific user on a specific date, that user made no call that day. A user can make several phone calls a day, so the same date/user pair can appear more than once.

What I want to know is: how many different users made phone calls on each day? For example, I'd like a table like this:

date        number_of_users_doing_phone_calls
2021-01-01                                 10
2021-01-02                                 16
2021-01-03                                 26
2021-01-04                                 20

Answer:

A dplyr solution: group by date, then count the distinct users in each group with n_distinct():

library(dplyr)

df %>% 
  group_by(date) %>% 
  summarise(number_of_users_doing_phone_calls = n_distinct(user))
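
If dplyr is not available, the same count can probably be obtained in base R. The following is only a sketch, not part of the original answer: it rebuilds the sample data from the question (the set.seed() call and the result name res are assumptions added here so the run is repeatable) and uses aggregate() with length(unique(x)) as the per-date summary.

```r
# Sample data as in the question; the seed is an assumption added for repeatability
set.seed(1)
n <- 100
dates <- as.Date(c("2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"))
df <- data.frame(
  date = sample(dates, n, replace = TRUE),
  user = sample(LETTERS, n, replace = TRUE)
)

# For each date, count how many different users appear at least once
res <- aggregate(user ~ date, data = df, FUN = function(x) length(unique(x)))

# Rename the summary column to match the table asked for in the question
names(res)[2] <- "number_of_users_doing_phone_calls"
res
```

Repeated calls by the same user on the same day collapse into one because unique() is applied before length(), which is the same idea n_distinct() expresses in the dplyr version.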