Home > front end >  Count unique total for combined multiple values
Count unique total for combined multiple values

Time:01-05

I have a dataset that records the products associated with certain accounts. I want to summarise the total number of accounts for a specific set of products, only counting each account number once, no matter how many products they have. So the total for this sample would be 4. (a b c d)

Account Product
a 1
a 2
b 1
c 1
c 2
d 3

The code I have tried so far is

filter(Product == 1 | 2 | 3) %>%
  summarise(total = n_distinct(), .groups = Account)

This gives message Error in summarise_verbose(.groups, caller_env()) : object 'Account' not found

I also tried

filter(Product == 1 | 2 | 3) %>%
  summarise(total = n_distinct(Account))

But this doesn't reduce the number of rows properly - I'm still getting 300,000 rows when I should get 70,000 based on other data I have. Is there a way of counting the (alphanumeric) account numbers once and once only, no matter what the products are?

CodePudding user response:

In the absence of an example of minimal data, I suppose you want to count the different elements by groups and using filters.

Data %>%
filter(Product  %in% c(1,2,3)) %>%
group_by(Account) %>%
  summarise(
    total = n_distinct(Product)
  )

CodePudding user response:

You were close

df %>%
  group_by(Account) %>%
  summarise(
    total = n_distinct(Product)
  )
  •  Tags:  
  • Related