Two metric calculation after summarizing

Time:09-14

In the data below, I want to compute two metrics for each week:

  1. how many unique household-individual combinations there are
  2. how many unique household-individual combinations spent in that week


For week 1 there are 4 unique household-individual combinations, of which 3 spent; for week 2 there are 3 unique combinations, of which 2 spent. The expected output is:

week total present
 1     4       3
 2     3       2

Below is what I have so far, but I am not sure how to add the second calculation:

data %>%
  group_by(week) %>%
  summarise(total = n_distinct(household, individual)) 

Here is the sample dataset:

library(dplyr)   

data <- data.frame(week = c(1,1,1,1,1,2,2,2,2),
                   household = c(1001,1001,1001,1001,1002,1001,1001,1001,1001),
                   individual = c(1,1,2,3,2,1,2,2,3),
                   Spent = c(20,30,90,NA,30,40,50,10,NA))

CodePudding user response:

You could use n_distinct() to count the number of unique combinations in a set of vectors. For the second metric, count the combinations only among the rows where Spent is not NA.

data %>%
  group_by(week) %>%
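  summarise(total = n_distinct(household, individual),
            # keep only rows with a recorded Spent; drop = FALSE keeps the matrix
            # shape even when a single row remains, so rows are still what is counted
            present = n_distinct(cbind(household, individual)[!is.na(Spent), , drop = FALSE]))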

# # A tibble: 2 × 3
#    week total present
#   <dbl> <int>   <int>
# 1     1     4       3
# 2     2     3       2
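
As an alternative, here is a minimal sketch that avoids building a matrix: it first collapses the data to one row per week-household-individual combination, then counts per week. It assumes that "spent" means at least one non-NA value in Spent for that combination; the names spent_any and result are introduced only for this illustration.

```r
library(dplyr)

data <- data.frame(week = c(1,1,1,1,1,2,2,2,2),
                   household = c(1001,1001,1001,1001,1002,1001,1001,1001,1001),
                   individual = c(1,1,2,3,2,1,2,2,3),
                   Spent = c(20,30,90,NA,30,40,50,10,NA))

result <- data %>%
  # one row per unique week-household-individual combination
  group_by(week, household, individual) %>%
  # spent_any (hypothetical name) is TRUE if any row has a non-missing Spent
  summarise(spent_any = any(!is.na(Spent)), .groups = "drop") %>%
  group_by(week) %>%
  # total: number of combinations; present: combinations that spent
  summarise(total = n(), present = sum(spent_any))

print(result)
```

On the sample dataset this should give the same counts as above (4 and 3 for week 1, 3 and 2 for week 2). The intermediate per-combination table can also be handy if further per-individual metrics are needed later.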