Home > Enterprise >  Is there a way to link certain counts to id numbers in R
Is there a way to link certain counts to id numbers in R

Time:11-03

I have data on customers, their orders and products they ordered and I want to determine the return rate per customer. With count I can determine how many times each customer id has returned (1) and did not return (0) their item. However this is in a separate tibble. I want to include these return rates (number of times returned (1))/(total orders (1 0)) and link these to the customer id. How do I do this? The following is the code I used to get the customer id and returns, I just need to know how to link these back into the dataset.

returnrate <- data4 %>%
  count(customer_id, returnyesno)
returnrate

with the following partial output

   customer_id returnyesno     n
         <dbl>       <dbl> <int>
 1      212807           0     2
 2     2108514           0     2
 3     2593268           0     2
 4     3155787           0     1
 5     5769078           0     4
 6     6059185           0     6
 7     6059185           1     8

CodePudding user response:

Is this what you are looking for?

library(tidyverse)

df <- returnrate %>%
  group_by(customer_id) %>%
  mutate(return_ratio = sum(returnyesno * n)/ sum(n))

df
    
# A tibble: 7 x 4
# Groups:   customer_id [6]
  customer_id returnyesno     n return_ratio
        <int>       <int> <int>        <dbl>
1      212807           0     2        0    
2     2108514           0     2        0    
3     2593268           0     2        0    
4     3155787           0     1        0    
5     5769078           0     4        0    
6     6059185           0     6        0.571
7     6059185           1     8        0.571

Then if you want to add this to the data4 table, you can run the following code:

data4 <- left_join(data4, df %>% select(customer_id, return_ratio) %>% unique())
  •  Tags:  
  • r
  • Related