Home > Blockchain >  Counting the number of products per date and get values per product in column
Counting the number of products per date and get values per product in column

Time:12-09

I have a table where the rows are transactions and columns are products bought in that transaction.

An example:

Date Transaction product 1 product 2 product 3
1-1 1 Apple Banana
1-1 2 Orange
1-1 3 Apple Orange
1-2 4
1-2 5 Apple Banana
1-3 6 orange Apple Banana

I want to know how many products were sold on what day. So that I will get a table as such:

Date Apple Banana Orange
1-1 2 1 2
1-2 1 1
1-3 1 1 1

The challenge is, however, that there are a lot of products. Hence I won't be able to manually type them out in order for them to count.

Is anyone able to help me out? Thanks in advance!

CodePudding user response:

library(tidyverse)

data <- tribble(
  ~Date, ~Transaction, ~product.1, ~product.2, ~product.3,
  "1-1",           1L,    "Apple",   "Banana",         NA,
  "1-1",           2L,   "Orange",         NA,         NA,
  "1-1",           3L,    "Apple",   "Orange",         NA,
  "1-2",           4L,         NA,         NA,         NA,
  "1-2",           5L,    "Apple",   "Banana",         NA,
  "1-3",           6L,   "Orange",    "Apple",   "Banana"
  )

data %>%
  pivot_longer(-c(Date, Transaction)) %>%
  count(Date, value) %>%
  pivot_wider(names_from = value, values_from = n, values_fill = list(n = 0))
#> # A tibble: 3 x 5
#>   Date  Apple Banana Orange  `NA`
#>   <chr> <int>  <int>  <int> <int>
#> 1 1-1       2      1      2     4
#> 2 1-2       1      1      0     4
#> 3 1-3       1      1      1     0

Created on 2021-12-08 by the reprex package (v2.0.1)

CodePudding user response:

We could use base R with table - unlist the 'product' columns to a vector while replicating the 'Date' by the number of 'product' columns and use table

table(rep(df1$Date, 3), unlist(df1[3:5]))
     
      Apple Banana Orange
  1-1     2      1      2
  1-2     1      1      0
  1-3     1      1      1

CodePudding user response:

Here is a version with pivoting: @akrun: I tried values_fill = list(n = 0) but gives Error: Can't convert to .

df %>% 
  select(-Transaction) %>% 
  pivot_longer(
    -Date
  ) %>% 
  na.omit() %>% 
  group_by(Date) %>% 
  add_count(value) %>% 
  select(-name) %>% 
  pivot_wider(
    names_from = value,
    values_from = n, 
    values_fn = list
  ) %>% 
  unnest(cols = c(Apple, Banana, Orange)) %>% 
  distinct() 
  Date  Apple Banana Orange
  <chr> <int>  <int>  <int>
1 1-1       2      1      2
2 1-2       1      1     NA
3 1-3       1      1      1
  •  Tags:  
  • r
  • Related