I have a dataframe (df) like this. I have 82 SKUs started from M1 to M82.
SKU date sales
M1 2-jan 4
M2 2-jan 5
M1 3-jan 8
M82 3-jan 1
...
M82 31-dec 9
i want to filter each SKU seperate and then group_by(date)
and summarise(sales_perday = sum(sales)
Something like this
for(i in SKU){
SKU_M[i] <- df %>% filter(SKU == SKU_M[i]) %>% group_by(date)
%>% summarise(sales_perday = sum(sales))
Expected output are 82 dataframes with each SKU in 1 dataframe.
I did this below for 1 SKU but i want it for all 82 in an easy way.
M50 <- df %>% filter(SKU == 'M50') %>% group_by(date) %>% summarise(sales_perday = sum(sales))
CodePudding user response:
You probably want to group by multiple columns:
library(tidyverse)
data <- tribble(
~SKU, ~date, ~sales,
"M1", "2-jan",4,
"M2", "2-jan",5,
"M1", "3-jan",8
)
# the cioncise way
data %>%
group_by(SKU, date) %>%
summarise(sales_perday = sum(sales))
#> `summarise()` has grouped output by 'SKU'. You can override using the `.groups`
#> argument.
#> # A tibble: 3 × 3
#> # Groups: SKU [2]
#> SKU date sales_perday
#> <chr> <chr> <dbl>
#> 1 M1 2-jan 4
#> 2 M1 3-jan 8
#> 3 M2 2-jan 5
# if one really want to have multiple data frames
data %>%
group_by(SKU, date) %>%
summarise(sales_perday = sum(sales)) %>%
nest(-SKU) %>%
pull(data)
#> Warning: All elements of `...` must be named.
#> Did you want `data = -SKU`?
#> `summarise()` has grouped output by 'SKU'. You can override using the `.groups`
#> argument.
#> [[1]]
#> # A tibble: 2 × 2
#> date sales_perday
#> <chr> <dbl>
#> 1 2-jan 4
#> 2 3-jan 8
#>
#> [[2]]
#> # A tibble: 1 × 2
#> date sales_perday
#> <chr> <dbl>
#> 1 2-jan 5
Created on 2022-06-08 by the reprex package (v2.0.0)
CodePudding user response:
Another option with split
:
df <- df |>
group_by(date) |>
summarise(sales_perday = sum(sales))
split(df, df$SKU)
CodePudding user response:
If you really do want separate data frames, then after grouping by SKU and date, and then summarizing, use group_split()
to partition by SKU.
library(tidyverse)
df <- tribble(
~SKU, ~date, ~sales,
"M1", "2-jan",4,
"M2", "2-jan",5,
"M1", "3-jan",8
)
df |>
group_by(SKU, date) |>
summarise(sales_perday = sum(sales)) |>
group_split()
#> `summarise()` has grouped output by 'SKU'. You can override using the `.groups`
#> argument.
#> <list_of<
#> tbl_df<
#> SKU : character
#> date : character
#> sales_perday: double
#> >
#> >[2]>
#> [[1]]
#> # A tibble: 2 × 3
#> SKU date sales_perday
#> <chr> <chr> <dbl>
#> 1 M1 2-jan 4
#> 2 M1 3-jan 8
#>
#> [[2]]
#> # A tibble: 1 × 3
#> SKU date sales_perday
#> <chr> <chr> <dbl>
#> 1 M2 2-jan 5