Type Date Cost
Shampoo 01/31/2022 $10
Shampoo 01/31/2022 $15
Shampoo 02/22/2019 $15
Conditioner 03/15/2020 $17
Conditioner 05/16/2022 $19
Soap. 01/31/2021 $5
Soap 01/06/2022 $2
Soap 12/31/2019 $3
Soap 10/10/2022 $5
How would I approach summing total cost for specific items in a year, months, quarter and total cost
Example Output:
Type | Number Items | Year | Total Cost
Shampoo | 2 | 2022 | 25
Shampoo | 1. | 2019 | 15
etc...
split by month, and quarter
Trying summarize and library(lubridate)
CodePudding user response:
library(tidyverse)
library(lubridate)
df %>%
group_by(Type, Date)%>%
summarise(Number_Items = n(),
Year = year(mdy(Date[1])),
Total_Cost = sum(parse_number(Cost)),
.groups = 'drop')
# A tibble: 8 × 5
Type Date Number_Items Year Total_Cost
<chr> <chr> <int> <dbl> <dbl>
1 Conditioner 03/15/2020 1 2020 17
2 Conditioner 05/16/2022 1 2022 19
3 Shampoo 01/31/2022 2 2022 25
4 Shampoo 02/22/2019 1 2019 15
5 Soap 01/06/2022 1 2022 2
6 Soap 10/10/2022 1 2022 5
7 Soap 12/31/2019 1 2019 3
8 Soap. 01/31/2021 1 2021 5
CodePudding user response:
You have to group your data by type and year, and then count the number of items and total cost, here an example for you to adapt.
library(dplyr)
library(lubridate)
your_data_frame %>%
group_by(type, year = year(dmy(Date))) %>%
summarise(
number_of_items = n(),
total_cost = sum(cost,na.rm = TRUE)
)
CodePudding user response:
This one is very similar to @onyambu's solution. But it differs in the grouping:
library(dplyr)
library(readr) # parse_number()
library(lubridate
df %>%
mutate(Year = year(mdy(Date))) %>%
group_by(Year, Type) %>%
summarise(`Number Items` = n(),
`Total Cost` = sum(parse_number(Cost)))
Year Type `Number Items` `Total Cost`
<dbl> <chr> <int> <dbl>
1 2019 Shampoo 1 15
2 2019 Soap 1 3
3 2020 Conditioner 1 17
4 2021 Soap. 1 5
5 2022 Conditioner 1 19
6 2022 Shampoo 2 25
7 2022 Soap 2 7