Get the rate of change by finding the change in price-CodePudding

I am fairly new to programming. I am working on my portfolio, and am looking at a dataset regarding the price of food from distribution centers to a grocery store. What I am looking at is a set of data with the price, item, and date of transaction. What I am looking for is to find the rate of change from the distribution center to the store, and when it happened.

Note: the price of the item changes from the distribution center.

Here is an example of what I am looking at:

Date	Item	Price	Kg
01.02.2022	Apple	$1.00	1
02.02.2022	Meat	$4.00	1
03.02.2022	Fish	$3.00	1
03.02.2022	Bread	$1.00	1
15.02.2022	Meat	$5.00	1
15.02.2022	Meat	$3.00	1
16.02.2022	Apple	$2.00	1
20.02.2022	Fish	$3.00	1
25.02.2022	Apple	$0.50	1

As you can see, the price for the same quantity for the same product changes randomly over time. What I would like to analyse is:

The rate of change per item
When the change occured

This is the ideal outcome:

item	kg	1st_price	1st_price_date	2nd_price	2nd_price_date	amount_of_change
Apple	1	$1.00	01.02.2022	$2.00	16.02.2022	$1.00
Meat	1	$4.00	02.02.2022	$5.00	15.02.2022	$1.00
Bread	1	$1.00	03.02.2022	N/A	N/A	N/A
Fish	1	$3.00	03.02.2022	$3.00	20.02.2022	$0.00

#Continuing the table below. These columns would go to the right of the columns above. #Unfortunetly, StackOverflow was not able to create a table with everything together. #total_change is for the entire period

item	3rd_price	3rd_price_date	amount_of_change	change_duration_period	total_change
Apple	$0.50	25.02.2022	-$1.50	01.02.2022-25.02.2022	-$0.50
Meat	$3.00	15.02.2022	-$2.00	02.02.2022-1502.2022	-$1.00
Bread	N/A	N/A	N/A	03.02.2022-03.02-2022	$0.00
Fish	$3.00	20.02.2022	$0.00	03.02.2022-20.02.2022	$0.00

As you can see, some items can have more price changes per month than others depending on the item. Some items have drastic changes, some have no changes at all.

Presuming there are over 14,000 unique items what would you recommend to gather the data an place them in a table as seen in the "Ideal outcome" section?

I am still new to programming, please don't be too harsh!

Thanks!

CodePudding user response：

Something like this?

library(tidyverse)

df %>%
  mutate(Date = as.Date(Date, format = "%d.%m.%Y"),
         Price = parse_number(Price)) %>%
  group_by(Item) %>%
  arrange(Date) %>%
  mutate(appearance = row_number(),
         change = Price - lag(Price)) %>%
  ungroup() %>%
  pivot_wider(names_from = appearance, 
              values_from = c(Date, Price, change),
              names_vary = "slowest")

Result

# A tibble: 4 × 11
  Item     Kg Date_1     Price_1 change_1 Date_2     Price_2 change_2 Date_3     Price_3 change_3
  <chr> <int> <date>       <dbl>    <dbl> <date>       <dbl>    <dbl> <date>       <dbl>    <dbl>
1 Apple     1 2022-02-01       1       NA 2022-02-16       2        1 2022-02-25     0.5     -1.5
2 Meat      1 2022-02-02       4       NA 2022-02-15       5        1 2022-02-15     3       -2  
3 Fish      1 2022-02-03       3       NA 2022-02-20       3        0 NA            NA        0  
4 Bread     1 2022-02-03       1       NA NA              NA        0 NA            NA        0

Source data

df <- data.frame(
  stringsAsFactors = FALSE,
              Date = c("01.02.2022","02.02.2022",
                       "03.02.2022","03.02.2022","15.02.2022","15.02.2022",
                       "16.02.2022","20.02.2022","25.02.2022"),
              Item = c("Apple","Meat","Fish",
                       "Bread","Meat","Meat","Apple","Fish","Apple"),
             Price = c("$1.00","$4.00","$3.00",
                       "$1.00","$5.00","$3.00","$2.00","$3.00","$0.50"),
                Kg = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)
)