How to find first time a price has changed in SQL

Time:11-04

I have a table that contains an item ID, the date, and the price. Every item has a price for each day, but I want to select only the items whose price has not changed, and to show the number of days without a change.

An example of the table is

id     Price  Day  Month  Year
asdf   10     03   11     2022
asdr1  8      03   11     2022
asdf   10     02   11     2022
asdr1  8      02   11     2022
asdf   10     01   11     2022
asdr1  7      01   11     2022
asdf   9      31   10     2022
asdr1  8      31   10     2022
asdf   8      31   10     2022
asdr1  8      31   10     2022

The output I want is:

Date        id     Last_Price  First_Price_Appearance  DaysWOchange
2022-11-03  asdf   10          2022-11-01              2
2022-11-03  asdr1  8           2022-11-02              1

The solution needs to run quickly, so what are some efficient ways to solve this, considering that the table has millions of rows and that some items have not changed their price in years?

The efficiency problem arises because, for each id, I would need to scan the entire table looking for the first row where the price changed, and repeat this for thousands of items.

I am currently calculating the difference between the latest price and every row in the history, but this becomes slow and may take several minutes to run over the full history. The main concern for this problem is efficiency.

CodePudding user response:

DECLARE @table TABLE (id NVARCHAR(5), Price INT, Date DATE)
INSERT INTO @table (id, Price, Date) VALUES
('asdf', 10,    '2022-10-20'),  
('asdr1', 8,    '2022-10-15'),
('asdf', 10,    '2022-11-03'),  
('asdr1', 8,    '2022-11-02'),  
('asdf', 10,    '2022-11-02'),  
('asdr1', 8,    '2022-11-02'),  
('asdf', 10,    '2022-11-01'),  
('asdr1', 7,    '2022-11-01'),  
('asdf',  9,    '2022-10-31'),  
('asdr1', 8,    '2022-10-31'),  
('asdf',  8,    '2022-10-31'),  
('asdr1', 8,    '2022-10-31')

Tables of data are useful, but they are even more useful if you put the demo data into an object, as above.

SELECT id, FirstDate, LastChange, Price, DaysSinceChange
  FROM (
        SELECT id, MIN(Date) OVER (PARTITION BY id ORDER BY Date) AS FirstDate, Date AS LastChange, Price, 
               CASE WHEN LEAD(Date,1) OVER (PARTITION BY id ORDER BY Date) IS NULL THEN DATEDIFF(DAY,Date,CURRENT_TIMESTAMP) 
                    ELSE DATEDIFF(DAY,LAG(Date) OVER (PARTITION BY id ORDER BY Date),Date)
               END AS DaysSinceChange, ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS rn
               
          FROM @table
       ) a
 WHERE rn = 1

This is a quick way to get what you want:

id      FirstDate   LastChange  Price   DaysSinceChange
-------------------------------------------------------
asdf    2022-10-20  2022-11-03  10      0
asdr1   2022-10-15  2022-11-02  8       1

If you execute the subquery by itself, you can see all the history:

        SELECT id, MIN(Date) OVER (PARTITION BY id ORDER BY Date) AS FirstDate, Date AS LastChange, Price,
               CASE WHEN LEAD(Date,1) OVER (PARTITION BY id ORDER BY Date) IS NULL THEN DATEDIFF(DAY,Date,CURRENT_TIMESTAMP)
                    ELSE DATEDIFF(DAY,LAG(Date) OVER (PARTITION BY id ORDER BY Date),Date)
               END AS DaysSinceChange, ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS rn
          FROM @table

id      FirstDate   LastChange  Price   DaysSinceChange rn
----------------------------------------------------------
asdf    2022-10-20  2022-11-03  10      0               1
asdf    2022-10-20  2022-11-02  10      1               2
asdf    2022-10-20  2022-11-01  10      1               3
asdf    2022-10-20  2022-10-31  9       11              4
asdf    2022-10-20  2022-10-31  8       0               5
asdf    2022-10-20  2022-10-20  10      NULL            6
asdr1   2022-10-15  2022-11-02  8       1               1
asdr1   2022-10-15  2022-11-02  8       1               2
asdr1   2022-10-15  2022-11-01  7       1               3
asdr1   2022-10-15  2022-10-31  8       16              4
asdr1   2022-10-15  2022-10-31  8       0               5
asdr1   2022-10-15  2022-10-15  8       NULL            6

CodePudding user response:

You can use lag() and a window max():

select id, date, price
from (select t.*,
             max(case when price <> lag_price then date end) over (partition by id) as price_change_date
      from (select t.*, lag(price) over (partition by id order by date) as lag_price
            from t
           ) t
     ) t
where price_change_date is null;

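This calculates the most recent date of a price change for each id (the inner lag() compares each row's price with the previous day's price). It then keeps only the rows for ids where no price change has ever occurred, since price_change_date is null for those. The window functions should be efficient, since they can take advantage of indexes on (id, date) and (id, price, date).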
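
The same lag() idea can be extended to produce the output the question asks for. The following is a minimal sketch, not a tuned solution: it runs the SQL through Python's built-in sqlite3 module (SQLite 3.25 or later, for window functions) purely so it can be executed without a server. The table name t, the index name idx_t_id_date, the function name days_without_change and the output column names are assumptions made for illustration, and the Day/Month/Year columns are assumed to be combined into one ISO date. julianday() is SQLite-specific; in SQL Server the day difference would be written with DATEDIFF(DAY, ...) instead.

```python
import sqlite3

# Sample rows from the question, with Day/Month/Year combined into ISO dates.
ROWS = [
    ("asdf", 10, "2022-11-03"), ("asdr1", 8, "2022-11-03"),
    ("asdf", 10, "2022-11-02"), ("asdr1", 8, "2022-11-02"),
    ("asdf", 10, "2022-11-01"), ("asdr1", 7, "2022-11-01"),
    ("asdf", 9, "2022-10-31"), ("asdr1", 8, "2022-10-31"),
    ("asdf", 8, "2022-10-31"), ("asdr1", 8, "2022-10-31"),
]

QUERY = """
WITH flagged AS (
    -- One ordered pass per id: previous price and latest price on every row.
    SELECT id, date, price,
           LAG(price) OVER (PARTITION BY id ORDER BY date) AS lag_price,
           FIRST_VALUE(price) OVER (PARTITION BY id ORDER BY date DESC) AS latest_price
    FROM t
),
summary AS (
    SELECT id,
           MAX(date) AS last_date,
           MAX(latest_price) AS last_price,
           -- Latest date on which the price differs from the previous row;
           -- if the price never changed, fall back to the first date seen.
           COALESCE(MAX(CASE WHEN price <> lag_price THEN date END),
                    MIN(date)) AS first_appearance
    FROM flagged
    GROUP BY id
)
SELECT last_date, id, last_price, first_appearance,
       CAST(julianday(last_date) - julianday(first_appearance) AS INTEGER)
           AS days_wo_change
FROM summary
ORDER BY id
"""


def days_without_change(rows):
    """Load (id, price, date) rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (id TEXT, price INTEGER, date TEXT)")
        # Hypothetical index matching the PARTITION BY id ORDER BY date pattern.
        conn.execute("CREATE INDEX idx_t_id_date ON t (id, date)")
        conn.executemany("INSERT INTO t (id, price, date) VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in days_without_change(ROWS):
        print(row)
```

On the sample rows this returns one row per id: asdf with first appearance 2022-11-01 and 2 days without change, and asdr1 with 2022-11-02 and 1 day, matching the output requested in the question. Two rows for the same id on the same date (as on 2022-10-31) make the lag() order ambiguous, so real data would need a tie-breaking column in the ORDER BY.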