I have a table of price lists that specifies the price of an item in a country for a specific time period. The problem is that there are overlapping price lists for the same item in the same country. For each set of overlapping price lists, I want to select the one with the latest "Valid from" date. In the picture below, the highlighted rows are the desired end result out of the 7 example rows:
Data sample:
CREATE TABLE #PriceList
(
    [Country] VARCHAR(15),
    [Price list] VARCHAR(25),
    [Item number] INTEGER,
    [Price] DECIMAL(10, 2),
    [Valid from] INTEGER,
    [Valid to] INTEGER
);
INSERT INTO #PriceList
(
    Country,
    [Price list],
    [Item number],
    Price,
    [Valid from],
    [Valid to]
)
VALUES
    ('United Kingdom', 'Price list Tech', 33333, 20.50, 20200101, 20201231),
    ('United Kingdom', 'Price list Tech', 33333, 20.50, 20200102, 20201231),
    ('Germany', 'Price list stuff', 44444, 15.50, 20200102, 20201231),
    ('Germany', 'Price list stuff', 44444, 15.50, 20200101, 20200630),
    ('USA', 'Price list textile new', 55555, 32.25, 20200102, 20201231),
    ('USA', 'Price list textile', 55555, 32.50, 20200101, 20200630),
    ('Germany', 'Price list stuff', 44444, 15.50, 20210101, 20210630);
What I have tried:
SELECT Country,
       [Price list],
       [Item number],
       [Price],
       [Valid from],
       [Valid to]
FROM (
    SELECT Country,
           [Price list],
           [Item number],
           [Price],
           [Valid from],
           [Valid to],
           ROW_NUMBER() OVER (PARTITION BY a.Country,
                                           a.[Item number],
                                           a.[Valid to]
                              ORDER BY a.[Valid from] DESC
                             ) AS rn
    FROM #PriceList a
) b
WHERE b.rn = 1
OUTPUT:
As you can see there are still overlapping price lists.
If I remove "Valid to" from the PARTITION BY clause, ROW_NUMBER eliminates too many price lists: it removes two of Germany's price lists when it should only have removed the overlapping one. Without "Valid to", it simply keeps the row with the latest "Valid from" date per country and item.
SELECT Country,
       [Price list],
       [Item number],
       [Price],
       [Valid from],
       [Valid to]
FROM (
    SELECT Country,
           [Price list],
           [Item number],
           [Price],
           [Valid from],
           [Valid to],
           ROW_NUMBER() OVER (PARTITION BY a.Country,
                                           a.[Item number]
                              ORDER BY a.[Valid from] DESC
                             ) AS rn
    FROM #PriceList a
) b
WHERE b.rn = 1
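To make the overlaps themselves visible, a self-join along the following lines could serve as a diagnostic. This is only a sketch: it assumes the #PriceList table from the data sample, treats the YYYYMMDD integers as comparable dates, and counts two periods as overlapping when the later one starts on or before the day the earlier one ends. The output column aliases are made up for illustration.

```sql
-- Diagnostic sketch: list pairs of price lists for the same country and item
-- whose validity periods overlap. Assumes dates are stored as YYYYMMDD integers.
SELECT a.Country,
       a.[Item number],
       a.[Valid from] AS [Earlier from],   -- illustrative alias
       a.[Valid to]   AS [Earlier to],     -- illustrative alias
       b.[Valid from] AS [Later from],     -- illustrative alias
       b.[Valid to]   AS [Later to]        -- illustrative alias
FROM #PriceList a
JOIN #PriceList b
    ON  b.Country       = a.Country
    AND b.[Item number] = a.[Item number]
    AND b.[Valid from]  > a.[Valid from]   -- b starts after a ...
    AND b.[Valid from] <= a.[Valid to]     -- ... but before a has ended
ORDER BY a.Country, a.[Item number], a.[Valid from];
```

On the sample data this should list three overlapping pairs, one each for Germany, United Kingdom and USA; Germany's 2021 price list does not appear because it starts after the 2020 ones have ended. Pairs that share the exact same "Valid from" are not reported by this sketch.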
All help, tips and ideas are highly appreciated!
CodePudding user response:
Here we use RANK() OVER (PARTITION BY ...) in a CTE to identify the rows that we want and then filter on the ranking in the WHERE clause.
I have also modified the column names to replace spaces with underscores, which makes queries much easier to write.
I've added Valid_from to the ORDER BY in the RANK() to break the tie between the duplicate Valid_to values for United Kingdom.
CREATE TABLE PriceList
(
    [Country] VARCHAR(15),
    [Price_list] VARCHAR(25),
    [Item_number] INTEGER,
    [Price] DECIMAL(10, 2),
    [Valid_from] INTEGER,
    [Valid_to] INTEGER
);
INSERT INTO PriceList
(
    Country,
    [Price_list],
    [Item_number],
    Price,
    [Valid_from],
    [Valid_to]
)
VALUES
    ('United Kingdom', 'Price list Tech', 33333, 20.50, 20200101, 20201231),
    ('United Kingdom', 'Price list Tech', 33333, 20.50, 20200102, 20201231),
    ('Germany', 'Price list stuff', 44444, 15.50, 20200102, 20201231),
    ('Germany', 'Price list stuff', 44444, 15.50, 20200101, 20200630),
    ('USA', 'Price list textile new', 55555, 32.25, 20200102, 20201231),
    ('USA', 'Price list textile', 55555, 32.50, 20200101, 20200630),
    ('Germany', 'Price list stuff', 44444, 15.50, 20210101, 20210630);
GO
7 rows affected
select * from pricelist;
GO
Country        | Price_list             | Item_number | Price | Valid_from | Valid_to
:------------- | :--------------------- | ----------: | ----: | ---------: | -------:
United Kingdom | Price list Tech        |       33333 | 20.50 |   20200101 | 20201231
United Kingdom | Price list Tech        |       33333 | 20.50 |   20200102 | 20201231
Germany        | Price list stuff       |       44444 | 15.50 |   20200102 | 20201231
Germany        | Price list stuff       |       44444 | 15.50 |   20200101 | 20200630
USA            | Price list textile new |       55555 | 32.25 |   20200102 | 20201231
USA            | Price list textile     |       55555 | 32.50 |   20200101 | 20200630
Germany        | Price list stuff       |       44444 | 15.50 |   20210101 | 20210630
WITH prices AS (
    SELECT Country,
           Item_Number,
           Price,
           Valid_From,
           Valid_To,
           RANK() OVER (PARTITION BY Country, Item_Number
                        ORDER BY Valid_To DESC, Valid_From
                       ) AS ranking
    FROM PriceList
)
SELECT Country,
       Item_Number,
       Price,
       Valid_From,
       Valid_To
FROM prices
WHERE ranking = 1;
GO
Country        | Item_Number | Price | Valid_From | Valid_To
:------------- | ----------: | ----: | ---------: | -------:
Germany        |       44444 | 15.50 |   20210101 | 20210630
United Kingdom |       33333 | 20.50 |   20200101 | 20201231
USA            |       55555 | 32.25 |   20200102 | 20201231
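The RANK() query keeps a single row per country and item. If the goal is instead to keep one row per group of overlapping periods, a "gaps and islands" variant might be worth trying. The following is a sketch under stated assumptions, not a tested drop-in: it uses the PriceList table defined in this answer, treats two periods as overlapping when one starts on or before the latest Valid_to seen so far, and assumes there are no fully identical rows. The names ordered, islands, ranked, prev_max_to, island_id and rn are made up for illustration.

```sql
-- Sketch only: group overlapping periods into "islands" per country and item,
-- then keep the row with the latest Valid_from inside each island.
WITH ordered AS (
    SELECT Country, Price_list, Item_number, Price, Valid_from, Valid_to,
           -- Latest Valid_to among all rows that sort before this one
           MAX(Valid_to) OVER (
               PARTITION BY Country, Item_number
               ORDER BY Valid_from, Valid_to, Price_list
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
           ) AS prev_max_to
    FROM PriceList
),
islands AS (
    SELECT Country, Price_list, Item_number, Price, Valid_from, Valid_to,
           -- A row starts a new island when no earlier period reaches it;
           -- the running sum of those starts numbers the islands.
           SUM(CASE WHEN prev_max_to >= Valid_from THEN 0 ELSE 1 END) OVER (
               PARTITION BY Country, Item_number
               ORDER BY Valid_from, Valid_to, Price_list
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS island_id
    FROM ordered
),
ranked AS (
    SELECT Country, Price_list, Item_number, Price, Valid_from, Valid_to,
           ROW_NUMBER() OVER (
               PARTITION BY Country, Item_number, island_id
               ORDER BY Valid_from DESC, Valid_to DESC
           ) AS rn
    FROM islands
)
SELECT Country, Price_list, Item_number, Price, Valid_from, Valid_to
FROM ranked
WHERE rn = 1;
```

On the sample data this should keep four rows: the 20200102 to 20201231 row for each of Germany, United Kingdom and USA, plus Germany's 20210101 to 20210630 row. Note that overlaps are treated as transitive here, so a chain of periods in which each overlaps the next collapses into a single island and only its latest-starting row survives.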
CodePudding user response:
Try this simplified query, which returns one row per Country / Item / year of [Valid from] together with the matching [Price list] and price.
The window function FIRST_VALUE() can do the trick along with DISTINCT.
SELECT DISTINCT
       Country,
       [Item number],
       [year]       = [Valid from] / 10000,
       [Price List] = FIRST_VALUE([Price list]) OVER (PARTITION BY Country, [Item number], [Valid from] / 10000 ORDER BY [Valid from] DESC),
       [price]      = FIRST_VALUE(Price) OVER (PARTITION BY Country, [Item number], [Valid from] / 10000 ORDER BY [Valid from] DESC)
FROM #PriceList
results:
Country        | Item number | year | Price List             | price
-------------- | ----------- | ---- | ---------------------- | -----
Germany        | 44444       | 2020 | Price list stuff       | 15.50
Germany        | 44444       | 2021 | Price list stuff       | 15.50
United Kingdom | 33333       | 2020 | Price list Tech        | 20.50
USA            | 55555       | 2020 | Price list textile new | 32.25
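If the validity dates are also needed in the output, the same per-year grouping could be written with ROW_NUMBER() instead of DISTINCT and FIRST_VALUE(). This is a sketch that relies on the same implicit assumption as the per-year partition: overlapping price lists always start in the same calendar year, and price lists starting in different years never overlap. The alias t and the column rn are made up for illustration.

```sql
-- Sketch: keep the latest-starting row per country, item and start year,
-- returning the full row including its validity dates.
SELECT Country,
       [Price list],
       [Item number],
       Price,
       [Valid from],
       [Valid to]
FROM (
    SELECT Country,
           [Price list],
           [Item number],
           Price,
           [Valid from],
           [Valid to],
           ROW_NUMBER() OVER (
               PARTITION BY Country, [Item number], [Valid from] / 10000  -- integer division yields the year
               ORDER BY [Valid from] DESC
           ) AS rn
    FROM #PriceList
) AS t
WHERE rn = 1;
```

Because [Valid from] is an INTEGER column, dividing by 10000 truncates to the year (for example 20200102 becomes 2020). If a price list valid from late in one year overlaps one starting early in the next, both would be kept by this approach.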