Ok, I’m stuck on an SQL query and tried long enough that it’s time to ask for help :) I'm using Objection.js – but that's not super relevant as I really just can't figure out how to structure the SQL.
I have the following example data set:
Items
id | name |
---|---|
1 | Test 1 |
2 | Test 2 |
3 | Test 3 |
Listings
id | item_id | price | created_at |
---|---|---|---|
1 | 1 | 100 | 1654640000 |
2 | 1 | 60 | 1654640001 |
3 | 1 | 80 | 1654640002 |
4 | 2 | 90 | 1654640003 |
5 | 2 | 90 | 1654640004 |
6 | 3 | 50 | 1654640005 |
What I’m trying to do:
- Return the lowest priced listing for each item
- If all listings for an item have the same price, I want to return the newest of the two items
- Overall, I want to return the resulting items by price
I’m trying to write a query that returns the data:
id | item_id | name | price | created_at |
---|---|---|---|---|
6 | 3 | Test 3 | 50 | 1654640005 |
2 | 1 | Test 1 | 60 | 1654640001 |
5 | 2 | Test 2 | 90 | 1654640004 |
Any help would be greatly appreciated! I'm also starting fresh, so I can add new columns to the data if that would help at all :)
An example of where my query is right now:
select * from "listings" inner join (select "item_id", MIN(price) as "min_price" from "listings" group by "item_id") as "grouped_listings" on "listings"."item_id" = "grouped_listings"."item_id" and "listings"."price" = "grouped_listings"."min_price" where "listings"."sold_at" is null and "listings"."expires_at" > ? order by CAST(price AS DECIMAL) ASC limit ?;
This gets me listings – but if two listings have the same price, it returns multiple listings with the same item_id – not ideal.
CodePudding user response:
Aggregation functions, as the MIN
function you employed in your query, is a viable option, yet if you want to have an efficient query for your problem, window functions can be your best friends. This class of functions allow to compute values over "windows" (partitions) of your table given some specified columns.
For the solution to this problem I'm going to compute two values using the window functions:
- the minimum value for "listings.price", by partitioning on "listings.item_id",
- the maximum value for "created_at", by partitioning on "listings.item_id" and listings.price
SELECT *,
MIN(price) OVER(PARTITION BY item_id) AS min_price,
MAX(created_at) OVER(PARTITION BY item_id, price) AS max_created_at
FROM listings
Once you have all records of listings associated to the corresponding minimum price and latest date, it's necessary for you to select the records whose
- price equals the minimum price
- created_at equals the most recent created_at
WITH cte AS (
SELECT *,
MIN(price) OVER(PARTITION BY item_id) AS min_price,
MAX(created_at) OVER(PARTITION BY item_id, price) AS max_created_at
FROM listings
)
SELECT id,
item_id,
price,
created_at
FROM cte
WHERE price = min_price
AND created_at = max_created_at
If you need to order by price, it's sufficient to add a ORDER BY price
clause.
Check the demo here.
CodePudding user response:
Given the postgresql tag, this should work:
with listings_numbered as (
select *, row_number() over (
partition by item_id
order by price asc, created_at desc
) as rownum
from listings
)
select l.id, l.item_id, i.name, l.price, l.created_at
from listings_numbered l
join items i on l.item_id=i.id
where l.rownum=1
order by price asc;
This is a bit of an advanced query, using window functions and a common table expression, but we can break it down.
with listings_numbered as (...) select
simply means to run the query inside of the ...
, and then we can refer to the results of that query as listings_numbered
inside of the select, as though it was a table.
We're selecting all of the columns in listings
, plus one more:
row_number() over (partition by item_id order by price asc, created_at desc)
. partition by item_id
means that we would like the row number to reset for each new item_id
, and the order by
specifies the ordering that the rows should get within each partition before we number them: first increasing by price, then decreasing by creation time to break ties.
The result of the CTE listings_numbered
looks like:
id | item_id | price | created_at | rownum |
---|---|---|---|---|
2 | 1 | 60 | 1654640001 | 1 |
3 | 1 | 80 | 1654640002 | 2 |
1 | 1 | 100 | 1654640000 | 3 |
5 | 2 | 90 | 1654640004 | 1 |
4 | 2 | 90 | 1654640003 | 2 |
6 | 3 | 50 | 1654640005 | 1 |
If you look at only the rows where rownum
(the last column) is 1, then you can see that it's exactly the set of listings that you're interested in.
The outer query then selects from this this dataset, joins on items
to get the name, filters to only the listings where rownum
is 1, and sorts by price, to get the final result:
id | item_id | name | price | created_at |
---|---|---|---|---|
6 | 3 | Test 3 | 50 | 1654640005 |
2 | 1 | Test 1 | 60 | 1654640001 |
5 | 2 | Test 2 | 90 | 1654640004 |