Home > Software engineering >  How to get columns which are not in GROUP BY?
How to get columns which are not in GROUP BY?

Time:10-02

I have a Postgresql database where I have these two tables.

shipping_method

id | name     | abbrev
--- ---------- -------
1  | Standard | ST
2  | Express  | EX

shipping_details:

id shipping_method_id estimated_time_min estimated_time_max price
2 1 02:00:00 04:00:00 230
3 2 00:03:00 01:00:00 500
4 1 02:00:00 04:00:00 1230
5 1 02:00:00 04:00:00 850
6 2 01:00:00 02:00:00 1785

My goal is to fetch the most expensive shipping details per shipping method (for a specific product [not in OP]).

So far, I wrote this query:

SELECT 
    sm.id, sm.name, MAX(sd.price) AS max_price
FROM 
    shipping_details AS sd 
LEFT JOIN 
    shipping_method AS sm ON sm.id = sd.shipping_method_id
GROUP BY 
    sm.id

which returns:

id | name     | max_price
--- ---------- ---------
2  | Express  | 1785
1  | Standard | 1230

Through that query, I am not able to get the shipping_details columns without putting them in GROUP BY clause. I mainly need shipping details for each specific shipping method that has the higher price.

How can I achieve that?

CodePudding user response:

Use DISTINCT ON:

SELECT DISTINCT ON (sm.id) sm.id, sm.name, sd.price AS max_price
FROM shipping_details AS sd 
LEFT JOIN shipping_method AS sm
    ON sm.id = sd.shipping_method_id
ORDER BY sm.id, sd.price DESC;

The above logic will return the shipping method having the max price.

CodePudding user response:

To get additional columns from each row with the highest price in shipping_details, use DISTINCT ON:

SELECT sm.id, sm.name, sd.*
FROM   shipping_method sm
LEFT   JOIN (
   SELECT DISTINCT ON (shipping_method_id)
          shipping_method_id AS id, price AS max_price
          --  add more columns as you like
   FROM   shipping_details
   ORDER  BY shipping_method_id, price DESC NULLS LAST
   ) sd USING (id);

It's typically fastest to aggregate first, and then join.

Note the ORDER BY clause DESC NULLS LAST - since you did not declare price as NOT NULL.

An index on shipping_details (shipping_method_id, prize) makes it fast if shipping_details is big.
Or even on shipping_details (shipping_method_id, prize DESC NULLS LAST) if price can be NULL.

See:

CodePudding user response:

here is one way using lateral join:

select *
from shipping_method sm 
join lateral ( 
 select price from shipping_details sd 
 where sd.shipping_method_id = sm.id
 order by price desc limit 1
) t on TRUE;
  • Related