Home > other >  Using GROUP BY to get the entry with the highest value
Using GROUP BY to get the entry with the highest value

Time:11-26

I need to create a product list with a preview image for each product.

I have a pretty simple data structure for products. One table is for the products, and one table for the images of a product. A product can have any number of images. The structure looks like this:

PRODUCT
id | name
1    Test Product A
2    Test Product B
3    Test Product C

PRODUCTIMAGE
id | productId | file             | priority
1    1           foo.jpg            0
2    1           bar.jpg            1
3    2           something.png      1
4    2           yada.png           0
5    1           yougettheidea.gif  2

Pretty straight forward. The only thing worth mentioning about this is that productimages have a "priority", which is a TINYINT to determine the display order of images for a given product. The idea is: The higher the priority, the earlier the image should be displayed in the list of product images on the detail page. But for this product list that we are about to create, I'm only gonna need one preview image per product.

So as stated initially, my goal is to get a list of all products. So let's start pretty simple:

SELECT *
FROM product

Now I also want to display one preview image in the product page, so I need a little join:

SELECT `p`.*,
    `pi`.`file` `previewImage`
FROM `product` `p`
LEFT JOIN `productImage` `pi` ON (`pi`.`productId` = `p`.`id`)
GROUP BY `p`.`id`

So far so good, this gives me one preview image per product to display on the product list. Just one more step to go: I want the preview image with the highest priority for each product as the preview image. So I tried to use a subquery to get the product images in the desired priority order:

SELECT `p`.*,
    `pi`.`file` `previewImage`
FROM `product` `p`
LEFT JOIN (
    SELECT *
    FROM `productImage`
    ORDER BY `priority` DESC
) `pi` ON (`pi`.`productId` = `p`.`id`)
GROUP BY `p`.`id`

But for some reason this doesn't (reliably) get me the product image with the highest priority for each product. Why is that? I think that GROUP BY is selecting the wrong productImage entry to keep, but why? Shouldn't it pick the first one, which due to the subquery should be the one with the highest priority?

CodePudding user response:

Your group by is a partial group by. You are grouping by product.id so MySQL will group the result per product, but within each group it is free to return any row from productimage table. To get deterministic results, each column from that table needs to be wrapped inside aggregate functions (MIN, MAX, etc) but that will not give you the image with highest priority.

That being said, if you want only one column from the productimage table you can use a subquery inside select:

select product.*, (
    select file
    from productimage
    where productimage.productid = product.id
    order by priority desc
    limit 1 -- this is the important bit
) as productimage_file
from product
  • Related