Home > Back-end >  Why and how does this finding the auction winner query work?
Why and how does this finding the auction winner query work?

Time:11-29

A friend of mine came up with the following query for finding the winning bets

SELECT b1.*
FROM bids as b1
LEFT JOIN bids AS b2 ON b1.item_id = b2.item_id AND b1.bid_price < b2.bid_price
WHERE b2.item_id IS NULL

It appears to be working correctly, but I don't understand how it works and whether it's by accident or the result will be always the same. Can someone please explain, how it works, and especially how does b1.bid_price < b2.bid_price work with apparently non-existent b2.bid_price?

Here's the dump:

CREATE TABLE `bids` (
  `bid_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `bid_price` decimal(10,2) unsigned NOT NULL,
  `user_id` int(11) unsigned NOT NULL,
  `item_id` int(11) unsigned NOT NULL,
  `bid_date_created` datetime NOT NULL DEFAULT current_timestamp(),
  PRIMARY KEY (`bid_id`)
) ENGINE=InnoDB AUTO_INCREMENT=18 DEFAULT CHARSET=utf8mb4

INSERT INTO bids VALUES (1, 14000.33, 12, 1, '2021-10-27 19:07:21');
INSERT INTO bids VALUES (2, 13000.60, 6, 1, '2021-10-27 18:07:21');
INSERT INTO bids VALUES (3, 21000.00, 7, 4, '2021-10-29 18:07:21');
INSERT INTO bids VALUES (4, 17000.25, 6, 4, '2021-10-27 18:07:21');
INSERT INTO bids VALUES (6, 5500.00, 6, 7, '2021-11-21 11:46:17');
INSERT INTO bids VALUES (7, 1000.00, 6, 29, '2021-11-22 11:21:41');
INSERT INTO bids VALUES (8, 18000.00, 6, 1, '2021-11-23 11:21:11');
INSERT INTO bids VALUES (9, 110.00, 14, 30, '2021-11-28 15:24:56');
INSERT INTO bids VALUES (10, 120.00, 13, 30, '2021-11-28 15:25:11');
INSERT INTO bids VALUES (11, 159.00, 14, 30, '2021-11-28 15:25:19');
INSERT INTO bids VALUES (12, 170.00, 13, 30, '2021-11-28 15:25:34');
INSERT INTO bids VALUES (13, 200.00, 14, 30, '2021-11-28 15:25:57');
INSERT INTO bids VALUES (14, 250.00, 13, 30, '2021-11-28 15:26:02');
INSERT INTO bids VALUES (15, 6000.00, 14, 6, '2021-11-28 15:26:30');
INSERT INTO bids VALUES (16, 7300.00, 13, 6, '2021-11-28 15:26:44');
INSERT INTO bids VALUES (17, 10000.00, 14, 6, '2021-11-28 15:29:14');

CodePudding user response:

The query uses LEFT OUTER JOIN to try to find a row b2 that matches the same item_id as b1, but a greater bid_price than the value in b1.

The way left outer joins work is that if there is no match, then the columns from the left table (b1) are returned anyway, and the columns of b2 are NULL. You can see that in your example data if we test the query without its WHERE clause.

mysql> SELECT b1.bid_id, b1.item_id, b2.bid_id, b2.item_id 
FROM bids as b1 LEFT JOIN bids AS b2 ON b1.item_id = b2.item_id 
  AND b1.bid_price < b2.bid_price;
 -------- --------- -------- --------- 
| bid_id | item_id | bid_id | item_id |
 -------- --------- -------- --------- 
|      2 |       1 |      1 |       1 |
|      4 |       4 |      3 |       4 |
|      1 |       1 |      8 |       1 |
|      2 |       1 |      8 |       1 |
|      9 |      30 |     10 |      30 |
|      9 |      30 |     11 |      30 |
|     10 |      30 |     11 |      30 |
|      9 |      30 |     12 |      30 |
|     10 |      30 |     12 |      30 |
|     11 |      30 |     12 |      30 |
|      9 |      30 |     13 |      30 |
|     10 |      30 |     13 |      30 |
|     11 |      30 |     13 |      30 |
|     12 |      30 |     13 |      30 |
|      9 |      30 |     14 |      30 |
|     10 |      30 |     14 |      30 |
|     11 |      30 |     14 |      30 |
|     12 |      30 |     14 |      30 |
|     13 |      30 |     14 |      30 |
|     15 |       6 |     16 |       6 |
|     15 |       6 |     17 |       6 |
|     16 |       6 |     17 |       6 |
|      3 |       4 |   NULL |    NULL |
|      6 |       7 |   NULL |    NULL |
|      7 |      29 |   NULL |    NULL |
|      8 |       1 |   NULL |    NULL |
|     14 |      30 |   NULL |    NULL |
|     17 |       6 |   NULL |    NULL |
 -------- --------- -------- --------- 

Most of the rows match b1 to some row b2 with the same item_id but a greater bid_price.

In the last six rows, the bid b1 has no other bid with a greater bid_price, so those six are the ones that are greatest for each item_id. Those are the ones you want in the result.

An easy way of filtering for those is with the WHERE clause you saw:

WHERE b2.item_id IS NULL

Really, you could use any column from b2 that is known to be non-NULL, because that means the only way it would be NULL is that there was no match because of the LEFT OUTER JOIN.

CodePudding user response:

Here's a quick breakdown of what's happening:

Let's start with the join itself. LEFT JOIN indicates that it will get all results from the left side (b1) and any corresponding results from the right side (b2) if they exist. If not, all the columns in b2 will be null (keep this in mind as it plays a huge part in how this query works).

Next, it joins on item_id, so it only gets the prices for that specific item. Then, it has another join condition, b1.bid_price < b2.bid_price. Any comparison with NULL will ultimately be evaluated as false. Keep in mind that both of these conditions have to be met or else all the rows in b2 are null, and also that b2 is representing each entry in the bets table. This means that if b2 is null, it's either the only bet for that item (because of the b1.item_id = b2.item_id constraint), or it's the highest (because of the b1.bid_price < b2.bid_price constraint).

Lastly, the WHERE clause. As mentioned above, the JOIN clause is set up perfectly so that the highest bids will all have NULL values in b2. That makes this part easy, because now the WHERE clause can simply get every instance where the joined b2 has NULL values.

  • Related