Home > Enterprise >  SQL query on messages table returns not all results. Strange behavior
SQL query on messages table returns not all results. Strange behavior

Time:04-24

Can someone help. I have strange behavior on SQL query to "messages" table. It supposed to return MAX(id)'s for each user contacted target user.

Table is simple:

id(int, ai) | from(int) | dest(int) | text(txt) | time(int) | msg_status(int)

USERS table have only 5 test users.

MSG table have about 40 messages.

When I query most of user ids(1, 2, 3, 4) - I receive normal result.

When I query one specific user No.5- I receive ONE less result.

The query is:

SELECT MAX(`id`) FROM `msg` WHERE `from` = '5' OR `dest` = '5'
GROUP BY (IF(`from` > `dest`,  `from`, `dest`)), (IF(`from` < `dest`,  `dest`, `from`));

For most users it gives normal result. For example for user 1 I have:

  • MAX(id) 37, 30, 33, 36

And it is OK as user No.1 have conversation messages with all other 4 users.

But for user No.5 I have:

  • MAX(id) 36

Thus this is not correct. As user No.5 have last messages as described here:

id  from dest text
35  5    2    hellp
36  5    1    hi there

So there is one less result, as it have to be something like:

  • MAX(id) 35, 36

But it is not.

Can someone suggest what is wrong?

UPD.

Simplifying the query:

 SELECT * FROM `msg` WHERE `id` IN (SELECT MAX(`id`) FROM `msg`
    WHERE `from` = '5' OR `dest` = '5' GROUP BY `from`, `dest`);

I receive result:

id  from desr text 
32  1    5    test
35  5    2    hello
36  5    1    test2 

So oroginal query have to produce 35 and 36 result, thus giving 36 only...

CodePudding user response:

The 2 GROUP BY statements should return the same value (5) for both Id 35 and 36 so MAX(id) will return 36 - which is what you are getting.

Your SQL matches your result so it all seems to be working. If the result is not what you want then you’ll need to change your SQL - and if you want help with that then you’ll need to explain what the logic is that would return IDs 35 and 36

CodePudding user response:

There's a logic error in the GROUP BY expressions:

GROUP BY 
    (IF(`from` > `dest`,  `from`, `dest`)),
    (IF(`from` < `dest`,  `dest`, `from`));

If from > dest, then this is equivalent to GROUP BY from, from; if from < dest, then this is GROUP BY dest, dest. For the two example rows with from = 5, they're grouped by from, and thus in the same group, and thus MAX(id) is 36, with no 35 in the results.

In contrast, ID 1 will be the minimum when compared to any other user ID, so when querying for ID 1, the query will group by the other IDs, guaranteeing they remain separate groups. That is why the query works for ID 1.

To avoid this error, it's better to use the Greatest and Least functions:

  GROUP BY Greatest(`from`, `dest`), Least(`from`, `dest`)
  • Related