In my long, complicated query that is not using aggregation, I have moved one of the AND
ed where clause parts to a new HAVING
clause.
Logically, the result is the same, rows are filtered before returned.
Semantically, the result may be different in some way I don't understand.
But performance-wise, this runs 3x faster. I understand this is because the thing I moved is doing an expensive NOT EXISTS (SELECT ...)
. Previously the server was spending time evaluating this for rows that could be excluded using the other simpler rules.
Are there any official or unofficial rules I have broken with this optimization approach?
CodePudding user response:
no there are no rules as such.
As the joins come before the WHERE clause, you would reduce the number of rows, that will be checked against the WHERE clause.
It is usually somewhat fawned upon, because you could miss some rows that are needed.
So basically you can do it, but have to check , if all wanted rows are there.
CodePudding user response:
- The order of
WHERE
clausesANDed
together --> The optimizer if free to rearrange, however - There are some exceptions:
FULLTEXT
search first; subqueries last. (I am not sure of this.) - Referencing aggregations --> must be in
HAVING
- Otherwise
WHERE
andHAVING
have the same semantics. WHERE
is logically done beforeGROUP BY
;HAVING
is done after.- It seems that you have discovered that
NOT EXISTS
is more efficient if it is somehow forced to come after other tests; and moving it toHAVING
seems to have achieved that.
Submit a bug report (jira.mariadb.com) suggesting that you have found a case where the Optimizer is not juggling them the clauses as well as it should.
If you show us the actual query, we might be able to dig deeper.