I have this query:
SELECT TOP 1
MAX(HORA_LEIT), ST_BOMBA, Q_BOMBA, SEQUENCIAL
FROM
DADOS
WHERE
COD_PONTO = 2085
AND (ST_BOMBA = 'ON' OR ST_BOMBA = 'OFF')
GROUP BY
ST_BOMBA, Q_BOMBA, SEQUENCIAL
ORDER BY
MAX(HORA_LEIT) DESC
I decided to create two indexes:
CREATE INDEX ix_1
ON dados (cod_ponto, St_bomba)
INCLUDE (q_bomba, sequencial, hora_leit)
WHERE St_bomba IN ('ON', 'OFF')
Actual execution plan: https://www.brentozar.com/pastetheplan/?id=HkxiKmDXs
and
CREATE INDEX ix_2
ON dados (cod_ponto, hora_leit)
INCLUDE (St_bomba, q_bomba, sequencial)
WHERE St_bomba IN ('ON', 'OFF')
Actual execution plan: https://www.brentozar.com/pastetheplan/?id=By_1tmDQj
I figured out, as far as I can see, that the first execution plan is better, although the query optimizer is giving preference to the second one.
Am I misevaluating the performance?
CodePudding user response:
Aside from marc said in the comments, the second query plan only has to seek into 1 row in the index (Actual Number of Rows
) and has a perfect cardinality estimate then because it also estimated 1 row.
The first execution plan estimates around 250 rows but actually seeks into 95 rows on the index, so slightly less efficient and a worse cardinality estimate.
But to be honest, for such a simple query plan and small amount of data, you might find the best way to compare is by looking at the TIME STATISTICS
and IO STATISTICS
. E.g. run SET STATISTICS TIME, IO ON;
first, then run each query that produces each plan above and compare. The results will be in the Messages window in SSMS.
TIME STATISTICS
will give you the Parse and Compile Time of the query (which shouldn't be super relevant for this test) and the total CPU and Elapsed Time.
IO STATISTICS
will tell you how many data pages were read from Memory (the Logical Reads) to serve your query.
The plan that required less of either or both is generally better.
CodePudding user response:
The ix_2
is clearly better.
The first thing to note is that the query is written in a very convoluted way.
The query
SELECT TOP 1
MAX(HORA_LEIT), ST_BOMBA, Q_BOMBA, SEQUENCIAL
FROM
DADOS
WHERE
COD_PONTO = 2085
AND (ST_BOMBA = 'ON' OR ST_BOMBA = 'OFF')
GROUP BY
ST_BOMBA, Q_BOMBA, SEQUENCIAL
ORDER BY
MAX(HORA_LEIT) DESC
is equivalent to
SELECT TOP 1
HORA_LEIT, ST_BOMBA, Q_BOMBA, SEQUENCIAL
FROM
DADOS
WHERE
COD_PONTO = 2085
AND (ST_BOMBA = 'ON' OR ST_BOMBA = 'OFF')
ORDER BY
HORA_LEIT DESC
As you are only interested in the TOP 1
the GROUP BY
can be optimized out here.
Both execution plans recognize this and don't contain any aggregation operators.
For index 2 the filtered index condition guarantees that all rows match the ST_BOMBA
condition. It just has to do a backward ordered index seek on COD_PONTO = 2085
and read the first row and then stop (as the second key column is HORA_LEIT
).
For index 1 the index seek is actually two seeks.
- seek on
(COD_PONTO, ST_BOMBA)
=(2085, 'ON')
- seek on
(COD_PONTO, ST_BOMBA)
=(2085, 'OFF')
The combined rows matching either of those conditions then go into the Top N sort to get the TOP 1 matching row as ordered by HORA_LEIT DESC
. It is only 96 rows in this case but is potentially unbounded and just depends on your data.
Reading a single row and stopping is better than reading an arbitrary number of rows and sorting them.