For example I have a query:-
Select *
from table
where (column1 = 'A' and Column2 = 'B' and Column3 = 'C')
OR (Column1= 'z' and Column4 = 'X');
If I have to create a DB index for this query,
Do we need separate DB index for the first half and second half of the OR condition of the query ?
or
A single DB index would be enough containing all columns present in the query?
CodePudding user response:
TL;DR
Create 2 INDEX
es, one on Columns1-3 and the other on Columns1 & 4, INCLUDE
all the other columns in the table on both INDEX
es, and then use a UNION ALL
query instead.
Demonstration
As I mention in the comment, if you wanted to do this in a single query, without a UNION
(unlike in
Notice this have 2 table scans, as there are no INCLUDE
s (this is not Column5
's fault, and I demonstrate this later)
Single SELECT
with an INDEX
on Column1
, but Column5
excluded from INCLUDE
:
CREATE INDEX idx_larnu ON dbo.YourTable (Column1) INCLUDE(Column2, Column3, Column4);
GO
SELECT *
FROM dbo.YourTable
WHERE (Column1 = 'A' AND Column2 = 'B' AND Column3 = 'C')
OR (Column1= 'z' AND Column4 = 'X');
GO
Note this too has a table scan, due to the ommission of Column5
. If we add that to the INCLUDE
:
CREATE INDEX idx_larnu_fullinc ON dbo.YourTable (Column1) INCLUDE(Column2, Column3, Column4,Column5);
GO
SELECT *
FROM dbo.YourTable
WHERE (Column1 = 'A' AND Column2 = 'B' AND Column3 = 'C')
OR (Column1= 'z' AND Column4 = 'X');
GO
This unfortunately still results in a scan, but of the index this time not the table. For a larger table or for different values of Column1
, SQL Server may make a different decision.
Now let's just add an INCLUDE
to Tim's indexes:
--Drop mine so that they definitely aren't used
DROP INDEX idx_larnu_fullinc ON dbo.YourTable;
DROP INDEX idx_larnu ON dbo.YourTable;
CREATE INDEX idx1_tim_inc ON dbo.yourTable (Column1, Column2, Column3) INCLUDE (Column4,Column5);
CREATE INDEX idx2_tim_inc ON dbo.yourTable (Column1, Column4) INCLUDE (Column2, Column3,Column5);
GO
SELECT *
FROM dbo.yourTable
WHERE Column1 = 'A'
AND Column2 = 'B'
AND Column3 = 'C'
UNION ALL
SELECT *
FROM dbo.yourTable
WHERE Column1 = 'z'
AND Column4 = 'X';
GO
Hazaar! This is what we want. Also, just to confirm however that Column5
wasn't causing the scans earlier, let's drop those INDEX
es and also Column5
:
DROP INDEX idx1_tim_inc ON dbo.YourTable;
DROP INDEX idx2_tim_inc ON dbo.YourTable;
ALTER TABLE dbo.YourTable DROP COLUMN Column5;
GO
SELECT *
FROM dbo.yourTable
WHERE Column1 = 'A'
AND Column2 = 'B'
AND Column3 = 'C'
UNION ALL
SELECT *
FROM dbo.yourTable
WHERE Column1 = 'z'
AND Column4 = 'X';
GO
So, there you can see that Column5
wasn't the cause the scans
--Clean up
DROP TABLE dbo.YourTable;
CodePudding user response:
OR
conditions make indexing tricky. We can try the following union approach:
SELECT * FROM yourTable WHERE Column1 = 'A' AND Column2 = 'B' AND Column3 = 'C'
UNION ALL
SELECT * FROM yourTable WHERE Column1 = 'z' AND Column4 = 'X';
Then add the following indices to cover the WHERE
clauses from both halves of the union query:
CREATE INDEX idx1 ON yourTable (Column1, Column2, Column3);
CREATE INDEX idx2 ON yourTable (Column1, Column4);