My update query checks whether the column “Houses” is null in any of the rows in my source table by joining the id between the target & source table (check query one). The column Houses being null in this case indicates that the row has expired; thus, I need to expire the row id in my target table and set the expired date. The query works fine, but I was wondering if it can be improved; I'm new to SQL, so I don't know if using two joins is the best way to accomplish the result I want. My update query will later be used against millions of rows. No columns has been indexed yet.
Query:
(Query one)
Update t
set valid_date = GETDATE()
From Target T
JOIN SOURCE S ON S.ID = T.ID
LEFT JOIN SOURCE S2 ON S2.Houses = t.Houses
WHERE S2.Houses is null
Target:
ID | namn | middlename | Houses | date |
---|---|---|---|---|
1 | demo | hello | 2 | null |
2 | demo2 | test | 4 | null |
3 | demo3 | test1 | 5 | null |
Source:
ID | namn | middlename | Houses |
---|---|---|---|
1 | demo | hello | null |
3 | demo | world | null |
Expected output after running update query :
ID | namn | middlename | Houses | date |
---|---|---|---|---|
1 | demo | hello | 2 | 2022-12-06 |
2 | demo2 | test | 4 | null |
3 | demo3 | test1 | 5 | 2022-12-06 |
CodePudding user response:
I would recommend exists
:
update t
set valid_date = getdate()
from target t
where exists (select 1 from source s where s.id = t.id and s.houses is null)
Note that your original query does not exactly do what you want. It cannot distinguish source rows that do not exist from source rows that exist and whose houses
column is null
. In your example, it would update row 2
, which is not what you seem to want. You would need an INNER JOIN
instead of the LEFT JOIN
.
With EXISTS
, you want an index on source(id, houses)
so the subquery can execute efficiently against target
rows. This index is probably worthwhile for the the JOIN
as well.
CodePudding user response:
I don't see why you'd need to join on the column houses
at all.
Find all rows in source
that have value NULL in the column houses
.
Then update all rows in target
that have the ID
s of the source rows.
I prefer to write these kind of complex updates using CTEs. It looks more readable to me.
WITH
CTE
AS
(
SELECT
Target.ID
,Target.Date
FROM
Source
INNER JOIN Target ON Target.ID = Source.ID
WHERE Source.Houses IS NULL
)
UPDATE CTE
SET Date = GETDATE();
To efficiently find rows in source
that have value NULL in the column houses
you should create an index, something like this:
CREATE INDEX IX_Houses ON Source
(
Houses
);
I assume that ID
is a primary key with a clustered unique index, so ID
would be included in the IX_Houses
index implicitly.