We are doing quite a lot of insert/updates on a table on Azure SQL Server.
Is there any gain of loading some columns first (if update) and checking if they have changed compared to the stored data before doing an UPDATE?
To avoid logging, writing and index updates when the data is identical.
BTW: This is in a stored procedure, and the parameters (not shown here) are the new data.
'''
DECLARE @CurrentCustomerDisplayName nvarchar(50);
DECLARE @CurrentCancelledAt datetimeoffset(7);
DECLARE @CurrentCustomerId bigint;
DECLARE @CurrentTrackingNo nvarchar(50);
DECLARE @CurrentTags nvarchar(2000);
DECLARE @CurrentToCountryCode nchar(2);
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
-- First load the current row data from the table
SELECT TOP(1)
@CurrentCustomerDisplayName = CustomerDisplayName,
@CurrentCancelledAt = CancelledAt,
@CurrentCustomerId = CustomerId,
@CurrentTrackingNo = TrackingNo,
@CurrentTags = Tags,
@CurrentToCountryCode = ToCountryCode
FROM dbo.Orders with (UPDLOCK)
where OrderId = @orderId;
-- Check if exist
if @@ROWCOUNT = 0
BEGIN
-- Do insert
END
ELSE
BEGIN
-- Check if data has changed
IF ((@CurrentCustomerDisplayName IS NULL AND @customerDisplayName IS NULL) OR (@CurrentCustomerDisplayName IS NOT NULL AND @customerDisplayName IS NOT NULL AND @CurrentCustomerDisplayName = @customerDisplayName))
AND ((@CurrentCancelledAt IS NULL AND @cancelledAt IS NULL) OR (@CurrentCancelledAt IS NOT NULL AND @cancelledAt IS NOT NULL AND @CurrentCancelledAt = @cancelledAt))
AND ((@CurrentCustomerId IS NULL AND @customerId IS NULL) OR (@CurrentCustomerId IS NOT NULL AND @customerId IS NOT NULL AND @CurrentCustomerId = @customerId))
AND ((@CurrentTrackingNo IS NULL AND @trackingNo IS NULL) OR (@CurrentTrackingNo IS NOT NULL AND @trackingNo IS NOT NULL AND @CurrentTrackingNo = @trackingNo))
AND ((@CurrentTags IS NULL AND @tags IS NULL) OR (@CurrentTags IS NOT NULL AND @tags IS NOT NULL AND @CurrentTags = @tags))
AND ((@CurrentToCountryCode IS NULL AND @toCountryCode IS NULL) OR (@CurrentToCountryCode IS NOT NULL AND @toCountryCode IS NOT NULL AND @CurrentToCountryCode = @toCountryCode))
BEGIN
-- Do nothing, identical data
END
ELSE
BEGIN
-- Do UPDATE
END
END
COMMIT
'''
CodePudding user response:
There is a performance gain by not updating unnecessary rows, but you're likely going to lose out on this performance by splitting the process up into multiple statements.
With your current code, whatever happens you'll always execute at least two queries on the Orders
table. You can just do the insert and make this your check for if the row exists, that way if the required action is to insert, then you only require one query which will be more efficient than two. e.g.
BEGIN TRANSACTION
INSERT dbo.Orders (CustomerDisplayName, CancelledAt, CustomerId, TrackingNo, Tags, ToCountryCode)
SELECT @CustomerDisplayName, @CancelledAt, @CustomerId, @TrackingNo, @Tags, @CurrentToCountryCode
WHERE NOT EXISTS (SELECT 1 FROM dbo.Orders WITH (UPDLOCK, SERIALIZABLE) WHERE OrderId = @orderId)
IF @@ROWCOUNT = 0
BEGIN
UPDATE dbo.Orders
SET CustomerDisplayName = @CustomerDisplayName,
CancelledAt = @CancelledAt,
CustomerId = @CustomerId,
TrackingNo = @TrackingNo,
Tags= @Tags,
ToCountryCode = @ToCountryCode
WHERE OrderId = @orderId
AND NOT EXISTS
( SELECT CustomerDisplayName, CancelledAt, CustomerId, TrackingNo, Tags, ToCountryCode
INTERSECT
SELECT @CustomerDisplayName, @CancelledAt, @CustomerId, @TrackingNo, @Tags, @CurrentToCountryCode
);
END
COMMIT TRANSACTION;
A couple of references for the above that will provide a bit more reading. For the general INSERT/UPDATE
pattern see Please stop using this UPSERT anti-pattern by Aaron Bertrand. For a bit more on the where clause above using NOT EXISTS ( ... INTERSECT ...)
see the following article from Paul White: Undocumented Query Plans: Equality Comparisons. It is effectively the SQL-Server of the ANSI standard IS DISTINCT FROM.
CodePudding user response:
Additional info: Most of the time there will be an UPDATE with no data changed because this is linked to a webhook which fire many times when the data we use is not changed.