Home > Net >  Performance gain on checking changed data before UPDATE on SQL Server
Performance gain on checking changed data before UPDATE on SQL Server

Time:07-04

We are doing quite a lot of insert/updates on a table on Azure SQL Server.

Is there any gain of loading some columns first (if update) and checking if they have changed compared to the stored data before doing an UPDATE?

To avoid logging, writing and index updates when the data is identical.

BTW: This is in a stored procedure, and the parameters (not shown here) are the new data.

'''

DECLARE @CurrentCustomerDisplayName nvarchar(50);
DECLARE @CurrentCancelledAt datetimeoffset(7);
DECLARE @CurrentCustomerId bigint;
DECLARE @CurrentTrackingNo nvarchar(50);
DECLARE @CurrentTags nvarchar(2000);
DECLARE @CurrentToCountryCode nchar(2);


SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

BEGIN TRANSACTION
    -- First load the current row data from the table
    SELECT TOP(1) 
        @CurrentCustomerDisplayName = CustomerDisplayName,
        @CurrentCancelledAt = CancelledAt,
        @CurrentCustomerId = CustomerId,
        @CurrentTrackingNo = TrackingNo,
        @CurrentTags = Tags,
        @CurrentToCountryCode = ToCountryCode
    FROM dbo.Orders with (UPDLOCK)
    where  OrderId = @orderId;

    -- Check if exist
    if @@ROWCOUNT = 0
    BEGIN
        -- Do insert
    END
    ELSE
    BEGIN
        -- Check if data has changed
        IF ((@CurrentCustomerDisplayName IS NULL AND @customerDisplayName IS NULL) OR (@CurrentCustomerDisplayName IS NOT NULL AND @customerDisplayName IS NOT NULL AND @CurrentCustomerDisplayName = @customerDisplayName))
            AND ((@CurrentCancelledAt IS NULL AND @cancelledAt IS NULL) OR (@CurrentCancelledAt IS NOT NULL AND @cancelledAt IS NOT NULL AND @CurrentCancelledAt = @cancelledAt))
            AND ((@CurrentCustomerId IS NULL AND @customerId IS NULL) OR (@CurrentCustomerId IS NOT NULL AND @customerId IS NOT NULL AND @CurrentCustomerId = @customerId))
            AND ((@CurrentTrackingNo IS NULL AND @trackingNo IS NULL) OR (@CurrentTrackingNo IS NOT NULL AND @trackingNo IS NOT NULL AND @CurrentTrackingNo = @trackingNo))
            AND ((@CurrentTags IS NULL AND @tags IS NULL) OR (@CurrentTags IS NOT NULL AND @tags IS NOT NULL AND @CurrentTags = @tags))
            AND ((@CurrentToCountryCode IS NULL AND @toCountryCode IS NULL) OR (@CurrentToCountryCode IS NOT NULL AND @toCountryCode IS NOT NULL AND @CurrentToCountryCode = @toCountryCode))
        BEGIN
            -- Do nothing, identical data
        END
        ELSE
        BEGIN
            -- Do UPDATE
        END
    END
COMMIT

'''

CodePudding user response:

There is a performance gain by not updating unnecessary rows, but you're likely going to lose out on this performance by splitting the process up into multiple statements.

With your current code, whatever happens you'll always execute at least two queries on the Orders table. You can just do the insert and make this your check for if the row exists, that way if the required action is to insert, then you only require one query which will be more efficient than two. e.g.

BEGIN TRANSACTION

INSERT dbo.Orders (CustomerDisplayName, CancelledAt, CustomerId, TrackingNo, Tags, ToCountryCode)
SELECT  @CustomerDisplayName, @CancelledAt, @CustomerId, @TrackingNo, @Tags, @CurrentToCountryCode
WHERE   NOT EXISTS (SELECT 1 FROM dbo.Orders WITH (UPDLOCK, SERIALIZABLE) WHERE OrderId = @orderId)

IF @@ROWCOUNT = 0
BEGIN
    UPDATE  dbo.Orders
    SET     CustomerDisplayName = @CustomerDisplayName,
            CancelledAt = @CancelledAt,
            CustomerId = @CustomerId,
            TrackingNo = @TrackingNo,
            Tags= @Tags,
            ToCountryCode = @ToCountryCode
    WHERE   OrderId = @orderId
    AND     NOT EXISTS 
            (   SELECT  CustomerDisplayName, CancelledAt, CustomerId, TrackingNo, Tags, ToCountryCode
                INTERSECT
                SELECT  @CustomerDisplayName, @CancelledAt, @CustomerId, @TrackingNo, @Tags, @CurrentToCountryCode
            );
END

COMMIT TRANSACTION;

A couple of references for the above that will provide a bit more reading. For the general INSERT/UPDATE pattern see Please stop using this UPSERT anti-pattern by Aaron Bertrand. For a bit more on the where clause above using NOT EXISTS ( ... INTERSECT ...) see the following article from Paul White: Undocumented Query Plans: Equality Comparisons. It is effectively the SQL-Server of the ANSI standard IS DISTINCT FROM.

CodePudding user response:

Additional info: Most of the time there will be an UPDATE with no data changed because this is linked to a webhook which fire many times when the data we use is not changed.

  • Related