Home > other >  How to insert new rows and mark existing rows with no update and updated rows in SQL Server using me
How to insert new rows and mark existing rows with no update and updated rows in SQL Server using me

Time:11-01

I need to create a stored procedure in SQL Server and implement an upsert such that it will move data from a staging table (a.k.a Source) to final table (a.k.a target) and mark rows which are new, updated, not updated or deleted every time a new batch of data comes in. I am using merge as explained enter image description here

Notice row number 4 and 5. There has been no change in the input row data for Anup still I get the [is_deleted] column as "Updated". I want it to be something like "Existing" or "No change".

Please help in making this possible. This upsert logic is a part of a big pipeline and we need the rows that are updated, new, not updated or deleted in the new file. How do I achieve this?

CodePudding user response:

You can add a case statement to the is_deleted field in the update to check whether anything has changed. Something like this:

MERGE [dbo].[employee] as Target
USING [dbo].[employee_stagging] as Source
ON Source.[first_name] = Target.[first_name] and
   Source.[last_name] = Target.[last_name] and
   Source.[dob] = Target.[dob]

WHEN MATCHED
THEN
    UPDATE 
    SET Target.[salary] = Source.[salary],
        Target.[current_address] = Source.[current_address],
        Target.[is_deleted] = CASE WHEN Source.salary = Target.salary
                                         AND Source.current_address = Target.current_address THEN 'No change'
                                   ELSE 'Updated'
                               END,
        Target.[processed_date] = @current_time

WHEN NOT MATCHED BY Target 
THEN
    INSERT ([first_name],
            [last_name],
            [dob],
            [salary], 
            [current_address], 
            [is_deleted],
            [processed_date])

    VALUES (Source.[first_name],
            Source.[last_name],
            Source.[dob],
            Source.[salary], 
            Source.[current_address], 
            'New',
            @current_time
            );

So if the update fields are identical to the staged data, it will update is_deleted to "No change", whereas if the update fields have changed it will update is_deleted to "Updated".

NOTE: This code is assuming salary and current_address are non-nullable fields (table definition for field is followed by NOT NULL). If there ARE nullable, then you should handle the null values by replacing the CASE statement with the following:

CASE WHEN (Source.salary = Target.salary
        OR Source.salary IS NULL
            AND Target.salary IS NULL)
    AND (Source.current_address = Target.current_address
        OR Source.current_address IS NULL
            AND Target.current_address IS NULL) THEN 'No change'
    ELSE 'Updated'
END

(Documentation here)

  • Related