Should I use sp_getapplock to prevent multiple instances of a stored procedure that conditionally in-CodePudding

Hear me out! I know this use case sounds suspect, but...

I have a stored procedure which checks a table (effectively a cache) for data for a given requested ID. If it doesn't find any data for that ID, or deems it out of date, it executes a second stored procedure which will pull data from a separate database (using dynamic SQL, source DB name is based on the requested ID) and insert it into the local table. It then selects from this table.

If the data is in the table, everything returns quickly (ms), but if it needs to be brought in from the other database, it takes about 10 seconds. We're seeing race conditions where two concurrent instances check the local cache, see something is missing, and queue up sequential ingestions of the remote data into the cache. To avoid double-insertion, the cache-populating procedure will clear whatever is already there for this id, but this just means the first instance of the procedure can selecting no rows because the second instance deleted the just-inserted records before re-inserting them itself.

I think I want to put a lock around the entire procedure (checking the cache, potentially populating the cache, selecting from the cache) - although I'm open to other solutions. I think the overall caching approach has to remain on-demand though, the remote databases come and go by the hundreds, and we only want to cache the ones actually requested by reporting as-needed.

BEGIN TRANSACTION;
BEGIN TRY
    -- Take out a lock intended to prevent anyone else modifying the cache while we're reading and potentially modifying it
    EXEC sp_getapplock @Resource = '[private].[cache_entries]', @LockOwner='Transaction', @LockMode = 'Exclusive', @LockTimeout = 120000; 

    -- Invoke a stored procedure that ingests any required data that is not already cached
    EXEC [private].populate_cache @required_dbs

    -- CALCULATIONS
    -- ... SELECT FROM [private].cache_entries
    
    COMMIT TRANSACTION; -- Free the lock
END TRY
BEGIN CATCH --Ensure we release our lock on failure
    ROLLBACK TRANSACTION;
    THROW
END CATCH;

CodePudding user response：

The alternative to sp_getapplock is to use locking hints with your transaction. Both are reasonable approaches. Locking hints can be complex, but they protect the target object itself rather than a single code path. So sometimes necessary. sp_getapplock is simple (with Transaction as owner), and reliable.

CodePudding user response：

You can do this without sp_getapplock, which tends to inhibit concurrency a lot.

The way to do this is to continue do your checks within a transaction, but to apply a HOLDLOCK hint, as well as a UPDLOCK hint.

HOLDLOCK aka the SERIALIZABLE isolation level, will place a lock not only on the ID you specify, but even on the absence of such data, in other words it will prevent anyone else inserting into that ID.

You must use both these hints, as well as have an index that matches that SELECT, otherwise you could run into major blocking and deadlocking problems due to full table scans.

Also, you don't need a CATCH and ROLLBACK. Just use SET XACT_ABORT ON; which ensures a rollback in any event of an error.

SET XACT_ABORT ON;   -- always have this set

BEGIN TRANSACTION;

DECLARE @SomeData nvarchar(100) = (
    SELECT ce.SomeColumn
    FROM [private].cache_entries ce WITH (HOLDLOCK, UPDLOCK)
    WHERE ce.SomeCondition = 1
);

IF @SomeData IS NULL
BEGIN
    -- Invoke a stored procedure that ingests any required data that is not already cached
    EXEC [private].populate_cache @required_dbs
END

-- CALCULATIONS
-- ... SELECT FROM [private].cache_entries

COMMIT TRANSACTION; -- Free the lock