I have this SQL query that inserts records from one table to another without duplicates.
It works fine, if I call this SQL query from one instance of my application. But in production, the application is horizontally scaled, having more than one instance of application, each calling below query simultaneously at the same time. That is causing duplicate records to me. Is there any way to fix this query, so it allows simultaneous hits?
INSERT INTO table1 (col1, col2)
SELECT DISTINCT TOP 10
t2.col1,
t2.col2
FROM
table2 t2
LEFT JOIN
table1 t1 ON t2.col1 = t1.col1
AND t2.col2 = t1.col2
WHERE
t1.col1 IS NULL
CodePudding user response:
The corrective action here depends on the behavior you want. If you intend to allow for just a single horizontal instance of your application to execute this query, then you need to create a critical section, into which one instance is allowed to enter. Since you are already using SQL Server, you could implement by forcing each instance to get a lock on a certain table. Only the instance which gets the lock will execute the query, and the others will drop off.
If, on the other hand, you really want each instance to execute the query, then you should use a serializable transaction. Using a serializable transaction will ensure that only one instance can do the insert on the table at a given time. It would not be possible for two or more instances to interleave and execute the same insert.