Is there a way around the unique index? I want my query to continue to run even after it hits the duplicate key value. The information that is a duplicate, I do not want to insert. However, there could be data after the duplicate that is not a duplicate. The unique indexed column is WebQuoteNo
. I have previously pulled quotes from SalesOrderData and now I want to pull any NEW quotes and insert into OrderHeader. I tried the EXCEPT statement but got the same error. "Cannot insert duplicate key"
INSERT INTO OrderHeader (WebQuoteNo, Date, OrderType, ListTotal, DiscountTotal, WebEmail)
SELECT
WebQuoteNo, Date, OrderType, ListTotal, discount AS DiscountTotal, WebEmail
FROM
SalesOrderData
Here are the two tables. The Quotes from 7/26/2022 need to be add to the OrderHeader table.
This is what my table currently look like
CodePudding user response:
There are three things that can cause a unique constraint/index violation:
- You are trying to insert a single row where the values defined for the columns in the constraint are the same in a row that already exists in the target.
- You are trying to insert multiple rows that violate the constraint even though a row doesn't already exist in the target.
- Both.
Let's say you have a table like:
CREATE TABLE dbo.OrderHeader
(
WebQuoteNo varchar(32) NOT NULL UNIQUE,
Date date,
OrderType char(1),
ListTotal decimal(12,2),
Discount decimal(12,2)
);
The way you typically avoid 1. is:
INSERT dbo.OrderHeader
(
WebQuoteNo, Date, OrderType, ListTotal, Discount
)
SELECT
WebQuoteNo, Date, OrderType, ListTotal, Discount
FROM dbo.SalesOrderData AS s
WHERE NOT EXISTS
(
SELECT 1
FROM dbo.OrderHeader AS h WITH (UPDLOCK, SERIALIZABLE)
WHERE h.WebQuoteNo = s.WebQuoteNo
);
- Example db<>fiddle
If you also need to guard against 2., you need to figure out which duplicate you want to keep. You can use something like this:
;WITH src AS
(
SELECT WebQuoteNo, /* other columns, */
rn = ROW_NUMBER() OVER (PARTITION BY WebQuoteNo
ORDER BY <some other deterministic criteria, maybe DATE?>
FROM dbo.SalesOrderData AS s
WHERE NOT EXISTS
(
SELECT 1
FROM dbo.OrderHeader AS h WITH (UPDLOCK, SERIALIZABLE)
WHERE h.WebQuoteNo = s.WebQuoteNo
)
-- and other criteria
)
INSERT dbo.OrderHeader(WebQuoteNo /* , other columns */)
SELECT WebQuoteNo /* , other columns */
FROM src
WHERE rn = 1;
Before you even think about using some suggestion that flings MERGE
over the wall as a solution to an INSERT
problem without also talking about concurrency and isolation, please read all of these posts and to understand the purpose of the locking hints applied here see this post about upserts.
CodePudding user response:
Assumption 1: You have a process that PULL sales data, and that data cause violation of PK for sales header record
Assumption 2: Sales data most commonly put a record per item, since when you get a batch of records, those may contain duplicate of header
Assumption 3: Sales often do corrections and may send same record with new price (total) again. And again try to break constraint.
Assumption 4: Since you pulling batches this is nightly process and most likely during run users are not using system
Option 1: Will fix Assumption 2, use DISTINCT for inserted rows.
SELECT DISTINCT * FROM (SELECT
WebQuoteNo, Date, OrderType, ListTotal, discount AS DiscountTotal, WebEmail
FROM
SalesOrderData ) as a
Option 2: If you suffer from sales data update Assumption 3, you most likely can use MERGE, if Assumption 4 is correct.
The short how to is here.
MERGE has following caveats that you should fully accept before trying:
Most likely you will need to fully lock table during update process, so if it is impacting users and that is unacceptable "MERGE" is not your option.
If you cannot fully lock table you will need to follow suggestion from Aaron Bertrand here https://sqlblog.org/merge and be super cautios.
In General MERGE work well for nightly process, during non-business hours.