SQL Server insert into with an indexed column-CodePudding

Is there a way around the unique index? I want my query to continue to run even after it hits the duplicate key value. The information that is a duplicate, I do not want to insert. However, there could be data after the duplicate that is not a duplicate. The unique indexed column is WebQuoteNo. I have previously pulled quotes from SalesOrderData and now I want to pull any NEW quotes and insert into OrderHeader. I tried the EXCEPT statement but got the same error. "Cannot insert duplicate key"

INSERT INTO OrderHeader (WebQuoteNo, Date, OrderType, ListTotal, DiscountTotal, WebEmail)
    SELECT        
        WebQuoteNo, Date, OrderType, ListTotal, discount AS DiscountTotal, WebEmail
    FROM          
        SalesOrderData

Here are the two tables. The Quotes from 7/26/2022 need to be add to the OrderHeader table.

This is what my table currently look like

CodePudding user response：

There are three things that can cause a unique constraint/index violation:

You are trying to insert a single row where the values defined for the columns in the constraint are the same in a row that already exists in the target.
You are trying to insert multiple rows that violate the constraint even though a row doesn't already exist in the target.
Both.

Let's say you have a table like:

CREATE TABLE dbo.OrderHeader
(
  WebQuoteNo varchar(32) NOT NULL UNIQUE,
  Date       date,
  OrderType  char(1),
  ListTotal  decimal(12,2),
  Discount   decimal(12,2)
);

The way you typically avoid 1. is:

INSERT dbo.OrderHeader
(
  WebQuoteNo, Date, OrderType, ListTotal, Discount
)
SELECT 
  WebQuoteNo, Date, OrderType, ListTotal, Discount
FROM dbo.SalesOrderData AS s
WHERE NOT EXISTS
(
  SELECT 1 
    FROM dbo.OrderHeader AS h WITH (UPDLOCK, SERIALIZABLE)
    WHERE h.WebQuoteNo = s.WebQuoteNo
);

Example db<>fiddle

If you also need to guard against 2., you need to figure out which duplicate you want to keep. You can use something like this:

;WITH src AS
(
  SELECT WebQuoteNo, /* other columns, */
    rn = ROW_NUMBER() OVER (PARTITION BY WebQuoteNo 
      ORDER BY <some other deterministic criteria, maybe DATE?>
  FROM dbo.SalesOrderData AS s
  WHERE NOT EXISTS
  (
    SELECT 1
      FROM dbo.OrderHeader AS h WITH (UPDLOCK, SERIALIZABLE)
      WHERE h.WebQuoteNo = s.WebQuoteNo
  )
  -- and other criteria
)
INSERT dbo.OrderHeader(WebQuoteNo /* , other columns */)
  SELECT WebQuoteNo /* , other columns */
  FROM src 
  WHERE rn = 1;

Before you even think about using some suggestion that flings MERGE over the wall as a solution to an INSERT problem without also talking about concurrency and isolation, please read all of these posts and to understand the purpose of the locking hints applied here see this post about upserts.

CodePudding user response：

Assumption 1: You have a process that PULL sales data, and that data cause violation of PK for sales header record

Assumption 2: Sales data most commonly put a record per item, since when you get a batch of records, those may contain duplicate of header

Assumption 3: Sales often do corrections and may send same record with new price (total) again. And again try to break constraint.

Assumption 4: Since you pulling batches this is nightly process and most likely during run users are not using system

Option 1: Will fix Assumption 2, use DISTINCT for inserted rows.

 SELECT DISTINCT * FROM (SELECT       
        WebQuoteNo, Date, OrderType, ListTotal, discount AS DiscountTotal, WebEmail
    FROM          
        SalesOrderData ) as a

Option 2: If you suffer from sales data update Assumption 3, you most likely can use MERGE, if Assumption 4 is correct.

The short how to is here.

https://www.mssqltips.com/sqlservertip/1704/using-merge-in-sql-server-to-insert-update-and-delete-at-the-same-time/

MERGE has following caveats that you should fully accept before trying:

Most likely you will need to fully lock table during update process, so if it is impacting users and that is unacceptable "MERGE" is not your option.
If you cannot fully lock table you will need to follow suggestion from Aaron Bertrand here https://sqlblog.org/merge and be super cautios.

In General MERGE work well for nightly process, during non-business hours.