Home > Back-end >  Counting duplicate, triplicate, etc. orders that can have multiple lines per order
Counting duplicate, triplicate, etc. orders that can have multiple lines per order

Time:12-21

We have a table of orders that can have multiple lines per order. There are many occasions that duplicate, triplicate, etc. orders can be submitted.

I am using Microsoft SQL Server Management Studio for querying the table. I am no expert in T-SQL and I am having trouble coming up with a query to show the count for the number of duplicate/triplicate orders where BotReady = 'Y'.

Here is an example of the way our orders are stored in our table (dbo.OrderTable). The PurchaseOrderNumber is NOT unique so therefore we use PurchaseOrderNumber and AccountNumber as a super key.

BotReady PurchaseOrderNumber AccountNumber LineNumber Quantity UnitPrice RequestedUOM Style Color ProductType
Y 217344 0026985 1 30 5.40 EA hwwtg 11028 H
Y 217344 0026985 2 10 7.99 SQF lr105 00114 C
Y 217344 0026985 1 30 5.40 EA hwwtg 11028 H
Y 217344 0026985 2 10 7.99 SQF lr105 00114 C
N 217344 0026985 1 30 5.40 EA hwwtg 11028 H
N 217344 0026985 2 10 7.99 SQF lr105 00114 C
Y BD624919 0210995 1 158 12.80 SHT 551MR 00122 H
Y BD624919 0210995 1 158 12.80 SHT 551MR 00122 H
Y BD624919 0210995 1 158 12.80 SHT 551MR 00122 H
Y BD624920 0210995 1 12 3.80 SQY 211MX 00132 H
N BD624920 0210995 2 12 0.99 EA HA258 01088 H
Y 1269407 1911403 1 55 12.99 BOX HSMPR 00150 H
Y 1269407 1911403 2 2 200.00 ROL TGJ88 01088 C
Y 1269407 1911403 3 1 10.00 EA 00000 00001 R
Y 1269407 1911403 1 55 12.99 BOX HSMPR 00150 H
Y 1269407 1911403 2 2 200.00 ROL TGJ88 01088 C
Y 1269407 1911403 3 1 10.00 EA 00000 00001 R

The SQL should return 4 since there is 1 duplicate order for PurchaseOrderNumber = 217344, 2 duplicates for BD624919, and 1 duplicate for 1269407. So since each order can contain multiple lines per order that can affect how duplicates are counted.

For example, since PurchaseOrderNumber 1269407 has 3 lines in the order that appear twice it still only counts as 1 duplicate order (even if 2 of the 3 lines were duplicated) since they are all apart of the same order.

CodePudding user response:

COUNT and MAX in sub-queries would do the trick:

If you want only number of duplicates to be displayed as 4, the query would be:

select sum(tot) AS FINAL 
from
    (
    select PurchaseOrderNumber, max(cnt) as tot
    from
    (
        select 
             BotReady, 
             PurchaseOrderNumber,
             AccountNumber,
             Linenumber,
             count(*)-1 as cnt
        from t
        where botready='Y'
        group by 
             BotReady, 
             PurchaseOrderNumber,
             AccountNumber,
             Linenumber
             having count(*)>1
    ) as q
group by PurchaseOrderNumber
) as h

If you want which PurchaseOrderNumber had how many duplicates, the query would be:

select PurchaseOrderNumber, max(cnt) as tot
from
(
    select 
         BotReady, 
         PurchaseOrderNumber,
         AccountNumber,
         Linenumber,
         count(*)-1 as cnt
    from t
    where botready='Y'
    group by 
         BotReady, 
         PurchaseOrderNumber,
         AccountNumber,
         Linenumber
         having count(*)>1
) as q
group by PurchaseOrderNumber

Not much different, just final SUM is omitted.

Above queries would handle even if some of linenumbers are not repeated. Try that in fiddle link below.

see DEMO in db<>fiddle

CodePudding user response:

If i undertand well you want to detect duplicate in your table, the same data on all columns, this query can help you :

SELECT BotReady
    ,PurchaseOrderNumber
    ,AccountNumber
    ,LineNumber
    ,Quantity
    ,UnitPrice
    ,RequestedUOM
    ,Style
    ,Color
    ,ProductType
    ,COUNT(1)
FROM dbo.OrderTable
WHERE BotReady = 'Y'
GROUP BY BotReady
   ,PurchaseOrderNumber
   ,AccountNumber
   ,LineNumber
   ,Quantity
   ,UnitPrice
   ,RequestedUOM
   ,Style
   ,Color
   ,ProductType
HAVING COUNT(1) > 1
  • Related