flag matching pos/neg pairs in a table with duplicates

Time:03-03

Any advice on how I can delete or flag corresponding pairs of rows from a table that has duplicate rows? (Yes, it’s ugly; that’s the data I’m given.)

Consider a services-charge table: < customer #, date, item #, quantity, charge >

My data has these unfortunate features:

  • No unique id. Duplicate rows. (Two identical charges on the same day result in two identical rows. We care about count() in addition to sum(), so two quantity-one rows are not the same as one row of quantity-two.)
  • If a charge is cancelled, the original row is left unchanged; a similar row gets appended, with negative quantity and charge.

For example, we could have the following rows, reflecting a service performed twice in a day for the same customer, and where the original entries were backed out and re-entered. (Perhaps there was a problem in fields we don’t see.)

43, 3/1/2022, 17, 1, $5
43, 3/1/2022, 17, 1, $5     two charges
43, 3/1/2022, 17, -1, -$5
43, 3/1/2022, 17, -1, -$5   both charges reversed
43, 3/1/2022, 17, 1, $5
43, 3/1/2022, 17, 1, $5     both charges entered again

Like I said, the data is ugly! I want to delete (or flag) the positive-negative pairs. (In the above sample, delete the first 4 rows. Or the last 4, since they’re interchangeable.)
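To pin down the intended result, here is a minimal Python sketch of the pairing rule as I read it: each negative row cancels exactly one identical positive row, and leftover rows stay unflagged. The function name `flag_cancelled_pairs` and the tuple layout are made up for illustration; this is not the VBA implementation mentioned below, and charges are plain integers for simplicity.

```python
from collections import Counter

# Sample rows from the question: (customer, date, item, quantity, charge)
rows = [
    (43, "3/1/2022", 17, 1, 5),
    (43, "3/1/2022", 17, 1, 5),
    (43, "3/1/2022", 17, -1, -5),
    (43, "3/1/2022", 17, -1, -5),
    (43, "3/1/2022", 17, 1, 5),
    (43, "3/1/2022", 17, 1, 5),
]

def flag_cancelled_pairs(rows):
    """Return one boolean per row: True if the row is part of a pos/neg pair."""
    positives = Counter(r for r in rows if r[3] > 0)
    # Store each reversal under the key of the positive row it would cancel.
    reversals = Counter((c, d, i, -q, -ch) for c, d, i, q, ch in rows if q < 0)
    # Number of complete pairs available for each distinct positive row.
    pairs = {key: min(n, reversals[key]) for key, n in positives.items()}
    pos_left, neg_left = dict(pairs), dict(pairs)
    flags = []
    for c, d, i, q, ch in rows:
        key = (c, d, i, abs(q), ch if q > 0 else -ch)
        budget = pos_left if q > 0 else neg_left
        matched = q != 0 and budget.get(key, 0) > 0
        if matched:
            budget[key] -= 1
        flags.append(matched)
    return flags

flags = flag_cancelled_pairs(rows)
print(flags)  # [True, True, True, True, False, False]
```

The first four rows come out flagged and the last two survive, which matches the "delete the first 4 rows" outcome described above.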

I’ve implemented this in Excel VBA, but would much rather have a non-VBA SQL solution.

Is this do-able? (It's easy to compute a list of all the distinct matching pairs. Adding an ID field to the table, probably one could delete a single positive/negative pair for each entry in that list. But even so, you'd need to run the query twice for the above example, where two pairs match the same entry. And I'm not even clear how to get that far.)

We’re running Microsoft Access in Office 365.

CodePudding user response:

Maybe you can just send the valid data to another table (copy the structure of the old one; adding a PK would be a good idea). Something like:


INSERT INTO new_table
SELECT customer, date, item, Sum(quantity) AS q, Sum(charge) AS c
FROM old_table
GROUP BY customer, date, item;

CodePudding user response:

You are right, Mugwort. I’ve thought about your problem again, and I propose an alternative solution. It needs an auxiliary table filled with the integers from 1 up to as many as your real data requires (at least the largest number of identical charges remaining in any one customer/date/item group).

CREATE TABLE rows(n int NULL)  ON [PRIMARY]
insert into rows values(1)
insert into rows values(2)
insert into rows values(3)
insert into rows values(4)
insert into rows values(5)
insert into rows values(6)
insert into rows values(7)
insert into rows values(8)
insert into rows values(9)
insert into rows values(10)

I’ve created a dbfiddle post with a full example. It includes some extra data, just to validate the solution. The sample is in SQL Server, but it works just as well in MS Access as long as the table and field names are the same (in MS Access you have to create a query).

SELECT q.customer, q.Date, q.item, [SumOfquantity]/q.n AS quantity, [SumOfcharge]/q.n AS charge
FROM 
    (SELECT data.customer, data.date, data.item, Sum(data.quantity) AS SumOfquantity, Sum(data.charge) AS SumOfcharge, Sum(IIf([quantity]<0,-1,1)) AS [n]
    FROM data
    GROUP BY data.customer, data.date, data.item)  AS q
    ,[rows]
WHERE rows.n <= q.n
ORDER BY q.customer, q.Date, q.item;

The solution is a bit tricky, but I think it should work with your data.
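To see why it works, here is a rough sketch that runs the same idea against the six sample rows from the question. It makes some assumptions that are not part of the answer above: it uses Python's built-in sqlite3 with an in-memory database instead of Access or SQL Server, replaces `IIf` with a portable `CASE` expression, quotes the `rows` table name, and stores charges as whole numbers. Note that dividing the sums by `n` only reproduces the original rows when every row in a customer/date/item group has the same per-row quantity and charge.

```python
import sqlite3

# Sample rows from the question: (customer, date, item, quantity, charge)
sample = [
    (43, "3/1/2022", 17, 1, 5),
    (43, "3/1/2022", 17, 1, 5),
    (43, "3/1/2022", 17, -1, -5),
    (43, "3/1/2022", 17, -1, -5),
    (43, "3/1/2022", 17, 1, 5),
    (43, "3/1/2022", 17, 1, 5),
]

# Same shape as the query above; CASE stands in for IIf (assumption: SQLite dialect).
query = """
SELECT q.customer, q.date, q.item,
       q.sum_quantity / q.n AS quantity,
       q.sum_charge / q.n AS charge
FROM (SELECT customer, date, item,
             SUM(quantity) AS sum_quantity,
             SUM(charge) AS sum_charge,
             SUM(CASE WHEN quantity < 0 THEN -1 ELSE 1 END) AS n
      FROM data
      GROUP BY customer, date, item) AS q,
     "rows" AS r
WHERE r.n <= q.n
ORDER BY q.customer, q.date, q.item
"""

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE data (customer INT, date TEXT, item INT, quantity INT, charge INT)")
cur.execute('CREATE TABLE "rows" (n INT)')
# Auxiliary numbers table holding 1..10, as in the answer.
cur.executemany('INSERT INTO "rows" VALUES (?)', [(n,) for n in range(1, 11)])
cur.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?)", sample)
result = cur.execute(query).fetchall()
conn.close()

print(result)  # [(43, '3/1/2022', 17, 1, 5), (43, '3/1/2022', 17, 1, 5)]
```

For the sample group, `n` is 4 positives minus 2 reversals, so 2; the sums are 2 and 10; dividing by `n` gives back a single charge of quantity 1 and $5, and the join against the numbers table repeats that row twice.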
