Home > Blockchain >  mysql: Search for records where two columns match another record and specify conditions on both reco
mysql: Search for records where two columns match another record and specify conditions on both reco

Time:03-11

I have a recurring_bill table which describes a subscription transaction. I have some duplicates that have the same customerId and chargeId. I want to identify these duplicate records. I also want to make sure IsDeleted is false for both records and the LastDay is in the future or NULL (so its an ongoing charge). My current query shows records that do not have a duplicate. Please help me correct my query.

SELECT * FROM recurring_bills b1
WHERE EXISTS 
  (SELECT * FROM recurring_bill b2 WHERE b1.customerId = b2.customerId
    AND b1.chargeId = b2.chargeId
    AND (b2.LastDay > '2022-03-10' OR b2.LastDay IS NULL)
    AND b2.IsDeleted = 0) 
  AND (b1.LastDay > '2022-03-10' OR b1.LastDay IS NULL) AND b1.IsDeleted = 0;

Lets say this is the input

customerId    |    chargeId       |    LastDay    |    IsDeleted
1             |    charge1        |    NULL       |        0
1             |    charge1        |    05-23-2022 |        0 
2             |    charge2        |    05-23-2022 |        0
2             |    charge2        |    05-23-2021 |        0
3             |    charge3        |    NULL       |        1
3             |    charge3        |    NULL       |        0

The correct output would be

customerId    |    chargeId       |    LastDay    |    IsDeleted
1             |    charge1        |    NULL       |        0

MySQL version is 5.5.59

CodePudding user response:

Try using EXISTS to identify records with the same customerId, chargeId, etc.. but having a different unique record Id. (I didn't know the name of your column, so used "TheUniqueId" for the example).

Note, when mixing AND/OR operators you must use parentheses to ensure expressions are evaluated in the expected order. Otherwise, the query may return the wrong results.

That said, wouldn't you want to see all of the "duplicates", so you could take action on them, if needed? If so, try:

CREATE TABLE recurring_bills
    (`theUniqueId` int auto_increment primary key
     , `customerId` int
     , `chargeId` varchar(7)
     , `LastDay` date
     , `IsDeleted` int
)
;
    
INSERT INTO recurring_bills
    (`customerId`, `chargeId`, `LastDay`, `IsDeleted`)
VALUES
    (1, 'charge1', NULL, 0),
    (1, 'charge1', '2022-05-23', 0),
    (2, 'charge2', '2022-05-23', 0),
    (2, 'charge2', '2021-05-23', 0),
    (3, 'charge3', NULL, 1),
    (3, 'charge3', NULL, 0)
;
-- Show all duplicates
SELECT * 
FROM   recurring_bills dupe
WHERE dupe.IsDeleted = 0
AND ( dupe.LastDay > '2022-03-10' OR
      dupe.LastDay IS NULL 
    )
AND EXISTS (
    SELECT NULL
    FROM   recurring_bills b
    WHERE b.customerId = dupe.customerId
    AND   b.chargeId   = dupe.chargeId 
    AND   b.IsDeleted  = dupe.IsDeleted 
    AND   b.theUniqueId <> dupe.theUniqueId
    AND ( b.LastDay > '2022-03-10' OR 
          b.LastDay IS NULL 
        )
)   

Results:

theUniqueId | customerId | chargeId | LastDay    | IsDeleted
----------: | ---------: | :------- | :--------- | --------:
          1 |          1 | charge1  | null       |         0
          2 |          1 | charge1  | 2022-05-23 |         0

If for some reason you really want a single record per chargeId customerId, add a GROUP BY

-- Show single record per dupe combination
SELECT * 
FROM   recurring_bills dupe
WHERE dupe.IsDeleted = 0
AND ( dupe.LastDay > '2022-03-10' OR
      dupe.LastDay IS NULL 
    )
AND EXISTS (
    SELECT NULL
    FROM   recurring_bills b
    WHERE b.customerId = dupe.customerId
    AND   b.chargeId   = dupe.chargeId 
    AND   b.IsDeleted  = dupe.IsDeleted 
    AND   b.theUniqueId <> dupe.theUniqueId
    AND ( b.LastDay > '2022-03-10' OR 
          b.LastDay IS NULL 
        )
)   
 
GROUP BY dupe.customerId, dupe.chargeId

Results:

theUniqueId | customerId | chargeId | LastDay | IsDeleted
----------: | ---------: | :------- | :------ | --------:
          1 |          1 | charge1  | null    |         0

db<>fiddle here

  • Related