I have a User
table to store unique users, unfortunately something went wrong in my code and the same user kept inserted into table when user update profile.
For instance, User John Doe
was created multiple times and the ID is 1, 2, 3 and 4. In table Order
the same person placed multiple orders and UID
refers to the user in table User
and UID 1 and 4 both refers to John Doe.
I am trying to
- Remove duplicates in
User
- Keep the minimum
UID
inOrder
if duplicate User ID are used inOrder
. For example both UID 1 and 4 are used inOrder
, I want to update 4 to 1.
Table User
:
---- ------ -----
| ID | Name | Gender |
---- ------ -----
| 1 | John Doe | M |
| 2 | John Doe | M |
| 3 | John Doe | M |
| 4 | John Doe | M |
---- ------ -----
Table Order
:
---- ---- ------ -----
|ID | UID | BILLING_ADDR | STATE |
---- ---- ------ -----
|1 | 1 | XXX | XX |
|2 | 4 | XXX | XX |
|3 | 4 | XXX | XX |
|4 | 4 | XXX | XX |
---- ---- ------ -----
To find the duplicates I have:
SELECT Name, count(1)
FROM User
GROUP BY Name
HAVING count(1) > 1
ORDER BY count(1) DESC;
To remove duplicates I have:
DELETE FROM `User`
WHERE ID NOT IN (
SELECT * FROM (
SELECT MIN(ID) FROM User
GROUP BY Name
)
);
Expected Table User
and Order
:
---- ------ -----
| ID | Name | Gender |
---- ------ -----
| 1 | John Doe | M |
---- ------ -----
---- ---- ------ -----
|ID | UID | BILLING_ADDR | STATE |
---- ---- ------ -----
|1 | 1 | XXX | XX |
|2 | 1 | XXX | XX |
|3 | 1 | XXX | XX |
|4 | 1 | XXX | XX |
---- ---- ------ -----
Unfortunately I don't know how to update the UID in table Order
, can someone help? Thanks!
CodePudding user response:
The whole scenario as I would do it (I use temporary tables in order to not mess up my database, but it works even better with permanent tables):
-- initial scenario, the two tables:
CREATE LOCAL TEMPORARY TABLE
usr(id,"name",gender)
ON COMMIT PRESERVE ROWS AS (
SELECT 1,'John Doe','M'
UNION ALL SELECT 2,'John Doe','M'
UNION ALL SELECT 3,'John Doe','M'
UNION ALL SELECT 4,'John Doe','M'
)
;
CREATE LOCAL TEMPORARY TABLE
ord(id,uid,billing_addr,state)
ON COMMIT PRESERVE ROWS AS (
SELECT 1, 1,'XXX','XX'
UNION ALL SELECT 2, 4,'XXX','XX'
UNION ALL SELECT 3, 4,'XXX','XX'
UNION ALL SELECT 4, 4,'XXX','XX'
)
;
-- de-dupe the usr table using an insert.. .select into yet another table:
CREATE LOCAL TEMPORARY TABLE
usr_dedup
ON COMMIT PRESERVE ROWS AS
WITH
w_rowcount AS (
SELECT
id
, "name"
, gender
, ROW_NUMBER() OVER(PARTITION BY "name" ORDER BY id) AS rn
FROM usr
)
SELECT
id
, "name"
, gender
FROM w_rowcount
WHERE rn=1;
-- Using the duped and the deduped usr tables to update the old table:
UPDATE ord SET uid = (
SELECT
d.id
FROM (
usr u
JOIN usr_dedup d USING("name")
)
WHERE u.id=ord.uid
);
-- "ord" now contains:
SELECT * FROM ord;
id | uid | billing_addr | state
---- ----- -------------- -------
1 | 1 | XXX | XX
2 | 1 | XXX | XX
3 | 1 | XXX | XX
4 | 1 | XXX | XX
-- now rename the two usr tables to have the usr table with de-duped records:
ALTER TABLE usr RENAME TO usr_dup;
ALTER TABLE
ALTER TABLE usr_dedup RENAME TO usr;
ALTER TABLE