I have a MySQL table with 400 thousand records
CREATE TABLE `TABLE1` (
`ID` bigint(20) NOT NULL AUTO_INCREMENT,
`NAME` varchar(255) NOT NULL,
`VALUE` varchar(255) NOT NULL,
`UID` varchar(255) NOT NULL,
`USER_ID` varchar(255) DEFAULT NULL,
PRIMARY KEY (`ID`),
UNIQUE KEY `ukey1` (`VALUE`,`NAME`,`UID`),
UNIQUE KEY `ukey2` (`UID`,`NAME`,`VALUE`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
The following query is very slow and takes hours and finally fails
delete from TABLE1 where UID in ( uid1,uid2,...uid45000) and USER_ID is not null and USER_ID <> '12345';
where the list of uids are returned by a subquery.
Visual explain shows the following and adding index on UID not helping. How to optimize the performance of this query?
CodePudding user response:
Change it to a JOIN.
DELETE t1
FROM TABLE1 AS t1
JOIN (SELECT uid FROM ...) AS t2 ON t1.uid = t2.uid
WHERE USER_ID is not null and USER_ID <> '12345';
I've found that MySQL implements WHERE uid IN (subquery)
very poorly sometimes. Instead of getting all the results of the subquery and looking them up in the index of the table, it scans the table and performs the subquery for each row, then checks if the uid is in that result.
CodePudding user response:
First of all make a backup of that table this is the first rule for doing a delete queries or you can ruin it and take all the precautions that you considere before
( uid1,uid2,...uid45000)
What is the meaning of those values between the parenthesis ? Are you need to compare in the list all the UID values or some of them?
beacause you can avoiding put all the UIDS manually like this.
delete from TABLE1 where UID in (SELECT T.UID FROM TABLE1 as T where T.UID is not NULL and USER_ID <> '12345');
Before to doing this please check what do you want between the parenthesis and run the command in a TEST environment first with dummy values
Take in consideration that you have in the table varchars types in the UIDS field and thats the reason that this operation take a lot of time more than if you are using integer values
The other way is that you need to create a new table and put the data that you need to store for the old table, next truncate the original table and reinsert the same values of the new table to the old table again
Please before to run a solution check all your restrictions with your teamates and make a test with dummy values
CodePudding user response:
I would split your uid filter list in chunks (100 by chunk or other, need to test) and iterate or multithread over it