Home > Back-end >  Massively deleting SQL rows "skipping" valid foreign key references
Massively deleting SQL rows "skipping" valid foreign key references

Time:10-26

Due to some error that happened in the past, I have a files table (that contains references to where the files are stored on the file system) that has many entries for the same actual file. So for example

 id |      path      |   name
---- ---------------- -----------
  1 | '/tmp/images/' | 'foo.jpg'
  2 | '/tmp/images/' | 'foo.jpg'
  3 | '/tmp/images/' | 'foo.jpg'
  4 | '/tmp/images/' | 'bar.jpg'
  5 | '/tmp/images/' | 'bar.jpg'
  6 | '/tmp/images/' | 'baz.jpg'

and so on. Those duplicates are, in reality, in the order of 2000 rows for each actual file, so my table is actually much bigger than what it should be, for no actual good reason.

The id column is used as a reference for foreign keys in my database, in various tables. For example, my products table has thumb_id and cover_id that both reference the id field of the files table, then the users table has a profile_pic_id that also points to the id field of the files table, and so on.

Now, knowing that (due to a bug in the way those files were imported into my system), all the duplicates are related to one specific filesystem path, I would like to do something like

DELETE FROM files WHERE path = '/tmp/folder-with-duplicates'

while somehow instructing MySQL to not touch the rows that are actually "in use" (as in, referenced as a foreign key in some other table).

I know that I can add not in conditions to my query to check that (I already looked at questions such as this one), but this table is referenced by many other tables and figuring out all the associations and writing every single not in condition would be extremely tedious. Of course I cannot disable the constraint because my goal is to keep (only) those lines that are being referenced, so I was wondering if there was some way to instruct MySQL to delete "gracefully", or, to put it another way, to ignore the rows that would trigger the foreign key constraint in the delete query.

CodePudding user response:

Make sure none of the foreign keys have the ON DELETE CASCADE option. Then use the IGNORE option to DELETE to skip deleting rows that would cause errors due to the missing foreign key.

DELETE IGNORE FROM files WHERE path = '/tmp/folder-with-duplicates';

CodePudding user response:

If you have your foreign key set up properly, it won't allow you to delete the rows referenced by other tables. By default mySQL uses the "Restrict/No Action" on delete rule if it's not specified on the FK, which prevents deleting any rows that are referenced in an FK relationship, so your FK would prevent deleting any row in files where there's a matching reference from thumb_id if set up like this:

FOREIGN KEY (thumb_id) REFERENCES files(id) ON DELETE RESTRICT

If you have it set up differently (you shouldn't for this kind of use) with ON DELETE CASCADE or ON DELETE SET NULL then you need to drop the existing FK relationship and create a new one using ON DELETE RESTRICT to replace it. Once you do do your delete as normal, with IGNORE so it just skips the rows that error out - because of the FK rules.

  • Related