Today I was shocked one of my very valuable tables in a mysql DB was almost wiped out, and I couldn't yet understand if it was my mistake or someone else did it due to some security vulnerability.
Anyway, I have a script to do daily backups of the entire mysql database into a .sql.gz
file, via mysqldump
. I have hundreds of those files and I want to check what is the exact day where that table was wiped out.
Can I do a sort of a COUNT
to a table, but from a .sql
file?
CodePudding user response:
No, there is no tool to query the .sql
file as if it's a database. You have to restore that dump file, then you can query its data.
A comment above suggests to count the INSERT statements in the dump files, but that isn't reliable, because by default mysqldump outputs multiple rows per INSERT statement (the --extended-insert option, which is enabled by default). The number of rows per INSERT varies, depending on the length of data.
I once had to solve a problem exactly like yours. A bug in my app caused some rows to vanish, but we didn't know exactly when, because we didn't notice the discrepancy until some days after it happened. We wanted to know exactly when it happened so we could correlate it to other logs and find out what caused the bug. I had daily backups, but no good way to compare them.
Here's how I solved it:
I had to restore every daily backup to a temporary MySQL instance in my development environment. Then I wrote a Perl script to dump all the integer primary key values from the affected table, so each id value corresponded to a pixel in a GIF image. If a primary key value was found in the table, I drew a white pixel in the image. If the primary key value was missing, I drew a black pixel in the position for that integer value.
The image filenames are named for the date of the backup they represent. I repeated the process for each day's backup, writing to a new image.
Then I used an image preview app to scroll through my collection of images slowly using the scroll wheel of my mouse. As expected, each image had a few more pixels than the image before, representing records were added to the database each day. Then at some point, the data loss event happened, and the next image had a row of black pixels where the previous day had white pixels. I could therefore identify which day the data loss occurred on.
After I identified the last backup that contained the data before it was dropped, I exported the rows that I needed to restore to the production system.