I have this MySQL table, with a size of about 120GB:
CREATE TABLE `impressions` (
`session_uuid` varchar(36) DEFAULT NULL,
`survey_uuid` varchar(255) NOT NULL,
`data` text,
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`user_uuid` varchar(255) NOT NULL DEFAULT '',
`is_test` tinyint(1) NOT NULL DEFAULT '0',
KEY `impressions_survey_uuid_session_uuid_user_uuid_index` (`survey_uuid`,`session_uuid`,`user_uuid`),
KEY `impressions_created_at_index` (`created_at`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
I see that this data migration is taking more than 6 hours (on a decent RDS instance where I was able to run more complex migrations) because is doing lots of I/O operations. Why does it have to do so many operations? The only thing I'm changing here is the NULL optionality and the default value.
ALTER TABLE `impressions` CHANGE COLUMN `user_uuid` `user_uuid` VARCHAR(255) null;
CodePudding user response:
Changing the nullability of a column changes the structure of the row in InnoDB. A row stored in InnoDB has a bitmap for each nullable column, to indicate whether that column is in fact NULL on the given row. If you change the nullability of a column, you have changed the length of the bitmap. Therefore every row must be rewritten to a new set of pages.
Changing only the DEFAULT is a metadata-only change.
I've made the mistake of running an ALTER TABLE that should've been a metadata-only change, but I forgot to match the nullability of the original column, and so my ALTER TABLE became a table restructure and took a long time.
If you have to do such changes in MySQL, I suggest you look at one of the open-source online schema change tools: pt-online-schema-change or gh-ost. I've used the former tool to manage many long-running schema changes in production. It usually makes the operation take a little bit longer, but that's not a problem because the table can still be used for both reading and writing while the schema change is in progress.