I need to update a production table with tens of millions of rows. First, three new columns col1, col2, col3 will be added. Their initial values will be copied from the existing columns prev_col1, prev_col2, prev_col3. The old and new columns are all VARCHAR columns with a maximum length of 10, and some values may be NULL. Roughly how long will this take? I am asking because I have never updated that many rows at once.
Should I use a simple update:
UPDATE table_name SET col1=prev_col1, col2=prev_col2, col3=prev_col3;
Or is there a more efficient way to update a large number of rows?
CodePudding user response:
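One way to speed up a large update that joins tables is to raise MySQL's join_buffer_size for the session. Note that this buffer only comes into play when the statement actually joins tables, so it does not apply to a single-table update like the one in the question. E.g.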
SET join_buffer_size = 1024 * 1024 * 42;
UPDATE table1 a JOIN table2 b
ON a.field1 = b.field1
SET
a.field2 = b.field2,
a.field3 = b.field3,
a.field4 = b.field4;
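To change the value for all new connections instead of only the current session (this requires administrative privileges):
SET GLOBAL join_buffer_size = 1024 * 1024 * 42;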
This may also be of some help: update-10-million-rows-in-mysql-single-table-as-fast-as-possible
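For a single-table update like the one in the question, a commonly suggested alternative is to update in primary-key ranges so that each transaction stays small. The following is only a minimal sketch under stated assumptions: table_name has an integer primary key called id (hypothetical, the question does not name the key), autocommit is on so every UPDATE commits on its own, and the procedure name copy_prev_cols_in_batches is made up for illustration. DELIMITER is a command of the mysql command-line client, not part of SQL itself.

```sql
DELIMITER $$

-- Hypothetical helper: copies prev_col* into col* one id range at a time.
CREATE PROCEDURE copy_prev_cols_in_batches(IN batch_size INT)
BEGIN
  DECLARE lo BIGINT;
  DECLARE max_id BIGINT;

  -- Assumes an integer primary key named id.
  SELECT MIN(id), MAX(id) INTO lo, max_id FROM table_name;

  -- On an empty table both values are NULL and the loop is skipped.
  WHILE lo <= max_id DO
    UPDATE table_name
       SET col1 = prev_col1, col2 = prev_col2, col3 = prev_col3
     WHERE id >= lo AND id < lo + batch_size;
    SET lo = lo + batch_size;
  END WHILE;
END$$

DELIMITER ;

-- Example call: 10000 ids per batch (a starting point to tune, not a recommendation).
CALL copy_prev_cols_in_batches(10000);
```

Smaller batches hold locks for a shorter time and keep the undo log small, at the cost of more statements overall. If the id values have large gaps, some batches will touch few or no rows, which is harmless.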
CodePudding user response:
You can use a transaction to "test" run your query. But please don't run it directly on production; always work on a cloned copy until you're 100% sure of what you're doing.
START TRANSACTION;
UPDATE table_name SET col1 = prev_col1, col2 = prev_col2, col3 = prev_col3;
ROLLBACK;
This runs the query but rolls it back instead of committing. The console or GUI you use will report any errors and the execution time. Note that this only works with a transactional storage engine such as InnoDB.
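The cloned copy mentioned above can be built inside MySQL itself. A minimal sketch, assuming the copy is called table_name_copy (a hypothetical name), that col1, col2, col3 have already been added to table_name, and that there is enough disk space for a second copy of the data:

```sql
-- Create an empty table with the same columns and indexes as the original.
-- Foreign key definitions are not copied by CREATE TABLE ... LIKE.
CREATE TABLE table_name_copy LIKE table_name;

-- Copy the rows. Append a LIMIT to try a smaller sample first.
INSERT INTO table_name_copy SELECT * FROM table_name;

-- Run the real statement against the copy; the client reports the elapsed time.
UPDATE table_name_copy
   SET col1 = prev_col1, col2 = prev_col2, col3 = prev_col3;

-- Remove the copy when finished.
DROP TABLE table_name_copy;
```

The INSERT ... SELECT still reads the whole production table, so it is probably best run at a quiet time. The timing on the copy is only a rough guide, since the production table will also be serving other traffic.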
CodePudding user response:
The execution time cannot be calculated in general. It depends on how your server stores the data (heap, tree ...) as well as on the read/write speed of your hard drive or solid-state drive.
My university professor once said that you can find this information in the SQL Server Activity Monitor. We never tried it out, but I'm pretty sure you can find some tutorials on how to work with it.
Or you can just try the method from @Jelle. That's a pretty safe way, although I recommend doing it overnight or at a time when the server isn't used that heavily.
One more thing: you should not need to worry much about the performance of your SQL statement, since MS SQL Server has a pretty decent optimizer built in.
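If the server is MySQL rather than SQL Server, a rough counterpart to the monitoring idea above is the process list. A minimal sketch, to be run from a second connection while the update is executing:

```sql
-- Lists every connection with its current statement and how long it has been running (Time, in seconds).
SHOW FULL PROCESSLIST;
```

This shows how long the statement has been running, not how long it still needs, so it helps with watching a test run rather than with predicting one.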