handle a huge list of insert statements


I have a huge list of insert statements in a file, as shown below. I need to use a stored procedure or a Liquibase script to execute these statements in the production database. If any record fails to insert, everything should be rolled back. Performance is also important here.

insert into employee (empid, empname, salary) values (1, 'bar', 2000);
insert into employee (empid, empname, salary) values (2, 'foo', 2000);
...
insert into employee (empid, empname, salary) values (100000, 'baz', 2000);

CodePudding user response:

To improve performance:

  • Disable or drop any indexes and constraints on the employee table, and recreate them after the load (see the sketch after this list).
    • Note: this can be risky if your data is not sorted; rebuilding an index can take a long time.
  • Lock the table while inserting
  • Insert in batches
  • Check out SQL*Loader, Oracle's utility for bulk loads
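
For instance, a minimal sketch of the index and locking steps (the index name emp_name_idx is hypothetical; substitute your own):

-- Mark the index unusable so the inserts skip index maintenance
-- (note: a unique index that enforces a constraint cannot be skipped this way)
alter index emp_name_idx unusable;
alter session set skip_unusable_indexes = true;

-- Block concurrent DML on the table during the load
lock table employee in exclusive mode;

-- ... run the inserts here ...

-- Rebuild the index once the load is committed
alter index emp_name_idx rebuild;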

For rollbacks, run all of the inserts in a single transaction and issue a ROLLBACK if any statement fails, as sketched below.
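
A minimal PL/SQL sketch of that pattern (assuming the employee table from the question):

begin
    insert into employee (empid, empname, salary) values (1, 'bar', 2000);
    insert into employee (empid, empname, salary) values (2, 'foo', 2000);
    -- ... the remaining inserts ...
    commit;
exception
    when others then
        rollback;  -- undo every insert if any statement fails
        raise;     -- re-raise so the client still sees the error
end;
/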

CodePudding user response:

Insert 100 rows at a time using the UNION ALL trick and wrap the entire script in a BEGIN ... END block. This is a little trickier to generate than a simple list of inserts, but batching the operations reduces the row-by-row overhead and usually improves performance by around 10x.

begin
    insert into employee (empid, empname, salary)
        select 1, 'bar', 2000 from dual union all
        select 2, 'foo', 2000 from dual union all
        ...
        select 100, 'foobar', 2000 from dual;

    ...

    insert into employee (empid, empname, salary)
        ...
        select 100000, 'baz', 2000 from dual;
end;
/

I use 100 rows per batch because if you put all of the rows into a single INSERT, Oracle may take too long to parse the enormous SQL statement. Wrapping everything in BEGIN and END makes Oracle receive the entire block in one network round trip, and it also ensures that a failure in any one statement rolls back everything else.

You don't need direct-path inserts, /*+ append */ hints, dropping and recreating indexes, locking, SQL*Loader, or any of those fancy features. Those are good ideas for a huge amount of data, but you've actually got a small amount of data, just a huge number of statements. You need to optimize for network lag and SQL parsing; the actual time to load the data is likely irrelevant.
