I'm trying to do a bulk insert in SQLAlchemy against an Oracle DB, inserting 60k rows that contain BLOB data.
This is what my table and code look like:
CREATE TABLE datatables (
    id INTEGER NOT NULL,
    table_name VARCHAR2(50 CHAR),
    row_id VARCHAR2(50 CHAR),
    row_data BLOB,
    PRIMARY KEY (id)
)
with Session() as session:
    session.execute(
        DataTables.__table__.insert(),
        datas
    )
Here DataTables
is the SQLAlchemy mapped class for the table, and datas
is a list of dicts like this: {'id': 1, 'table_name': 'app', 'row_id': 'version', 'row_data': '....'}
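For reference, here is a self-contained version of that pattern. SQLite stands in for Oracle here just to show the setup; the slowdown itself only appears against Oracle:

```python
from sqlalchemy import create_engine, Column, Integer, String, LargeBinary
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class DataTables(Base):
    __tablename__ = "datatables"
    id = Column(Integer, primary_key=True)
    table_name = Column(String(50))
    row_id = Column(String(50))
    row_data = Column(LargeBinary)  # rendered as BLOB on Oracle

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
Session = sessionmaker(engine)

datas = [
    {"id": i, "table_name": "app", "row_id": "version", "row_data": b"some_data"}
    for i in range(1, 101)
]

with Session() as session:
    # passing a list of dicts triggers executemany-style execution
    session.execute(DataTables.__table__.insert(), datas)
    session.commit()
```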
With this code I get the following SQL statement:
sqlalchemy.engine.Engine INSERT INTO datatables (id, table_name, row_id, row_data) VALUES (:id, :table_name, :row_id, :row_data)
sqlalchemy.engine.Engine [generated in 0.14718s] [{'id': 1, 'table_name': 'app', 'row_id': 'version', 'row_data': b'some_data'}, ...]
and it runs forever; even after 30 minutes it has not finished. When I enable DPI tracing, there are a lot of rows like these:
ODPI [12912] 2021-12-18 00:07:36.019: ref 0000019C3D15C5B0 (dpiConn) -> 8
ODPI [12912] 2021-12-18 00:07:36.020: ref 0000019C413F0C20 (dpiLob) -> 1 [NEW]
ODPI [12912] 2021-12-18 00:07:36.020: ref 0000019C3D15C5B0 (dpiConn) -> 9
ODPI [12912] 2021-12-18 00:07:36.020: ref 0000019C413F1990 (dpiLob) -> 1 [NEW]
However, if I use raw SQL:
session.execute('insert into DATATABLES (id, table_name, row_id, row_data) values (:id, :table_name, :row_id, :row_data)', datas)
the DPI trace changes to this:
ODPI [00796] 2021-12-18 00:14:55.246: ref 000002486741EAF0 (dpiVar) -> 0
ODPI [00796] 2021-12-18 00:14:55.246: ref 00000248617D2DF0 (dpiConn) -> 6
ODPI [00796] 2021-12-18 00:14:55.246: fn end dpiVar_release(000002486741EAF0) -> 0
ODPI [00796] 2021-12-18 00:14:55.246: fn start dpiVar_setFromBytes(000002486741EBB0)
ODPI [00796] 2021-12-18 00:14:55.247: fn end dpiVar_setFromBytes(000002486741EBB0) -> 0
ODPI [00796] 2021-12-18 00:14:55.247: fn start dpiVar_setFromBytes(0000024864E5FDE0)
ODPI [00796] 2021-12-18 00:14:55.247: fn end dpiVar_setFromBytes(0000024864E5FDE0) -> 0
ODPI [00796] 2021-12-18 00:14:55.247: fn start dpiVar_setFromBytes(0000024864E601A0)
ODPI [00796] 2021-12-18 00:14:55.247: fn end dpiVar_setFromBytes(0000024864E601A0) -> 0
and the bulk insert is done in 15 seconds.
The SQL statement is the same in both cases. Why does it behave so differently? Of course I can use raw SQL, but I'd rather use DataTables.__table__.insert()
, because if the table or column names change, I won't have to fix the SQL by hand every time.
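One way to keep the fast raw-SQL path while still deriving names from the mapped class is to render the textual INSERT from the Table metadata. This is only a sketch (the helper name is made up), and note that on SQLAlchemy 1.4+ raw SQL strings passed to Session.execute() should be wrapped in text() anyway:

```python
from sqlalchemy import text

def build_insert_sql(table):
    """Render a textual INSERT from a SQLAlchemy Table, so renamed
    tables/columns don't require hand-editing the SQL (hypothetical helper)."""
    cols = table.columns.keys()
    return text(
        f"INSERT INTO {table.name} ({', '.join(cols)}) "
        f"VALUES ({', '.join(':' + c for c in cols)})"
    )

# usage: session.execute(build_insert_sql(DataTables.__table__), datas)
```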
CodePudding user response:
Issues like these should preferably be reported as bugs in SQLAlchemy, which you can do at https://github.com/sqlalchemy/sqlalchemy/issues/; otherwise we would be completely unaware of the problem.
In this case we have been alerted to the issue and will seek to update our LOB settings per the cx_Oracle developers for SQLAlchemy 2.0; this is tracked at https://github.com/sqlalchemy/sqlalchemy/issues/7494