I have the following table:
create table foo (id serial primary key, ts date, value int);
Populated with 10k records already. Now I'm trying to find a way to efficiently update multiple rows in one go, so I wrote this code to compare two approaches (references: here for the first method, here for the second):
import asyncpg
import asyncio
import time
async def main():
c = await asyncpg.connect(user='foo', password='bar', database='foodb', host='192.168.100.1')
# prepare data
arr = []
arr2 = [[], []]
for i in range(10001):
row = (i 1, i)
arr.append(row)
arr2[0].append(row[0])
arr2[1].append(row[1])
# using FROM VALUES
values = ",".join([ f"{i}" for i in arr ])
q = f"UPDATE foo SET value = v.value FROM(VALUES {values}) AS v(id, value) WHERE v.id = foo.id;"
start_time = time.time()
await c.execute(q)
print(f"from values: {time.time() - start_time}")
# using UNNEST
q = "UPDATE foo SET value = v.value FROM ( SELECT * FROM UNNEST($1::int[], $2::int[])) AS v(id, value) WHERE foo.id = v.id;"
start_time = time.time()
await c.execute(q, *arr2)
print(f"unnest: {time.time() - start_time}")
if __name__ == '__main__':
asyncio.run(main())
The first method is consistently much faster than the second.
The thing I don't like about the first method is that (as far as I can tell - corrections welcome) it cannot be used with $x
placeholders for the VALUES
part, so one is forced to manually build the string (which is also unsafe from an SQL injection point of view I believe).
So my questions:
- can
VALUES
be built dynamically and used with placeholders? - is
UNNEST
really slower or am I doing something wrong?
CodePudding user response:
I didn't profile UNNEST
performance, but as for your 1st question (regarding $x
phaceolders) con.executemany
is a proper choice for multiple updates:
q = "UPDATE foo SET value = $2 WHERE id = $1;"
await c.executemany(q, arr)
CodePudding user response:
When you are unable to bind arrays, my work around is to dynamically build the bind tokens, not the whole substitution...
A rough and ready example being...
arr = []
for i in range(10001):
arr.append(i 1)
arr.append(i)
tokens = ",".join([ "(?,?)" for i in range(10001) ]
q = f"UPDATE foo SET value = v.value FROM(VALUES {tokens}) AS v(id, value) WHERE v.id = foo.id;"
await c.execute(q, *arr)