Home > other >  Asyncpg: UPDATE with UNNEST much slower than using FROM VALUES
Asyncpg: UPDATE with UNNEST much slower than using FROM VALUES

Time:01-15

I have the following table:

create table foo (id serial primary key, ts date, value int);

Populated with 10k records already. Now I'm trying to find a way to efficiently update multiple rows in one go, so I wrote this code to compare two approaches (references: here for the first method, here for the second):

import asyncpg
import asyncio
import time

async def main():
    c = await asyncpg.connect(user='foo', password='bar', database='foodb', host='192.168.100.1')

    # prepare data
    arr = []
    arr2 = [[], []]
    for i in range(10001):
        row = (i 1, i)
        arr.append(row)
        arr2[0].append(row[0])
        arr2[1].append(row[1])


    # using FROM VALUES
    values = ",".join([ f"{i}" for i in arr ])
    q = f"UPDATE foo SET value = v.value FROM(VALUES {values}) AS v(id, value) WHERE v.id = foo.id;"
    start_time = time.time()
    await c.execute(q)
    print(f"from values: {time.time() - start_time}")

    # using UNNEST
    q = "UPDATE foo SET value = v.value FROM ( SELECT * FROM UNNEST($1::int[], $2::int[])) AS v(id, value) WHERE foo.id = v.id;"
    start_time = time.time()
    await c.execute(q, *arr2)
    print(f"unnest: {time.time() - start_time}")

if __name__ == '__main__':
    asyncio.run(main())

The first method is consistently much faster than the second. The thing I don't like about the first method is that (as far as I can tell - corrections welcome) it cannot be used with $x placeholders for the VALUES part, so one is forced to manually build the string (which is also unsafe from an SQL injection point of view I believe).

So my questions:

  • can VALUES be built dynamically and used with placeholders?
  • is UNNEST really slower or am I doing something wrong?

CodePudding user response:

I didn't profile UNNEST performance, but as for your 1st question (regarding $x phaceolders) con.executemany is a proper choice for multiple updates:

q = "UPDATE foo SET value = $2 WHERE id = $1;"
await c.executemany(q, arr)

CodePudding user response:

When you are unable to bind arrays, my work around is to dynamically build the bind tokens, not the whole substitution...

A rough and ready example being...

arr = []
for i in range(10001):
  arr.append(i 1)
  arr.append(i)

tokens = ",".join([ "(?,?)" for i in range(10001) ]
q = f"UPDATE foo SET value = v.value FROM(VALUES {tokens}) AS v(id, value) WHERE v.id = foo.id;"

await c.execute(q, *arr)
  • Related