PostgreSQL: Efficiently aggregate a list query result into JSON object

Time:05-11

I need to return a result from a query that matches one row from a table, then aggregates up to 500,000 text values referencing that row into JSON.

  • The JSON must be an object whose keys are the data points. It needs to be a key:value object, because it will be deserialized into a hash map in my Go backend, which lets me search it in constant time. I don't care what the values are; empty strings or null are preferred to reduce transport and serialization time.

Initially I tried using jsonb_object_agg, but it was very slow. Here's a simplified version of the query:

SELECT
    table_a.id,
    table_a.name,
    table_a.description,
    table_a.created,
    table_a.createdby,
    table_a.modified,
    table_a.modifiedby,
    ( -- the aggregate takes a very long time
        SELECT
            jsonb_object_agg(table_b.item, NULL)
        FROM
            table_b
        WHERE
            table_a.id = table_b.table_a_id
    )
FROM
    table_a
WHERE 
    table_a.id = <some_id_input>;

My next step was to get an array, then convert it to JSON. This proved much more efficient:

SELECT
    table_a.id,
    table_a.name,
    table_a.description,
    table_a.created,
    table_a.createdby,
    table_a.modified,
    table_a.modifiedby,
    to_jsonb( ARRAY (
        SELECT
            table_b.item
        FROM
            table_b
        WHERE
            table_a.id = table_b.table_a_id
    ))
FROM
    table_a
WHERE 
    table_a.id = <some_id_input>;

Even though it's more efficient, it gives me a JSON array when I need a JSON object...

Is there an easy and efficient way in PostgreSQL 14 to produce what I need?

CodePudding user response:

Join to the aggregated result:

SELECT
    table_a.id,
    table_a.name,
    table_a.description,
    table_a.created,
    table_a.createdby,
    table_a.modified,
    table_a.modifiedby,
    b.items
FROM table_a
  LEFT JOIN (
    SELECT table_a_id, jsonb_agg(table_b.item) AS items
    FROM table_b
    GROUP BY table_a_id
  ) b ON b.table_a_id = table_a.id
WHERE 
    table_a.id = <some_id_input>;

Or maybe use a lateral join, since you restrict this to a single table_a.id:

SELECT
    table_a.id,
    table_a.name,
    table_a.description,
    table_a.created,
    table_a.createdby,
    table_a.modified,
    table_a.modifiedby,
    b.items
FROM table_a
  LEFT JOIN LATERAL (
    SELECT jsonb_agg(table_b.item) AS items
    FROM table_b
    WHERE table_b.table_a_id = table_a.id
  ) b ON true
WHERE 
    table_a.id = <some_id_input>;

I am not sure, but I wouldn't be surprised if a native array (array_agg()) is more efficient than JSON for building an array of 500,000 items.
