Home > Mobile >  Select row with all related child rows as array in one query
Select row with all related child rows as array in one query

Time:07-16

I'm using Postgres (latest) with node (latest) PG (latest). Some endpoint is receiving json which looks like:

{
   "id": 12345,
   "total": 123.45,
   "items": [
       {
           "name": "blue shirt",
           "url": "someurl"
       },
       {
           "name": "red shirt",
           "url": "someurl"
       }
   ]
}

So I'm storing this in two tables:

CREATE TABLE orders (
    id INT NOT NULL,
    total NUMERIC(10, 2) DEFAULT 0 NOT NULL,
    PRIMARY KEY (id)
);
CREATE INDEX index_orders_id ON orders(id);

CREATE TABLE items (
    id BIGSERIAL NOT NULL,
    order_id INT NOT NULL,
    name VARCHAR(128) NOT NULL,
    url VARCHAR(128) DEFAULT '' NOT NULL,

    PRIMARY KEY (id),
    FOREIGN KEY (order_id) REFERENCES orders(id) ON DELETE CASCADE
);
CREATE INDEX index_items_id ON items(id);

The items table has a FK of order_id to relate the id of the order to its respective items. Now, the issue is I almost always need to fetch the order along with the items.

How do I get an output similar to my input json in one query? I know it can be done in two queries, but this pattern will be all over the place and needs to be efficient. My last resort would be to store the items as JSONB column directly in the orders table, but then if I need to query on the items or do joins with them it won't be as easy.

CodePudding user response:

For best efficiency, you might consider caching the orders in a memory cache (i.e. language builtins like APC, or 3rd party like memcache, redis, ..)

When you do this, there is no need to ask the DB server to open both tables for each request / query.

If you still want to retrieve both simultaneously, you can use a JOIN:

SELECT *
FROM items i
INNER JOIN orders o ON o.id = i.order_id
WHERE i.order_id = ?

But by doing this

  1. you delegate the problem to the DB server. The DB also has caches, but you don't need to delegate all work to the DB every time.
  2. the order data is always passed around, with every item returned. This costs memory and bandwidth.

I suggest you go with caching on the app-server.

CodePudding user response:

One of many ways:

SELECT jsonb_pretty(
          to_jsonb(o.*)  -- taking whole row
      || (SELECT jsonb_build_object('items', jsonb_agg(i))
          FROM  (
             SELECT name, url  -- picking columns
             FROM   items i
             WHERE  i.order_id = o.id
             ) i
         )
       )
FROM   orders o
WHERE  o.id = 12345;

This returns formatted text similar to the displayed input. (But keys are sorted, so 'total' comes after 'items'.)

If an order has no items, you get "items": null.

For a jsonb value, strip the jsonb_pretty() wrapper.

I chose jsonb for its additional functionality - like the jsonb || jsonbjsonb operator and the jsonb_pretty() function.

Related:

If you want a json value instead, you can cast the jsonb directly (without format) or the formatted text (with format). Or build a json value with rudimentary formatting directly (faster):

SELECT row_to_json(sub, true)
FROM  (
   SELECT o.*
       , (SELECT json_agg(i)
          FROM  (
             SELECT name, url  -- pick columns to report
             FROM   items i
             WHERE  i.order_id = o.id
             ) i
         ) AS items
   FROM   orders o
   WHERE  o.id = 12345
   ) sub;

db<>fiddle here

It all depends on what you need exactly.

Aside:
Consider type text (or varchar) instead of the seemingly arbitrary varchar(128). See:

  • Related