Home > other >  Self-joining of table ends up as columns instead of rows
Self-joining of table ends up as columns instead of rows

Time:11-14

I have a table with comments, which contains both "parents" and "children". Children can't be a parent, to keep it simple:

CREATE TABLE comments (
    id uuid PRIMARY KEY,
    parent_id uuid REFERENCES comments(id),
    user_id uuid NOT NULL REFERENCES users(id),
    text text NOT NULL,
    created_at timestamp with time zone NOT NULL,
    updated_at timestamp with time zone,
    deleted_at timestamp with time zone
);

My end-goal is to:

  • show 5 of the most recent parent comments in a DESC order
  • show an unlimited amount of children for each of the 5 parent comments, in an ASC order
  • allow pagination to get the next 5 parent comments

What I have tried:

  • CTE recursion (this didn't allow for a specific order and limit)
  • Simple self-join, which results in the child-comments being part of the parent rows:
WITH parent_comments AS (
    SELECT comments.id as id, parent_id, text, user_id, comments.created_at, comments.updated_at, comments.deleted_at
    FROM comments
    WHERE comments.parent_id IS NULL
    ORDER BY created_at DESC
    LIMIT 5 -- at most 5 parent comments
    OFFSET 0
)
SELECT * FROM parent_comments
LEFT JOIN comments AS child_comments
    ON child_comments.parent_id = parent_comments.id
ORDER BY child_comments.created_at ASC;

The problem with this self-join, is that the children are appended as columns to the parent rows:

id  parent_id   text    user_id created_at  updated_at  deleted_at  id  parent_id   user_id text    created_at  updated_at  deleted_at
6c5d2b85-710c-487f-aecb-33b50fb1ddf7        <p>#1</p>   24940941-55cb-4db5-aa36-1b95617eb2d5    2022-11-12 16:12:40.201388 00           9de02105-fc8b-4de1-8a9b-9a877d88b6a9    6c5d2b85-710c-487f-aecb-33b50fb1ddf7    24940941-55cb-4db5-aa36-1b95617eb2d5    <p>#1-3</p> 2022-11-12 16:12:55.233683 00       

When I instead want the children to appear as their own rows:

id  parent_id   text    user_id created_at  updated_at  deleted_at
6c5d2b85-710c-487f-aecb-33b50fb1ddf7        <p>#1</p>   24940941-55cb-4db5-aa36-1b95617eb2d5    2022-11-12 16:12:40.201388 00
9de02105-fc8b-4de1-8a9b-9a877d88b6a9    6c5d2b85-710c-487f-aecb-33b50fb1ddf7    24940941-55cb-4db5-aa36-1b95617eb2d5    <p>#1-3</p> 2022-11-12 16:12:55.233683 00       

CodePudding user response:

You'll need at least one self-join (or subquery) to get the thread creation date that you want to sort by:

SELECT c.*, COALESCE(parent.created_at, c.created_at) AS thread_created_at
FROM comment c
LEFT JOIN comment parent ON parent.id = c.parent_id
ORDER BY thread_created_at DESC, created_at ASC;
SELECT c.*, (CASE c.parent_id IS NUll
  THEN c.created_at
  ELSE (SELECT parent.created_at FROM comment parent WHERE parent.id = c.parent_id)
) AS thread_created_at
FROM comment c
ORDER BY thread_created_at DESC, created_at ASC;

However, limiting that to the first 5 threads is really hard, and you'll also get into problems if two threads were created at the exact same time by pure chance. I recommend that you actually use an aggregating subquery and return a whole thread in a single row:

SELECT *, (
  SELECT json_agg(child ORDER BY created_at ASC)
  FROM comment child
  WHERE child.parent_id = c.id
) AS children
FROM comment c
ORDER BY created_at DESC
LIMIT 5;
  • Related