Home > OS >  Extract attribute columns from hierarchical JSON dynamically - PostgreSQL
Extract attribute columns from hierarchical JSON dynamically - PostgreSQL

Time:01-09

This query selects a hierarchical JSON as a table in PostgreSQL. The name of the columns could be hard code as done here, but that's not feasible when working with hundreds of columns.

JSON sample:

{
  "NODES": [
    {
      "DESC_D": "fam",
      "SEQ": "1",
      "ID": "2304500",
      "NODES": [
        {
          "DESC_D": "test 1",
          "SEQ": "2.1",
          "ID": "5214",
          "NODES": [
            {
              "DESC_D": "test 1.1",
              "SEQ": "3.1",
              "ID": "999"
            }
          ]
        },
        {
          "DESC_D": "test 2",
          "SEQ": "2.2",
          "ID": "74542"
        }
      ]
    }
  ]
}

Query sample:

SELECT DISTINCT
       child->>'SEQ' AS seq
     , child->>'DESC_D' AS desc_d
     , child->>'ID' AS id
     , parent->>'ID' AS parent_id
     , child->>'NODES' AS nodes
  FROM jsonb_path_query('{"NODES":[{"DESC_D":"fam","SEQ":"1","ID":"2304500","NODES":[{"DESC_D":"test 1","SEQ":"2.1","ID":"5214","NODES":[{"DESC_D":"test 1.1","SEQ":"3.1","ID":"999"}]},{"DESC_D":"test 2","SEQ":"2.2","ID":"74542"}]}]}'
                       , '$.** ? (@.NODES.type() == "array")') AS parent
 CROSS JOIN LATERAL jsonb_array_elements(parent->'NODES') AS child
 ORDER BY seq

I want to get the same output of this code, but without hard coding:

'SEQ' AS seq

'DESC_D' AS desc_d

'ID' AS id

'NODES' AS nodes

I need them dynamically written depending on the JSON

Found this solution but didn't work for me because my JSON is hierarchical (or maybe I didn't implement it the right way).

CodePudding user response:

The proposed solution relies on a composite type which must be dynamically created with the following procedure :

CREATE OR REPLACE PROCEDURE composite_type (j jsonb) LANGUAGE plpgsql AS
$$
DECLARE
  _columns text ;
BEGIN
  SELECT string_agg(DISTINCT v.key || ' text', ',')
    INTO _columns
  FROM jsonb_path_query( j
                       , '$.** ? (@.NODES.type() == "array")') AS parent
 CROSS JOIN LATERAL jsonb_array_elements(parent->'NODES') AS child
 CROSS JOIN LATERAL jsonb_each_text(child) AS v(key, value) ;
 DROP TYPE IF EXISTS composite_type ;
 EXECUTE 'CREATE TYPE composite_type AS (' || _columns || ')' ;
END ;
$$ ;

First create the composite type :

CALL composite_type ('{"NODES":[{"DESC_D":"fam","SEQ":"1","ID":"2304500","NODES":[{"DESC_D":"test 1","SEQ":"2.1","ID":"5214","NODES":[{"DESC_D":"test 1.1","SEQ":"3.1","ID":"999"}]},{"DESC_D":"test 2","SEQ":"2.2","ID":"74542"}]}]}' :: jsonb) ;

And then the query :

SELECT parent->>'ID' AS parent_id
     , (jsonb_populate_record(null :: composite_type, jsonb_object_agg(lower(v.key), v.value))).*
  FROM jsonb_path_query('{"NODES":[{"DESC_D":"fam","SEQ":"1","ID":"2304500","NODES":[{"DESC_D":"test 1","SEQ":"2.1","ID":"5214","NODES":[{"DESC_D":"test 1.1","SEQ":"3.1","ID":"999"}]},{"DESC_D":"test 2","SEQ":"2.2","ID":"74542"}]}]}'
                       , '$.** ? (@.NODES.type() == "array")') AS parent
 CROSS JOIN LATERAL jsonb_array_elements(parent->'NODES') AS child
 CROSS JOIN LATERAL jsonb_each_text(child) AS v(key, value)
 GROUP BY parent, child
 ORDER BY seq

provides the expected result :

parent_id desc_d id nodes seq
null fam 2304500 [{"ID": "5214", "SEQ": "2.1", "NODES": [{"ID": "999", "SEQ": "3.1", "DESC_D": "test 1.1"}], "DESC_D": "test 1"}, {"ID": "74542", "SEQ": "2.2", "DESC_D": "test 2"}] 1
2304500 test 1 5214 [{"ID": "999", "SEQ": "3.1", "DESC_D": "test 1.1"}] 2.1
2304500 test 2 74542 null 2.2
5214 test 1.1 999 null

Note that in the query child is broken down by CROSS JOIN LATERAL jsonb_each_text(child) AS v(key, value) and then rebuild with jsonb_object_agg(lower(v.key), v.value) so that v.key are converted into lower cases and correspond to the attribute names of composite_type which are necessarily in lower cases.

see dbfiddle

  • Related