I have a table with the following schema (postgresql 14):
message sentiment classification
any text positive mobile, communication
message
are only string, phrases.
sentiment
is a string, only one word
classification
are string but can have 1 to many word comma separated
I would like to create a json field with these columns, like this:
{"msg":"any text", "sentiment":"positive","classification":["mobile,"communication"]}
Also, if possible, is there a way to consider the classification this way:
{"msg":"any text", "sentiment":"positive","classification 1":"mobile","classification 2" communication"}
CodePudding user response:
The first part of question is easy - Postgres provides functions for splitting string and converting to json:
with t(message, sentiment, classification) as (values
('any text','positive','mobile, communication')
)
select row_to_json(x.*)
from (
select t.message
, t.sentiment
, array_to_json(string_to_array(t.classification, ', ')) as classification
from t
) x
The second part is harder - your want json to have variable number of attributes, mixed of grouped and nongrouped data. I suggest to unwind all attributes and then assemble them back (note the numbered
CTE is actually not needed if your real table has id - I just needed some column to group by):
with t(message, sentiment, classification) as (values
('any text','positive','mobile, communication')
)
, numbered (id, message, sentiment, classification) as (
select row_number() over (order by null)
, t.*
from t
)
, extracted (id,message,sentiment,classification,index) as (
select n.id
, n.message
, n.sentiment
, l.c
, l.i
from numbered n
join lateral unnest(string_to_array(n.classification, ', ')) with ordinality l(c,i) on true
), unioned (id, attribute, value) as (
select id, concat('classification ', index::text), classification
from extracted
union all
select id, 'message', message
from numbered
union all
select id, 'sentiment', sentiment
from numbered
)
select json_object_agg(attribute, value)
from unioned
group by id;
CodePudding user response:
Use jsonb_build_object
and concatenate the columns you want
SELECT
jsonb_build_object(
'msg',message,
'sentiment',sentiment,
'classification',
string_to_array(classification,','))
FROM mytable;
Demo: db<>fiddle
The second output is definitely not trivial. The SQL code would be much larger and harder to maintain - not to mention that parsing such file also requires a little more effort.
CodePudding user response:
You can use a cte
to handle the flattening of the classification attributes and then perform the necessary grouping in the main queries for each problem component:
with cte(r, m, s, k) as (
select row_number() over (order by t.message), t.message, t.sentiment, v.* from tbl t
cross join json_array_elements(array_to_json(string_to_array(t.classification, ', '))) v
)
-- first part --
select json_build_object('msg', t1.message, 'sentiment', t1.sentiment, 'classification', string_to_array(t1.classification, ', ')) from tbl t1
-- second part --
select jsonb_build_object('msg', t1.m, 'sentiment', t1.s)||('{'||t1.g||'}')::jsonb
from (select c.m, c.s, array_to_string(array_agg('"classification '||c.r||'":'||c.k), ', ') g
from cte c group by c.m, c.s) t1