I have a table that stores values like this:
| id | thing | values |
|----|-------|--------|
| 1 | a |[1, 2] |
| 2 | b |[2, 3] |
| 3 | a |[2, 3] |
And would like to use an aggregate function to group by thing
but store only the unique values of the array such that the result would be:
| thing | values |
|-------|---------|
| a |[1, 2, 3]|
| b |[2, 3] |
Is there a simple and performant way of doing this in Postgres?
CodePudding user response:
First you take the JSON array apart with json_array_elements()
- this is a set-returning function with a JOIN LATERAL
you get a row with id
, thing
and a JSON array element for each element.
Then you select DISTINCT
records for thing
and value
, ordered by value
.
Finally you aggregate records back together with json_agg()
.
In SQL that looks like:
SELECT thing, json_agg(value) AS values
FROM (
SELECT DISTINCT thing, value
FROM t
JOIN LATERAL json_array_elements(t.values) AS v(value) ON true
ORDER BY value) x
GROUP BY thing
In general you would want to use the jsonb
type as that is more efficient than json
. Then you'd have to use the corresponding jsonb_...()
functions.