id | some_attribute | json_array |
---|---|---|
1 | "abc" | [ { attr: 'apple'}, { attr: 'banana' } ] |
How to get the get rid of attr
in json_array
so that the table results into something like table below?
id | some_attribute | string_array |
---|---|---|
1 | "abc" | [ 'apple', 'banana' ] |
Use case is during the cleaning stage of the data to make further processing and analysis simpler in later stages of the pipeline.
Thx for the help!
CodePudding user response:
One option is to FLATTEN
the json array, then construct the string array from the values.
For example
WITH data AS(
SELECT 1 id, 'abc' as some_attribute
, [{ 'attr': 'apple'}, { 'attr': 'banana' } ] as json_array
)
SELECT
id
, some_attribute
, ARRAY_AGG(value:attr::string) WITHIN GROUP( ORDER BY index) as string_array
FROM
data
, TABLE(FLATTEN(input => json_array))
GROUP BY
id
, some_attribute
which returns
ID|SOME_ATTRIBUTE|STRING_ARRAY |
-- -------------- ------------------
1|abc |["apple","banana"]|
CodePudding user response:
Another option is to create a JavaScript UDF. For example
CREATE OR REPLACE FUNCTION JSON_ARRAY_TO_ARRAY("a" ARRAY)
RETURNS ARRAY
LANGUAGE JAVASCRIPT
AS
$$
let r=[];
a.forEach(e => r.push(e.attr));
return r;
$$
then
WITH data AS(
SELECT 1 id, 'abc' as some_attribute, [{ 'attr': 'apple'}, { 'attr': 'banana' } ] as json_array
)
SELECT
id
, some_attribute
, JSON_ARRAY_TO_ARRAY(json_array) as string_array
FROM
data
again returns
ID|SOME_ATTRIBUTE|STRING_ARRAY |
-- -------------- ------------------
1|abc |["apple","banana"]|