I have some Redshift SQL that I'm trying to convert to Snowflake SQL to extract values from a JSON field. The issue I'm running into is specifying the required array index.
Because I run A/B/n tests, there can be multiple indexes, up to 'n'.
So I had this piece of SQL working for Redshift:
SELECT JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (e.splits,n.n),'split_type') types
, JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (e.splits,n.n),'weight') as weight
FROM experiments e, (SELECT (p0.n + p1.n*2 + p2.n * POWER(2,2) + p3.n * POWER(2,3) + p4.n * POWER(2,4) + p5.n * POWER(2,5)
 + p6.n * POWER(2,6) + p7.n * POWER(2,7) + p8.n * POWER(2,8) + p9.n * POWER(2,9))::int as n
FROM
(SELECT 0 as n UNION SELECT 1) p0,
(SELECT 0 as n UNION SELECT 1) p1,
(SELECT 0 as n UNION SELECT 1) p2,
(SELECT 0 as n UNION SELECT 1) p3,
(SELECT 0 as n UNION SELECT 1) p4,
(SELECT 0 as n UNION SELECT 1) p5,
(SELECT 0 as n UNION SELECT 1) p6,
(SELECT 0 as n UNION SELECT 1) p7,
(SELECT 0 as n UNION SELECT 1) p8,
(SELECT 0 as n UNION SELECT 1) p9
Order by 1
) n
WHERE types <> ''
AND weight <> ''
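(As an aside, the ten-way cross join above just builds the integers 0-1023 from binary digits. If a number generator like that is still needed in Snowflake, a minimal sketch using the GENERATOR table function, assuming an upper bound of 1024 rows, would be:)
-- Sketch: Snowflake equivalent of the 0-1023 number generator.
-- SEQ4 may in principle contain gaps; for strictly gap-free numbering
-- use ROW_NUMBER() OVER (ORDER BY SEQ4()) - 1 instead.
SELECT SEQ4() AS n
FROM TABLE(GENERATOR(ROWCOUNT => 1024));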
From reading Snowflake's documentation, it seemed like the following should work:
SELECT parse_json(parse_json(e.splits)[n.n]):split_type as types,
parse_json(parse_json(e.splits)[n.n]):weight as weight
FROM experiments e, (SELECT (p0.n ...
However I get the error "SQL compilation error: error line 1 at position 39 invalid identifier 'N.N'"
I'm wondering if someone would be able to help with this issue?
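(One hedged sketch, in case a computed index really is needed: Snowflake's GET() function can index into a VARIANT array, so something along these lines might sidestep the bracket syntax. This is an untested assumption about the failing query, and the LATERAL FLATTEN approach in the answers below avoids the index generator entirely.)
-- Sketch only: GET(array, index) with a computed index.
SELECT GET(PARSE_JSON(e.splits), n.n):split_type::string AS types
      ,GET(PARSE_JSON(e.splits), n.n):weight AS weight
FROM experiments e
    ,(SELECT 0 AS n UNION SELECT 1) n  -- stand-in for the full number generator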
EDIT:
experiments table looks like:
exp_ID | splits
1 | [{"id":203,"weight":50,"split_type":"a"},{"id":204,"weight":50,"split_type":"control"}]
2 | [{"id":205,"weight":33.33,"split_type":"a"},{"id":206,"weight":33.33,"split_type":"b"},{"id":207,"weight":33.33,"split_type":"c"}]
3 | [{"id":208,"weight":25,"split_type":"a"},{"id":209,"weight":25,"split_type":"b"},{"id":210,"weight":25,"split_type":"c"},{"id":211,"weight":25,"split_type":"d"}]
required output:
exp_ID | ID | types | weight
1 | 203 | a | 50
1 | 204 | control | 50
2 | 205 | a | 33.33
2 | 206 | b | 33.33
2 | 207 | c | 33.33
3 | 208 | a | 25
3 | 209 | b | 25
3 | 210 | c | 25
3 | 211 | d | 25
CodePudding user response:
With a table defined as
create temp table EXPERIMENTS(EXP_ID int, SPLITS variant);
You can insert rows like this (this is just for testing; don't use single-row inserts in production pipelines). Note the INSERT ... SELECT form: PARSE_JSON can't be used directly in a VALUES clause:
insert into experiments select 1, parse_json('[{"id":203,"weight":50,"split_type":"a"},{"id":204,"weight":50,"split_type":"control"}]');
insert into experiments select 2, parse_json('[{"id":205,"weight":33.33,"split_type":"a"},{"id":206,"weight":33.33,"split_type":"b"},{"id":207,"weight":33.33,"split_type":"c"}]');
insert into experiments select 3, parse_json('[{"id":208,"weight":25,"split_type":"a"},{"id":209,"weight":25,"split_type":"b"},{"id":210,"weight":25,"split_type":"c"},{"id":211,"weight":25,"split_type":"d"}]');
With the data stored in a VARIANT column that way, you can query the JSON fields as columns like this:
select EXP_ID
      ,VALUE:id::int as ID               -- VALUE is one object from the SPLITS array
      ,VALUE:split_type::string as TYPES
      ,VALUE:weight as WEIGHT
from experiments
    ,lateral flatten(input => splits)    -- one output row per array element
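If the element's position within the array is also needed, FLATTEN exposes an INDEX column; a sketch against the same table:
select EXP_ID
      ,INDEX as SPLIT_INDEX              -- position within the SPLITS array
      ,VALUE:id::int as ID
      ,VALUE:split_type::string as TYPES
      ,VALUE:weight as WEIGHT
from experiments
    ,lateral flatten(input => splits)
order by EXP_ID, SPLIT_INDEX;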
CodePudding user response:
The article below demonstrates a variety of ways to use LATERAL FLATTEN to extract information from a JSON document, including examples that combine it with the GET_PATH, UNPIVOT, and SEQ functions.
https://community.snowflake.com/s/article/Dynamically-extracting-JSON-using-LATERAL-FLATTEN
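For example, GET_PATH is an alternative to the ':' path operator used above; a minimal sketch against the same experiments table:
-- GET_PATH(VALUE, 'split_type') is equivalent to VALUE:split_type.
select EXP_ID
      ,GET_PATH(VALUE, 'split_type')::string as TYPES
      ,GET_PATH(VALUE, 'weight') as WEIGHT
from experiments
    ,lateral flatten(input => splits);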