Home > Software design >  Flatten JSON Object in Snowflake
Flatten JSON Object in Snowflake

Time:08-05

I have a question about using the flatten function in Snowflake. I'm having trouble with extracting data from following path data:performance: of the following JSON-object:

   {
"data": {
    "metadata": {
        "id": "001",
        "created_at": "2020-01-01"
    },
    "performance": {
        "2020-01-01": {
            "ad_performances": [{
                "ad": "XoGKkgcy7V3BDm6m",
                "ad_impressions": 1,
                "clicks": 0,
                "device": "-3",
                "total_net_amount": 0
            }, {
                "ad": "XoGKkgmFlHa3V5xj",
                "ad_impressions": 17,
                "clicks": 0,
                "device": "-4",
                "total_net_amount": 0
            }, {
                "ad": "XoGKkgmFlHa3V5xj",
                "ad_impressions": 5,
                "clicks": 0,
                "device": "-5",
                "total_net_amount": 0
            }, {
                "ad": "XoGKkgcy7V3BDm6m",
                "ad_impressions": 19,
                "clicks": 0,
                "device": "-2",
                "total_net_amount": 0
            }, {
                "ad": "XoGKkgcy7V3BDm6m",
                "ad_impressions": 5,
                "clicks": 0,
                "device": "-1",
                "total_net_amount": 0
            }]
        }
    }
}

Desired result is a table with the "date" (2020-01-01), "ad" and "impressions". I tried to achieve the desired result with:

select
   key::date                           as date
  ,f.value:performances:ad             as performances_array
  ,f.value:performances:impressions    as performances_array
from <table>, lateral flatten (input => CLMN:performances) f;

but I´m not able to extract data from the "performance-array". Can someone help me out?

Thank you!

CodePudding user response:

Can you try this one?

select  f.KEY date, 
l.VALUE:"ad"  as performances_array,
l.VALUE:"impressions"  as performances_array
from mydata, lateral flatten (input => CLMN:data.performance ) f,
lateral flatten (input => f.VALUE ) s,
lateral flatten (input => s.VALUE ) l
;

 ------------ -------------------- -------------------- 
|    DATE    | PERFORMANCES_ARRAY | PERFORMANCES_ARRAY |
 ------------ -------------------- -------------------- 
| 2020-01-01 | "XoGKkgcy7V3BDm6m" |                  1 |
| 2020-01-01 | "XoGKkgmFlHa3V5xj" |                 17 |
| 2020-01-01 | "XoGKkgmFlHa3V5xj" |                    |
| 2020-01-01 | "XoGKkgcy7V3BDm6m" |                 19 |
| 2020-01-01 | "XoGKkgcy7V3BDm6m" |                  5 |
 ------------ -------------------- -------------------- 

CodePudding user response:

Only 2 LATERAL FLATTENs are required to extract the rows

select
   a.key::date as ad_date,
   b.value:ad::varchar as ad,
   b.value:ad_impressions::int as impressions
from j
, lateral flatten(input => v:data:performance) a
, lateral flatten(input => a.value:ad_performances) b;
AD_DATE AD IMPRESSIONS
2020-01-01 XoGKkgcy7V3BDm6m 1
2020-01-01 XoGKkgmFlHa3V5xj 17
2020-01-01 XoGKkgmFlHa3V5xj 5
2020-01-01 XoGKkgcy7V3BDm6m 19
2020-01-01 XoGKkgcy7V3BDm6m 5

If you want to aggregate the data by ad date and ad,

with r as
(
  select
    a.key::date as ad_date,
    b.value:ad::varchar as ad,
    b.value:ad_impressions::int as impressions
  from j
  , lateral flatten(input => v:data:performance) a
  , lateral flatten(input => a.value:ad_performances) b
)
select ad_date, ad, sum(impressions) as impressions
from r
group by ad_date, ad;
AD_DATE AD IMPRESSIONS
2020-01-01 XoGKkgcy7V3BDm6m 25
2020-01-01 XoGKkgmFlHa3V5xj 22
  • Related