I have a question about using the flatten function in Snowflake. I'm having trouble with extracting data from following path data:performance: of the following JSON-object:
{
"data": {
"metadata": {
"id": "001",
"created_at": "2020-01-01"
},
"performance": {
"2020-01-01": {
"ad_performances": [{
"ad": "XoGKkgcy7V3BDm6m",
"ad_impressions": 1,
"clicks": 0,
"device": "-3",
"total_net_amount": 0
}, {
"ad": "XoGKkgmFlHa3V5xj",
"ad_impressions": 17,
"clicks": 0,
"device": "-4",
"total_net_amount": 0
}, {
"ad": "XoGKkgmFlHa3V5xj",
"ad_impressions": 5,
"clicks": 0,
"device": "-5",
"total_net_amount": 0
}, {
"ad": "XoGKkgcy7V3BDm6m",
"ad_impressions": 19,
"clicks": 0,
"device": "-2",
"total_net_amount": 0
}, {
"ad": "XoGKkgcy7V3BDm6m",
"ad_impressions": 5,
"clicks": 0,
"device": "-1",
"total_net_amount": 0
}]
}
}
}
Desired result is a table with the "date" (2020-01-01), "ad" and "impressions". I tried to achieve the desired result with:
select
key::date as date
,f.value:performances:ad as performances_array
,f.value:performances:impressions as performances_array
from <table>, lateral flatten (input => CLMN:performances) f;
but I´m not able to extract data from the "performance-array". Can someone help me out?
Thank you!
CodePudding user response:
Can you try this one?
select f.KEY date,
l.VALUE:"ad" as performances_array,
l.VALUE:"impressions" as performances_array
from mydata, lateral flatten (input => CLMN:data.performance ) f,
lateral flatten (input => f.VALUE ) s,
lateral flatten (input => s.VALUE ) l
;
------------ -------------------- --------------------
| DATE | PERFORMANCES_ARRAY | PERFORMANCES_ARRAY |
------------ -------------------- --------------------
| 2020-01-01 | "XoGKkgcy7V3BDm6m" | 1 |
| 2020-01-01 | "XoGKkgmFlHa3V5xj" | 17 |
| 2020-01-01 | "XoGKkgmFlHa3V5xj" | |
| 2020-01-01 | "XoGKkgcy7V3BDm6m" | 19 |
| 2020-01-01 | "XoGKkgcy7V3BDm6m" | 5 |
------------ -------------------- --------------------
CodePudding user response:
Only 2 LATERAL FLATTENs are required to extract the rows
select
a.key::date as ad_date,
b.value:ad::varchar as ad,
b.value:ad_impressions::int as impressions
from j
, lateral flatten(input => v:data:performance) a
, lateral flatten(input => a.value:ad_performances) b;
AD_DATE | AD | IMPRESSIONS |
---|---|---|
2020-01-01 | XoGKkgcy7V3BDm6m | 1 |
2020-01-01 | XoGKkgmFlHa3V5xj | 17 |
2020-01-01 | XoGKkgmFlHa3V5xj | 5 |
2020-01-01 | XoGKkgcy7V3BDm6m | 19 |
2020-01-01 | XoGKkgcy7V3BDm6m | 5 |
If you want to aggregate the data by ad date and ad,
with r as
(
select
a.key::date as ad_date,
b.value:ad::varchar as ad,
b.value:ad_impressions::int as impressions
from j
, lateral flatten(input => v:data:performance) a
, lateral flatten(input => a.value:ad_performances) b
)
select ad_date, ad, sum(impressions) as impressions
from r
group by ad_date, ad;
AD_DATE | AD | IMPRESSIONS |
---|---|---|
2020-01-01 | XoGKkgcy7V3BDm6m | 25 |
2020-01-01 | XoGKkgmFlHa3V5xj | 22 |