I have standard sensor data coming into Snowflake. Part of it is an array, and I am trying to get this array into columns. To do so, I first got the array into a separate VARIANT column, which looks like the one below. Does anyone know how to break this VARIANT column up into individual columns according to the key-value pairs?
Table name: Sensordata
Column name: rx_metadata
Data type: variant
[
{
"channel_index": 7,
"channel_rssi": -87,
"frequency_offset": "-6212",
"gateway_ids": {
"eui": "A84041FFxxx",
"gateway_id": "xxx"
},
"rssi": -87,
"snr": 7,
"timestamp": 1825185681,
"uplink_token": "Ch4KHAoQc3RyYXRpZnktdGVzdC12MRIxxx"
}
]
CodePudding user response:
There are a few steps. Your outer value is an array ([ ]), so if it holds a known number of entries (for example, exactly one) you can access them directly by index:
select parse_json('[1]') as a
,a[0] as inside;
A | INSIDE |
---|---|
[ 1 ] | 1 |
Or if you have an unspecified count of objects, you can use FLATTEN to unroll the values into rows:
select f.value::number as val
from table (flatten(input=>parse_json('[1,2,3]')))f
VAL |
---|
1 |
2 |
3 |
After that you have an object, which you can again access directly if its keys are known:
select f.value:a::number as val
from table (flatten(input=>parse_json('[{"a":1},{"a":2},{"a":3}]')))f
VAL |
---|
1 |
2 |
3 |
Or if you have an arbitrary number of properties per object you can flatten those as well:
select o.key, o.value
from table (flatten(input=>parse_json('[{"a":1},{"a":2},{"a":3}]')))f
,table (flatten(input=>f.value)) o
KEY | VALUE |
---|---|
a | 1 |
a | 2 |
a | 3 |
Applying that last approach to your data:
select o.key, o.value
from Sensordata, table (flatten(input=>rx_metadata))f
,table (flatten(input=>f.value)) o
gives:
KEY | VALUE |
---|---|
channel_index | 7 |
channel_rssi | -87 |
frequency_offset | "-6212" |
gateway_ids | { "eui": "A84041FFxxx", "gateway_id": "xxx" } |
rssi | -87 |
snr | 7 |
timestamp | 1825185681 |
uplink_token | "Ch4KHAoQc3RyYXRpZnktdGVzdC12MRIxxx" |
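Note that gateway_ids is still a nested object in the output above. As a minimal sketch, assuming you also want its inner keys as rows, FLATTEN accepts a recursive => true argument and its PATH column then holds the full path to each nested key. The IS_OBJECT filter is my own addition here, used to drop the row for the gateway_ids object itself:

```sql
-- Sketch only: unroll nested objects as well as top-level keys.
select o.path, o.value
from Sensordata,
     table(flatten(input => rx_metadata)) f,                  -- one row per array element
     table(flatten(input => f.value, recursive => true)) o    -- one row per key, at any depth
where not is_object(o.value);                                 -- keep leaf values, skip the container row
```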
You could instead unpack each object from the array into columns by naming its keys:
select
f.value:channel_index,
f.value:channel_rssi,
f.value:frequency_offset,
f.value:gateway_ids:eui,
f.value:gateway_ids:gateway_id,
f.value:rssi,
f.value:snr,
f.value:timestamp,
f.value:uplink_token
from Sensordata, table (flatten(input=>rx_metadata))f
F.VALUE:CHANNEL_INDEX | F.VALUE:CHANNEL_RSSI | F.VALUE:FREQUENCY_OFFSET | F.VALUE:GATEWAY_IDS:EUI | F.VALUE:GATEWAY_IDS:GATEWAY_ID | F.VALUE:RSSI | F.VALUE:SNR | F.VALUE:TIMESTAMP | F.VALUE:UPLINK_TOKEN |
---|---|---|---|---|---|---|---|---|
7 | -87 | "-6212" | "A84041FFxxx" | "xxx" | -87 | 7 | 1825185681 | "Ch4KHAoQc3RyYXRpZnktdGVzdC12MRIxxx" |
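If the array can hold more than one entry, the query above returns one row per entry. A small sketch, assuming you want to tell those rows apart: FLATTEN also exposes an INDEX column with the zero-based position of each element in the array. The alias names are only illustrative:

```sql
-- Sketch only: keep the array position next to the unpacked values.
select
    f.index               as array_position,   -- 0 for the first entry
    f.value:channel_index as channel_index,
    f.value:rssi          as rssi
from Sensordata, table(flatten(input => rx_metadata)) f;
```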
Alternatively, if the array always holds a single element, you can access it directly by index:
select
rx_metadata[0]:channel_index,
rx_metadata[0]:channel_rssi,
rx_metadata[0]:frequency_offset,
rx_metadata[0]:gateway_ids:eui,
rx_metadata[0]:gateway_ids:gateway_id,
rx_metadata[0]:rssi,
rx_metadata[0]:snr,
rx_metadata[0]:timestamp,
rx_metadata[0]:uplink_token
from Sensordata
This gives the same results. Which approach fits best depends on details the question does not mention, such as whether the array can hold more than one entry.
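The columns in the results above are still VARIANT values, which is why the strings are shown in double quotes and the headers are the raw path expressions. A hedged sketch of the same query with casts and aliases follows. The target types are guesses from the single sample row, and rx_timestamp is a name I picked for illustration:

```sql
-- Sketch only: cast each VARIANT value and give it a plain column name.
select
    f.value:channel_index::number         as channel_index,
    f.value:channel_rssi::number          as channel_rssi,
    f.value:frequency_offset::string      as frequency_offset,  -- stored as a string in the sample
    f.value:gateway_ids.eui::string       as gateway_eui,
    f.value:gateway_ids.gateway_id::string as gateway_id,
    f.value:rssi::number                  as rssi,
    f.value:snr::float                    as snr,               -- assumed to allow fractions
    f.value:timestamp::number             as rx_timestamp,      -- unit unknown, left as a number
    f.value:uplink_token::string          as uplink_token
from Sensordata, table(flatten(input => rx_metadata)) f;
```

Casting to string removes the surrounding quotes, and the aliases make the result easier to use in a view or a downstream table.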
CodePudding user response:
You need to flatten the structure and then retrieve the values from it.
The CTE below stands in for your data; replace it with your own table and column:
with data_cte as (
select parse_json('[ { "channel_index": 7,
"channel_rssi": -87,
"frequency_offset": "-6212",
"gateway_ids": { "eui": "A84041FFxxx", "gateway_id": "xxx" },
"rssi": -87,
"snr": 7,
"timestamp": 1825185681,
"uplink_token": "Ch4KHAoQc3RyYXRpZnktdGVzdC12MRIxxx" } ]') as val
)
// Add column list as needed
select value:"channel_index" as channel_index,
value:"channel_rssi" as channel_rssi,
value:"frequency_offset" as frequency_offset
from data_cte, lateral flatten(input=>val);
+---------------+--------------+------------------+
| CHANNEL_INDEX | CHANNEL_RSSI | FREQUENCY_OFFSET |
|---------------+--------------+------------------|
| 7             | -87          | "-6212"          |
+---------------+--------------+------------------+