I'm trying to ingest data from a JSON file and generate new columns with transformed data.
I'm trying to do something like the following, but I can't find a way to make it work.
.create table example ingestion json mapping "example_mapping"
'['
'{ "column":"timestamp", "Properties": {"Path":"$.timestamp"}},'
'{ "column":"rawValue", "Properties":{"Path":"$.rawValue"}},'
'{ "column":"Name", "Properties":{"Path":"$.Name"}},'
'{ "column":"Name1", "Properties":{"ConstantValue":"$.split(Name, "_")[1]"}},'
'{ "column":"Name2", "Properties":{"Path":"$[\' split(Name, "_")[1]\']"}},'
'{ "column":"Name3", "Properties":{"Path":"$[\' split(Name, "_")[2]\']"}},'
'{ "column":"Name4", "Properties":{"Path":"$[\' split(Name, "_")[3]\']"}},'
'{ "column":"Name5", "Properties":{"Path":"$[\' split(Name, "_")[4]\']"}},'
'{ "column":"Name6", "Properties":{"Path":"$[\' split(Name, "_")[5]\']"}},'
'{ "column":"Name7", "Properties":{"Path":"$[\' split(Name, "_")[6]\']"}},'
'{ "column":"Name8", "Properties":{"Path":"$[\' split(Name, "_")[7]\']"}},'
'{ "column":"RelevantData", "Properties":{"Path":"$[\'RelevantData\']"}}'
']'
Name1, Name2, Name3, Name4... are the new columns, and I'm trying to do a split(Name, "_")[0]
in order to perform the transformation in the ingestion mapping.
Does anybody know if this is possible? Advice would be welcome. Thanks.
CodePudding user response:
A good practice to achieve this is to use a staging table and an update policy to propagate the data to the target table:
(1) Create a raw table where data is initially ingested. This table stores the original JSON data:
.create table raw (d : dynamic)
(2) Create an ingestion mapping for the raw table:
.create table raw ingestion json mapping "ingest-raw" '[{"column":"d","Properties":{"path":"$"}}]'
(3) Create the table with the transformed data:
.create table measurement (deviceId:string, enqueuedTime:datetime, temperature:real)
(4) Define the data transformation from the raw table to the measurement table:
.create-or-alter function
with (docstring = 'update policy for measurement', folder='PBI')
rawtransform() {
    raw
    | extend m = todynamic(d.measurement)
    | mv-expand m
    | extend deviceId = tostring(m.header.deviceId),
             enqueuedTime = todatetime(m.header.enqueuedTime),
             temperature = toreal(m.temperature)
    | project-away d, m
}
(5) Define an update policy on the measurement table, so that new data ingested into the raw table is transformed and inserted into the measurement table:
.alter table measurement policy update @'[{"IsEnabled": true, "Source": "raw", "Query": "rawtransform()", "IsTransactional": false, "PropagateIngestionProperties": false}]'
For the staging table, you can define a shorter caching period if the data is not queried often.
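As a minimal sketch of that (assuming the raw table above, and that one day of hot cache is sufficient for your query patterns):

```kusto
// Keep only the last day of raw data in hot (SSD) cache;
// older extents remain queryable from cold storage.
.alter table raw policy caching hot = 1d
```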
CodePudding user response:
Short answer:
Not supported as part of data mappings
Detailed answer:
JSONPath expressions support $, ., ['property'] or ["property"], and [n].
Mapping transformations add some additional capabilities around source file info (path and line number), Unix time (transformation to datetime), and transformation of a JSON array of properties to a dictionary.
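To illustrate what stays within the supported path syntax (the mapping name and the FirstItem column here are hypothetical, just for the sketch):

```kusto
.create table example ingestion json mapping "supported_paths_mapping"
'['
'  {"column":"timestamp",    "Properties":{"Path":"$.timestamp"}},'      // dot notation
'  {"column":"RelevantData", "Properties":{"Path":"$[\'RelevantData\']"}},'  // bracket notation
'  {"column":"FirstItem",    "Properties":{"Path":"$.items[0]"}}'        // array index
']'
```

Anything beyond these forms, such as calling split() inside a path, is not supported; the transformation has to happen after ingestion.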
Solution:
Use an update policy.
P.S.
Please note that array indexes start at 0.
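For instance, a quick sanity check of the zero-based indexing (this print statement is purely illustrative):

```kusto
// split() returns a dynamic array; index [0] is the first element
print first = tostring(split("aaa_bbb_ccc", "_")[0]),   // "aaa"
      second = tostring(split("aaa_bbb_ccc", "_")[1])   // "bbb"
```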
Demo:
.create table exampleTmp (timestamp:datetime, rawValue:string, Name:string)
// This will keep the temporary table empty
.alter-merge table exampleTmp policy retention softdelete = 0s
.create table example (timestamp:datetime, rawValue:string, Name:string, Name1:string, Name2:string, Name3:string, Name4:string, Name5:string, Name6:string, Name7:string, Name8:string)
.create-or-alter function transform_exampleTmp_to_example()
{
exampleTmp
| extend Name_array = split(Name, "_")
| extend Name1 = tostring(Name_array[0]), Name2 = tostring(Name_array[1]), Name3 = tostring(Name_array[2]), Name4 = tostring(Name_array[3])
| extend Name5 = tostring(Name_array[4]), Name6 = tostring(Name_array[5]), Name7 = tostring(Name_array[6]), Name8 = tostring(Name_array[7])
| project-away Name_array
}
.alter-merge table example policy update @'[{"IsEnabled": true, "Source": "exampleTmp", "Query": "transform_exampleTmp_to_example()", "IsTransactional": true, "PropagateIngestionProperties": true}]'
.ingest inline into table exampleTmp with (format="JSON") <|
{"timestamp":"2022-07-13 13:33:33","rawValue":"Hello","Name":"aaa_bbb_ccc_ddd_eee_fff_ggg_hhh"}
{"timestamp":"2022-07-14 14:44:44","rawValue":"world","Name":"aa_bb_cc_dd_ee_ff_gg_hh"}
.ingest inline into table exampleTmp with (format="JSON") <|
{"timestamp":"2022-07-15 15:55:55","rawValue":"!","Name":"a_b_c_d_e_f_g_h"}
exampleTmp
timestamp | rawValue | Name |
---|---|---|
(no rows: the 0s soft-delete retention policy keeps exampleTmp empty)
example
timestamp | rawValue | Name | Name1 | Name2 | Name3 | Name4 | Name5 | Name6 | Name7 | Name8 |
---|---|---|---|---|---|---|---|---|---|---|
2022-07-13T13:33:33Z | Hello | aaa_bbb_ccc_ddd_eee_fff_ggg_hhh | aaa | bbb | ccc | ddd | eee | fff | ggg | hhh |
2022-07-14T14:44:44Z | world | aa_bb_cc_dd_ee_ff_gg_hh | aa | bb | cc | dd | ee | ff | gg | hh |
2022-07-15T15:55:55Z | ! | a_b_c_d_e_f_g_h | a | b | c | d | e | f | g | h |