Is it possible to modify data in the ingestion mapping using Azure Data Explorer?


I'm trying to ingest data from a JSON file and generate new columns with transformed data.

I'm trying to do something like the following, but cannot find a way to make it work.

.create table example ingestion json mapping "example_mapping" 

    '['
        '{   "column":"timestamp", "Properties": {"Path":"$.timestamp"}},'
        '{   "column":"rawValue", "Properties":{"Path":"$.rawValue"}},'
        '{   "column":"Name", "Properties":{"Path":"$.Name"}},'
        '{   "column":"Name1", "Properties":{"ConstantValue":"$.split(Name, "_")[1]"}},'
        '{   "column":"Name2", "Properties":{"Path":"$[\' split(Name, "_")[1]\']"}},'
        '{   "column":"Name3", "Properties":{"Path":"$[\' split(Name, "_")[2]\']"}},'
        '{   "column":"Name4", "Properties":{"Path":"$[\' split(Name, "_")[3]\']"}},'
        '{   "column":"Name5", "Properties":{"Path":"$[\' split(Name, "_")[4]\']"}},'
        '{   "column":"Name6", "Properties":{"Path":"$[\' split(Name, "_")[5]\']"}},'
        '{   "column":"Name7", "Properties":{"Path":"$[\' split(Name, "_")[6]\']"}},'
        '{   "column":"Name8", "Properties":{"Path":"$[\' split(Name, "_")[7]\']"}},'
        '{   "column":"RelevantData", "Properties":{"Path":"$[\'RelevantData\']"}}'
    ']'

Name1, Name2, Name3, Name4, ... are the new columns, and I'm trying to do a split(Name, "_")[0] in order to perform the transformation in the ingestion mapping.

Does anybody know if this is possible? Advice would be welcome. Thanks.

CodePudding user response:

A good way to achieve this is to use a staging table and an update policy to propagate the data to the target table:

(1) Create a raw table, where data is initially ingested. This table stores the original json data:

.create table raw (d : dynamic)

(2) Create an ingestion mapping for the raw table:

.create table raw ingestion json mapping "ingest-raw" '[{"column":"d","Properties":{"Path":"$"}}]'

(3) Create the table with the transformed data:

.create table measurement (deviceId:string, enqueuedTime:datetime, temperature:real)

(4) Define the data transformation from the raw table to the measurement table:

.create-or-alter function
with (docstring = 'update policy for measurement', folder='PBI')
rawtransform() {
    raw
    | extend m = todynamic(d.measurement)
    | mv-expand m
    | extend deviceId = tostring(m.header.deviceId),
             enqueuedTime = todatetime(m.header.enqueuedTime),
             temperature = toreal(m.temperature)
    | project-away d, m
}

(5) Define an update policy on the measurement table, so that data newly ingested into the raw table is transformed and inserted into the measurement table:

.alter table measurement policy update @'[{"IsEnabled": true, "Source": "raw", "Query": "rawtransform()", "IsTransactional": false, "PropagateIngestionProperties": false}]'

For the staging table, you can define a shorter caching period if the data is not queried often.
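For example, a minimal sketch that keeps only one day of the staging table in hot cache (the one-day value is just an illustrative choice):

// Example value only - pick a period that matches how often the raw data is queried
.alter table raw policy caching hot = 1d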

CodePudding user response:

Short answer:

Not supported as part of data mappings

Detailed answer:

JSONPath expressions support $, ., ['property'] or ["property"], and [n].
Mapping transformations add some additional capabilities around source file info (path and line number), Unix time (conversion to datetime), and conversion of a JSON array of properties to a dictionary.
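For illustration, a mapping sketch that uses two of the built-in transformations; the table, the column names and the source field ($.ts) are hypothetical:

// Hypothetical table: eventTime is derived from a Unix-seconds field, sourceLine from the source line number
.create table transformDemo (eventTime:datetime, sourceLine:long)

.create table transformDemo ingestion json mapping "transformDemo_mapping"
'['
'    {"column":"eventTime", "Properties":{"Path":"$.ts", "Transform":"DateTimeFromUnixSeconds"}},'
'    {"column":"sourceLine", "Properties":{"Transform":"SourceLineNumber"}}'
']'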

Solution:

Use Update Policy

P.S.

Please note that array indexes start at 0.
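For example, the following standalone query returns aaa, the first element of the split:

print split("aaa_bbb_ccc", "_")[0]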

Demo:
.create table exampleTmp (timestamp:datetime, rawValue:string, Name:string)

// This will keep the temporary table empty
.alter-merge table exampleTmp policy retention softdelete = 0s

.create table example (timestamp:datetime, rawValue:string, Name:string, Name1:string, Name2:string, Name3:string, Name4:string, Name5:string, Name6:string, Name7:string, Name8:string)

.create-or-alter function transform_exampleTmp_to_example()
{
    exampleTmp
    | extend Name_array = split(Name, "_")
    | extend Name1 = tostring(Name_array[0]), Name2 = tostring(Name_array[1]), Name3 = tostring(Name_array[2]), Name4 = tostring(Name_array[3])
    | extend Name5 = tostring(Name_array[4]), Name6 = tostring(Name_array[5]), Name7 = tostring(Name_array[6]), Name8 = tostring(Name_array[7])
    | project-away Name_array
}

.alter-merge table example policy update @'[{"IsEnabled": true, "Source": "exampleTmp", "Query": "transform_exampleTmp_to_example()", "IsTransactional": true, "PropagateIngestionProperties": true}]'

.ingest inline into table exampleTmp with (format="JSON") <|
{"timestamp":"2022-07-13 13:33:33","rawValue":"Hello","Name":"aaa_bbb_ccc_ddd_eee_fff_ggg_hhh"}
{"timestamp":"2022-07-14 14:44:44","rawValue":"world","Name":"aa_bb_cc_dd_ee_ff_gg_hh"}

.ingest inline into table exampleTmp with (format="JSON") <|
{"timestamp":"2022-07-15 15:55:55","rawValue":"!","Name":"a_b_c_d_e_f_g_h"}

exampleTmp
timestamp | rawValue | Name
(no rows: the temporary table stays empty because of the 0s soft-delete retention policy)

example
timestamp            | rawValue | Name                            | Name1 | Name2 | Name3 | Name4 | Name5 | Name6 | Name7 | Name8
2022-07-13T13:33:33Z | Hello    | aaa_bbb_ccc_ddd_eee_fff_ggg_hhh | aaa   | bbb   | ccc   | ddd   | eee   | fff   | ggg   | hhh
2022-07-14T14:44:44Z | world    | aa_bb_cc_dd_ee_ff_gg_hh         | aa    | bb    | cc    | dd    | ee    | ff    | gg    | hh
2022-07-15T15:55:55Z | !        | a_b_c_d_e_f_g_h                 | a     | b     | c     | d     | e     | f     | g     | h