I would like to parse a complex json file in Azure Data Factory. The structure is the below which means that there are nested objects and arrays. From my understanding ADF can parse arrays but what should we do in order to parse more complex files?
The structure of the file is the below
{
"productA": {
"subcategory 1" : [
{
"name":"x",
"latest buy": "22-12-21"
"total buys": 4
"other comments": "xyzzy"
"history data": [
{
"name":"x",
"latest buy": "22-12-21"
"total buys": 4
"other comments": {"John":"Very nice","Nick":"Not nice"}
}
]
}
}
}
CodePudding user response:
There seems to be some error in you JSON structure you posted. This is not a valid JSON structure. It is missing the commas (,) and braces.
When you have a valid JSON structure, you can use flatten transformation in Data flow to flatten the JSON.
Source:
{
"productA": {
"subcategory 1" : [
{
"name":"x",
"latest buy": "22-12-21",
"total buys": 4,
"other comments": "xyzzy",
"history data": [
{
"name":"x",
"latest buy": "22-12-21",
"total buys": 4,
"other comments": {"John":"Very nice","Nick":"Not nice"}
}
]
}
]
}
}
ADF Data flow:
Source transformation:
Connect the JSON dataset to source transformation and in Source Options, under JSON settings, select a single document.
Source preview:
Flatten transformation:
Here select the array level which you want to unroll in Unroll by and Unroll root and add mappings.
Preview of flatten:
Refer to this parse & flatten documents for more details on parsing the JSON documents in ADF.