I get an arrayed multi level JSON response with one level which is a date and another one which duplicats the data in a lower lever array from an API which looks like this:
{
"2021-11-04": {
"40-41 (25-27)": {
"sales": 26,
"balance": 480,
"size_name": "40-41",
"size_origin": "25-27"
},
"42-43 (27-29)": {
"sales": 63,
"balance": 817,
"size_name": "42-43",
"size_origin": "27-29"
}
},
"2021-11-05": {
"40-41 (25-27)": {
"sales": 35,
"balance": 445,
"size_name": "40-41",
"size_origin": "25-27"
},
"42-43 (27-29)": {
"sales": 95,
"balance": 725,
"size_name": "42-43",
"size_origin": "27-29"
}
}
}
But what i need is to make it not an array but a flatten objects to easily form a dataframe with pandas. How it can be done?
The needed result:
{
{
"day": "2021-11-04",
"sales": 26,
"balance": 480,
"size_name": "40-41",
"size_origin": "25-27"
},
{
"day": "2021-11-04",
"sales": 63,
"balance": 817,
"size_name": "42-43",
"size_origin": "27-29"
},
{
"day": "2021-11-05",
"sales": 35,
"balance": 445,
"size_name": "40-41",
"size_origin": "25-27"
},
{
"day": "2021-11-05",
"sales": 95,
"balance": 725,
"size_name": "42-43",
"size_origin": "27-29"
}
}
I'm ok with transformation of it inside pandas and not while it's in JSON format but I still don't get it how can such structure can be transformed.
CodePudding user response:
You can do it with a simple double loop
data = {
"2021-11-04": {
"40-41 (25-27)": {
"sales": 26,
"balance": 480,
"size_name": "40-41",
"size_origin": "25-27"
},
"42-43 (27-29)": {
"sales": 63,
"balance": 817,
"size_name": "42-43",
"size_origin": "27-29"
}
},
"2021-11-05": {
"40-41 (25-27)": {
"sales": 35,
"balance": 445,
"size_name": "40-41",
"size_origin": "25-27"
},
"42-43 (27-29)": {
"sales": 95,
"balance": 725,
"size_name": "42-43",
"size_origin": "27-29"
}
}
}
records = []
for date, date_dict in data.items():
for rec_id, rec in date_dict.items():
rec['day'] = date
records.append(rec)
Output
>>> records
[{'sales': 26,
'balance': 480,
'size_name': '40-41',
'size_origin': '25-27',
'day': '2021-11-04'},
{'sales': 63,
'balance': 817,
'size_name': '42-43',
'size_origin': '27-29',
'day': '2021-11-04'},
{'sales': 35,
'balance': 445,
'size_name': '40-41',
'size_origin': '25-27',
'day': '2021-11-05'},
{'sales': 95,
'balance': 725,
'size_name': '42-43',
'size_origin': '27-29',
'day': '2021-11-05'}]
>>> pd.DataFrame(records)
sales balance size_name size_origin day
0 26 480 40-41 25-27 2021-11-04
1 63 817 42-43 27-29 2021-11-04
2 35 445 40-41 25-27 2021-11-05
3 95 725 42-43 27-29 2021-11-05