I'm trying to create a dataframe from a json array, but I'm stuck with getting it unnested nicely.
So the Json Array is like this:
{'availability':
{'2022-07-15': [{'id': '1234', 'quantity': 14}, {'id': '5678', 'quantity': 18}, {'id': '2345', 'quantity': 20}],
'2022-07-17': [{'id': '1234', 'quantity': 2}, {'id': '5678', 'quantity': 74}, {'id': '2345', 'quantity': 64}]}
}
What I'm trying to do is to create a data frame that looks like this:
id quantity
2022-07-15 1234 14
2022-07-15 5678 18
2022-07-15 2345 20
2022-07-17 1234 2
2022-07-17 5678 74
2022-07-17 2345 64
But I seem to get stuck at the 'availability' part. I tried to use
df = pd.json_normalize(data, record_path=['id']. meta=['quantity']
But it meantions that the key 'id' could not be found. It seems like 'availability' is one key, but the date doesn't seem to be. Could someone guide me to the right direction?
CodePudding user response:
If dct
is the dictionary from the question, you can do:
df = pd.DataFrame(
({"index": k, **d} for k, l in dct["availability"].items() for d in l),
).set_index("index")
df.index.name = None
print(df)
Prints:
id quantity
2022-07-15 1234 14
2022-07-15 5678 18
2022-07-15 2345 20
2022-07-17 1234 2
2022-07-17 5678 74
2022-07-17 2345 64