Home > OS >  how do you parse nested json data in python
how do you parse nested json data in python

Time:03-31

I have this json file.

print(resp1)

{'totalCount': 28, 'nextPageKey': None, 'resolution': '1h', 'result': [{'metricId': 'builtin:tech.generic.cpu.usage', 'data': [{'dimensions': ['PROCESS_GROUP_INSTANCE-116113F2AF6D56C1'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-116113F2AF6D56C1'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.0004967037467478299, 0.0004993697007497151]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-0A36E7BF52579A79'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-0A36E7BF52579A79'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0, 0]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-CD48353ADB461909'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-CD48353ADB461909'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.003568672520272872, 0]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-774E5B0604A8C40F'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-774E5B0604A8C40F'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0, 0]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-BE334E4D9AECB501'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-BE334E4D9AECB501'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.011100043237736794, 0.011380684894064198]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-55B575962189FEB6'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-55B575962189FEB6'}, 'timestamps': [1648551600000, 1648555200000], 'values': [3.0798127390993733, 3.1330792855525362]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-B2C591B795A0B0F5'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-B2C591B795A0B0F5'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0, 0]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-AAFD30937E4A38C1'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-AAFD30937E4A38C1'}, 'timestamps': [1648551600000, 1648555200000], 'values': [10.867152211237094, 10.638132642663043]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-5DBFEAB468303B43'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-5DBFEAB468303B43'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.0358835180952134, 0.03208424664925838]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-848D74392ED576A8'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-848D74392ED576A8'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.05666273674078747, 0.059252078291298686]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-4A2C765539023BD4'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-4A2C765539023BD4'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0, 0]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-D8504F856DB86A46'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-D8504F856DB86A46'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.0003621622612342722, 0.0002603445580040199]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-8384B133274306C5'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-8384B133274306C5'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.01948305709523789, 0.019738118544868802]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-D80C507265E9D7E7'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-D80C507265E9D7E7'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.003096827011896094, 0.0032639247783716172]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-A461758B5AF66329'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-A461758B5AF66329'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.013467390628690916, 0.013690815801205842]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-CE4FB9124B62DCD5'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-CE4FB9124B62DCD5'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.002501596040078672, 0.0026150033093880914]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-D85D67B8E9ECFB3D'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-D85D67B8E9ECFB3D'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.4741804634926945, 0.5191831284675046]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-DE4D85B9C6ED7E5A'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-DE4D85B9C6ED7E5A'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0, 0]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-DD0D9B601CF7262C'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-DD0D9B601CF7262C'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.7368746957244423, 0.6938354934471241]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-FEC12A7E6F89C7A4'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-FEC12A7E6F89C7A4'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0, 0]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-8DECB2924DBD6E00'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-8DECB2924DBD6E00'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0, 0]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-851D35A46B2C9396'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-851D35A46B2C9396'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0, 0]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-7C7997464BBE45D1'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-7C7997464BBE45D1'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.004691369413977879, 0.004811039869336115]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-0553CAB8D5D15C69'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-0553CAB8D5D15C69'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.2077087132276687, 0.21150402124377266]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-C49E4537A35C7539'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-C49E4537A35C7539'}, 'timestamps': [1648551600000, 1648555200000], 'values': [3.11510258227323, 3.1062192170516303]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-07B50EA4014CB57F'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-07B50EA4014CB57F'}, 'timestamps': [1648551600000, 1648555200000], 'values': [5.754310990505163, 5.825864045516305]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-3F3347242CB55D74'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-3F3347242CB55D74'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0.021656050442945992, 0.02297223132589589]}, {'dimensions': ['PROCESS_GROUP_INSTANCE-81339F5C0193A0A8'], 'dimensionMap': {'dt.entity.process_group_instance': 'PROCESS_GROUP_INSTANCE-81339F5C0193A0A8'}, 'timestamps': [1648551600000, 1648555200000], 'values': [0, 0]}]}]}

I need to form a data frame from this json file that includes dt.entity.process_group_instance, timestamps, values

for example the data frame needs to be like this:

    dt.entity.process_group_instance    timestamps          values
PROCESS_GROUP_INSTANCE-116113F2AF6D56C1 1648551600000   0.0004967037467478290
PROCESS_GROUP_INSTANCE-116113F2AF6D56C1 1648555200000   0.0004993697007497150
PROCESS_GROUP_INSTANCE-0A36E7BF52579A79 1648551600000   0.0000000000000000000
PROCESS_GROUP_INSTANCE-0A36E7BF52579A79 1648555200000   0.0000000000000000000
PROCESS_GROUP_INSTANCE-CD48353ADB461909 1648551600000   0.0035686725202728700
PROCESS_GROUP_INSTANCE-CD48353ADB461909 1648555200000   0.0000000000000000000

I have tried this:

df4= (pd.json_normalize(resp1["result"],record_path="data", meta="metricId").drop("dimensions", axis=1))

print(df4)

timestamps  \
0   [1648551600000, 1648555200000]   

                                             values  \
0        [0.04906564288669162, 0.04865638944837782]   

   dimensionMap.dt.entity.process_group_instance  \
0        PROCESS_GROUP_INSTANCE-5D02F0A2CC17C499  

I need this to be this:

dt.entity.process_group_instance            timestamps      values
PROCESS_GROUP_INSTANCE-5D02F0A2CC17C499     1648551600000   0.0490656428866916000
PROCESS_GROUP_INSTANCE-5D02F0A2CC17C499     1648555200000   0.0486563894483778000

any ideas?

CodePudding user response:

You could explode it:

df4= pd.json_normalize(resp1["result"], record_path="data").drop(["dimensions"], axis=1).explode(['timestamps', 'values'])

Output:

       timestamps     values dimensionMap.dt.entity.process_group_instance
0   1648551600000   0.000497       PROCESS_GROUP_INSTANCE-116113F2AF6D56C1
0   1648555200000   0.000499       PROCESS_GROUP_INSTANCE-116113F2AF6D56C1
1   1648551600000          0       PROCESS_GROUP_INSTANCE-0A36E7BF52579A79
1   1648555200000          0       PROCESS_GROUP_INSTANCE-0A36E7BF52579A79
2   1648551600000   0.003569       PROCESS_GROUP_INSTANCE-CD48353ADB461909
2   1648555200000          0       PROCESS_GROUP_INSTANCE-CD48353ADB461909
...
26  1648551600000   0.021656       PROCESS_GROUP_INSTANCE-3F3347242CB55D74
26  1648555200000   0.022972       PROCESS_GROUP_INSTANCE-3F3347242CB55D74
27  1648551600000          0       PROCESS_GROUP_INSTANCE-81339F5C0193A0A8
27  1648555200000          0       PROCESS_GROUP_INSTANCE-81339F5C0193A0A8
  • Related