Home > Enterprise >  Deep Nested JSON Normalization in Python pandas
Deep Nested JSON Normalization in Python pandas

Time:10-28

I want to convert deep nested JSON file into CSV. JSON file is

{ "abcd-performance:output": { "performance": [ { "sample": { "objectId": "abcd-device:100:12", "ifName": "Carrier1/1", "timeStamp": "2021-10-20T03:15:00.000Z", "type": "radio", "invalid": false, "attribute": [ { "parameter": "rslReadingMean", "value": -36.4 }, { "parameter": "qam-32rx_time", "value": 0 } ] } }, { "sample": { "objectId": "abcd-device:100:12", "ifName": "Carrier1/1", "timeStamp": "2021-10-20T03:30:00.000Z", "type": "radio", "invalid": false, "attribute": [ { "parameter": "rslReadingMean", "value": -36.5 }, { "parameter": "qam-32rx_time", "value": 0 } ] } }, { "sample": { "objectId": "abcd-device:100:13", "ifName": "Terminal", "timeStamp": "2021-10-20T03:30:00.000Z", "type": "sensor", "invalid": false, "attribute": [ { "parameter": "InputCurrent", "value": 1.14 }, { "parameter": "Temperature", "value": 61.5 }, { "parameter": "InputVoltage", "value": 54.777 } ] } } ] } }

code

with open('Performance_Interface_data.json') as f:
    data = json.load(f)
df = pd.json_normalize(data['abcd-performance:output'], 'performance', max_level=4)
print(df.columns)
print(df)

Required output is sample_objectId;sample_ifName;sample_timeStamp;sample_type;sample_invalid;sample_attribute_parameter;sample_attribute_value

i am not able to do the last column normalization.

CodePudding user response:

You will need to specify the arguments to meta and record_path according to the JSON.

df = pd.json_normalize(data['abcd-performance:output']['performance'], record_path=['sample','attribute'], meta=[['sample', 'objectId'], ['sample', 'ifName'], ['sample', 'timeStamp'], ['sample', 'type'], ['sample', 'invalid']])

In [50]: df
Out[50]: 
        parameter   value     sample.objectId sample.ifName          sample.timeStamp sample.type sample.invalid
0  rslReadingMean -36.400  abcd-device:100:12    Carrier1/1  2021-10-20T03:15:00.000Z       radio          False
1   qam-32rx_time   0.000  abcd-device:100:12    Carrier1/1  2021-10-20T03:15:00.000Z       radio          False
2  rslReadingMean -36.500  abcd-device:100:12    Carrier1/1  2021-10-20T03:30:00.000Z       radio          False
3   qam-32rx_time   0.000  abcd-device:100:12    Carrier1/1  2021-10-20T03:30:00.000Z       radio          False
4    InputCurrent   1.140  abcd-device:100:13      Terminal  2021-10-20T03:30:00.000Z      sensor          False
5     Temperature  61.500  abcd-device:100:13      Terminal  2021-10-20T03:30:00.000Z      sensor          False
6    InputVoltage  54.777  abcd-device:100:13      Terminal  2021-10-20T03:30:00.000Z      sensor          False
  • Related