I have a problem. I have a json
that contain logs. Each logs
could have a change
but
does not have to. So I want to show the logs
with the changes
in a pandas. But unfortunately the changes is not splitted in different cells (e.g. id
and operation
). So the complete changes
is one cell. How could I get the following output like below?
{'model':
{'logs':
[{'id': '001',
'changes':
[{'id': '101','operation': 'UPDATE'}]},
{'id': '0000000017b068fc',
'changes':
[{'id': '201','operation': 'UPDATE'}]},
{'id': '002',
'changes':
[{'id': '102','operation': 'UPDATE'},
{'id': '202','operation': 'UPDATE'},
{'id': '302','operation': 'UPDATE'}]},
{'id': '003',
'changes':
[]},
{'id': '004',
'changes':
[]},
Code
import pandas as pd
df=pd.json_normalize(result['model']['logs'])
What I got
[OUT]
id changes
0 001 [{'id': '101'...
1 002 [{'id': '102'...
3 003 []
What I want
[OUT]
id changes.id changes.operation
0 001 101 UPDATE
1 001 201 UPDATE
2 002 102 UPDATE
3 002 202 UPDATE
4 002 302 UPDATE
5 003 None None
6 004 None None
CodePudding user response:
Try:
out = pd.json_normalize(result['model']['logs'], 'changes', 'id',
record_prefix='changes.', errors='ignore')
print(out)
# Output
changes.id changes.operation id
0 101 UPDATE 001
1 201 UPDATE 0000000017b068fc
2 102 UPDATE 002
3 202 UPDATE 002
4 302 UPDATE 002
Update
If you need to keep NaN rows, use:
out = pd.json_normalize(result['model']['logs']).explode('changes')
changes = pd.DataFrame(out.pop('changes').to_dict()).T.add_prefix('changes.')
out = pd.concat([out, changes], axis=1)
print(out)
# Output
id changes.id changes.operation
0 001 101 UPDATE
1 0000000017b068fc 201 UPDATE
2 002 302 UPDATE
2 002 302 UPDATE
2 002 302 UPDATE
3 003 NaN NaN
4 004 NaN NaN