I have a pandas dataframe as below :
device_id user_id level
d_id1 u_id1 1
d_id1 u_id2 -1
d_id1 u_id3 1
d_id2 u_id100 -1
d_id3 u_id100 1
d_id3 u_id4 1
I want to convert it into a list of dictionaries depend on columns and rows that looks like:
[
{
"device_id":d_id1
"result_key":[
{"user_id":u_id1,"level":1},
{"user_id":u_id2,"level":-1},
{"user_id":u_id3,"level":1}
]
},
{
"device_id":d_id2
"result_key":[{"user_id":u_id100,"level":-1}]
},
{
"device_id":d_id3
"result_key":[{"user_id":u_id100,"level":1},
{"user_id":u_id4,"level":1}
]
}
]
What is the effecient way to do that and save it into json file?
CodePudding user response:
Use GroupBy.apply
with DataFrame.to_dict
for result_key
column first, then for dictionary use to_dict
with both columns:
d = (df.groupby('device_id')[['user_id','level']].apply(lambda x: x.to_dict('records'))
.reset_index(name='result_key')
.to_dict('records'))
print (d)
[{'device_id': 'd_id1', 'result_key': [{'user_id': 'u_id1', 'level': 1},
{'user_id': 'u_id2', 'level': -1},
{'user_id': 'u_id3', 'level': 1}]},
{'device_id': 'd_id2', 'result_key': [{'user_id': 'u_id100', 'level': -1}]},
{'device_id': 'd_id3', 'result_key': [{'user_id': 'u_id100', 'level': 1},
{'user_id': 'u_id4', 'level': 1}]}]
Last convert dictionary to json file:
import json
with open('out.json', 'w') as f:
json.dump(d, f)