I need to generate a JSON file with a specific format from a pandas DataFrame. The DataFrame looks like this:
user_id | product_id | date |
---|---|---|
1 | 23 | 01-01-2022 |
1 | 24 | 05-01-2022 |
2 | 56 | 05-06-2022 |
3 | 23 | 02-07-2022 |
3 | 24 | 01-02-2022 |
3 | 56 | 02-01-2022 |
The JSON file needs to have the following format:
{
  "user_id": 1,
  "items": [{
    "product_id": 23,
    "date": "01-01-2022"
  }, {
    "product_id": 24,
    "date": "05-01-2022"
  }]
}
{
  "user_id": 2,
  "items": [{
    "product_id": 56,
    "date": "05-06-2022"
  }]
}
...etc
I've tried the following, but it's not the right format:
result = (now.groupby('user_id')['product_id','date'].apply(lambda x: dict(x.values)).to_json())
Any help would be much appreciated!
Answer:
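You can build one dict per row, collect those dicts into a list per user, and export the result as records:
# One dict per row: {'product_id': ..., 'date': ...}
out = (df[['product_id', 'date']].apply(dict, axis=1)
         .groupby(df['user_id']).apply(list)   # list of item dicts per user
         .to_frame('items').reset_index()      # columns: user_id, items
         .to_dict('records'))                  # one dict per user
print(out)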
[{'user_id': 1, 'items': [{'product_id': 23, 'date': '01-01-2022'}, {'product_id': 24, 'date': '05-01-2022'}]},
{'user_id': 2, 'items': [{'product_id': 56, 'date': '05-06-2022'}]},
{'user_id': 3, 'items': [{'product_id': 23, 'date': '02-07-2022'}, {'product_id': 24, 'date': '01-02-2022'}, {'product_id': 56, 'date': '02-01-2022'}]}]
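The result above is a Python list of dicts, not yet a file. A single JSON document can only hold one top-level value, so the "one object after another" layout from the question is usually written either as a JSON array or as JSON Lines (one object per line). Here is a minimal, self-contained sketch of both, assuming the dates are stored as strings. The sample DataFrame, the helper `write_json_lines`, and the file name `users.jsonl` are made up for illustration and are not from the original post.

```python
import json

import pandas as pd

# Sample data mirroring the table in the question (dates assumed to be strings).
df = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3, 3],
    "product_id": [23, 24, 56, 23, 24, 56],
    "date": ["01-01-2022", "05-01-2022", "05-06-2022",
             "02-07-2022", "01-02-2022", "02-01-2022"],
})

# Build one record per user; int()/str() convert NumPy scalars to
# plain Python types so the json module can serialize them.
records = []
for user_id, group in df.groupby("user_id"):
    items = [
        {"product_id": int(pid), "date": str(date)}
        for pid, date in zip(group["product_id"], group["date"])
    ]
    records.append({"user_id": int(user_id), "items": items})

# Option 1: one JSON array holding all users.
json_array = json.dumps(records, indent=2)

# Option 2: JSON Lines, one compact object per line.
json_lines = "\n".join(json.dumps(rec) for rec in records)


def write_json_lines(recs, path):
    """Write one JSON object per line to `path` (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as fh:
        for rec in recs:
            fh.write(json.dumps(rec) + "\n")
```

Calling `write_json_lines(records, "users.jsonl")` would produce the file. If whatever reads the file expects a regular JSON document, write `json_array` instead.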