I have this json right here, I want to create a DF using id as one column, and the fields as the other columns, but not the whole fields, I want a column for message, a column for result, how do I achieve this?
[{'id': 'rec6cGAtjL8lmoBgT', 'createdTime': '2022-10-11T22:54:23.000Z', 'fields': {'result': 3, 'message': "Today was meh, but I got a good bit of work done, so it's fine", 'isitweird': True}}, {'id': 'recG1bde9NYyWs9Gm', 'createdTime': '2022-10-12T08:30:20.000Z', 'fields': {'message': "Today was meh, but I got a good bit of work done, so it's fine", 'result': 3, 'isitweird': True}}, {'id': 'rectKvuz20jYYQ7OC', 'createdTime': '2022-10-11T16:39:26.000Z', 'fields': {'result': 5, 'message': 'I am very happy today, it was awesome', 'isitweird': True}}, {'id': 'rectQfHPJZbdDtIpW', 'createdTime': '2022-10-11T22:54:34.000Z', 'fields': {'result': 3, 'message': "Today was meh, but I got a good bit of work done, so it's fine", 'isitweird': True}}, {'id': 'recxbt94M70AjBQS7', 'createdTime': '2022-10-11T22:52:42.000Z', 'fields': {'result': 3, 'message': "Today was meh, but I got a good bit of work done, so it's fine", 'isitweird': True}}, {'id': 'recy1DjwMNOIfuWa6', 'createdTime': '2022-10-11T22:53:56.000Z', 'fields': {'result': 3, 'message': "Today was meh, but I got a good bit of work done, so it's fine", 'isitweird': True}}]
CodePudding user response:
You can create a new list of dicts using flatter function and map function:
Input:
json_input = [{'id': 'rec6cGAtjL8lmoBgT', 'createdTime': '2022-10-11T22:54:23.000Z', 'fields': {'result': 3, 'message': "Today was meh, but I got a good bit of work done, so it's fine", 'isitweird': True}}, {'id': 'recG1bde9NYyWs9Gm', 'createdTime': '2022-10-12T08:30:20.000Z', 'fields': {'message': "Today was meh, but I got a good bit of work done, so it's fine", 'result': 3, 'isitweird': True}}, {'id': 'rectKvuz20jYYQ7OC', 'createdTime': '2022-10-11T16:39:26.000Z', 'fields': {'result': 5, 'message': 'I am very happy today, it was awesome', 'isitweird': True}}, {'id': 'rectQfHPJZbdDtIpW', 'createdTime': '2022-10-11T22:54:34.000Z', 'fields': {'result': 3, 'message': "Today was meh, but I got a good bit of work done, so it's fine", 'isitweird': True}}, {'id': 'recxbt94M70AjBQS7', 'createdTime': '2022-10-11T22:52:42.000Z', 'fields': {'result': 3, 'message': "Today was meh, but I got a good bit of work done, so it's fine", 'isitweird': True}}, {'id': 'recy1DjwMNOIfuWa6', 'createdTime': '2022-10-11T22:53:56.000Z', 'fields': {'result': 3, 'message': "Today was meh, but I got a good bit of work done, so it's fine", 'isitweird': True}}]
Script:
def flatter(data_row):
return {'id': data_row['id'], 'result': data_row['fields']['result'], 'message': data_row['fields']['message']}
pd.DataFrame(map(flatter, json_input))
Output:
id result message
0 rec6cGAtjL8lmoBgT 3 Today was meh, but I got a good bit of work do...
1 recG1bde9NYyWs9Gm 3 Today was meh, but I got a good bit of work do...
2 rectKvuz20jYYQ7OC 5 I am very happy today, it was awesome
3 rectQfHPJZbdDtIpW 3 Today was meh, but I got a good bit of work do...
4 recxbt94M70AjBQS7 3 Today was meh, but I got a good bit of work do...
5 recy1DjwMNOIfuWa6 3 Today was meh, but I got a good bit of work do...
Another option is to create a DataFrame and then to apply some functions:
df = pd.DataFrame(json_input)
df['message'] = df['fields'].str['message']
df['result'] = df['fields'].str['result']
df = df.drop(columns=['createdTime', 'fields'])
df
Output:
id message result
0 rec6cGAtjL8lmoBgT Today was meh, but I got a good bit of work do... 3
1 recG1bde9NYyWs9Gm Today was meh, but I got a good bit of work do... 3
2 rectKvuz20jYYQ7OC I am very happy today, it was awesome 5
3 rectQfHPJZbdDtIpW Today was meh, but I got a good bit of work do... 3
4 recxbt94M70AjBQS7 Today was meh, but I got a good bit of work do... 3
5 recy1DjwMNOIfuWa6 Today was meh, but I got a good bit of work do... 3