I have a list of strings, each holding a JSON-encoded list of dictionaries, like this:
['[{"date_update":"31-03-2022","diemquatrinh":"6.0"}]',
'[{"date_update":"28-04-2022","diemquatrinh":"6.5"}]',
'[{"date_update":"25-12-2021","diemquatrinh":"6.0"}, {"date_update":"28-04-2022","diemquatrinh":"6.25"},{"date_update":"28-07-2022","diemquatrinh":"6.5"}]',
'[{"date_update":null,"diemquatrinh":null}]']
I don't know how to make them into a DataFrame with 2 columns like this. I'm looking forward to your help. Thank you!
updated_at | diemquatrinh |
---|---|
11-03-2022 | 6.25 |
25-12-2021 | 6.0 |
28-04-2022 | 6.25 |
28-07-2022 | 6.5 |
null | null |
Answer:
First, parse each JSON string into a Python list of dictionaries with json.loads.
import pandas as pd
import json
example_data=['[{"date_update":"31-03-2022","diemquatrinh":"6.0"}]',
'[{"date_update":"28-04-2022","diemquatrinh":"6.5"}]',
'[{"date_update":"25-12-2021","diemquatrinh":"6.0"}, {"date_update":"28-04-2022","diemquatrinh":"6.25"},{"date_update":"28-07-2022","diemquatrinh":"6.5"}]',
'[{"date_update":null,"diemquatrinh":null}]']
# Parse every JSON string; each one becomes a list of dicts
listt = []
for i in example_data:
    listt.append(json.loads(i))
When I examine the data, every dictionary has the same keys (date_update and diemquatrinh). This means I can flatten the nested lists and collect all the dictionaries in one list.
main_list = [item for sublist in listt for item in sublist]
print(main_list)
'''
[{'date_update': '31-03-2022', 'diemquatrinh': '6.0'}, {'date_update': '28-04-2022', 'diemquatrinh': '6.5'}, {'date_update': '25-12-2021', 'diemquatrinh': '6.0'}, {'date_update': '28-04-2022', 'diemquatrinh': '6.25'}, {'date_update': '28-07-2022', 'diemquatrinh': '6.5'}, {'date_update': None, 'diemquatrinh': None}]
'''
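The parse and flatten steps can also be combined into a single pass. The following is only a sketch of one alternative, using itertools.chain.from_iterable from the standard library; the names raw and records are made up for illustration, and the input is a shortened copy of the example data.

```python
import json
from itertools import chain

# Shortened copy of the example data (illustrative only)
raw = [
    '[{"date_update":"31-03-2022","diemquatrinh":"6.0"}]',
    '[{"date_update":"25-12-2021","diemquatrinh":"6.0"}, '
    '{"date_update":"28-04-2022","diemquatrinh":"6.25"}]',
    '[{"date_update":null,"diemquatrinh":null}]',
]

# json.loads turns each string into a list of dicts;
# chain.from_iterable joins those lists into one flat sequence.
records = list(chain.from_iterable(json.loads(s) for s in raw))
print(records)
```

The result is the same kind of flat list as main_list, so it can be passed to pd.DataFrame in the same way.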
All that's left is to convert the list to a dataframe:
df=pd.DataFrame(main_list)
print(df)
'''
   date_update  diemquatrinh
0   31-03-2022           6.0
1   28-04-2022           6.5
2   25-12-2021           6.0
3   28-04-2022          6.25
4   28-07-2022           6.5
5         None          None
'''
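Note that both columns still hold strings at this point. If you want the updated_at column name from the question and real date and numeric types, something like the sketch below may help. It assumes the dates are always day-month-year, and it uses a small hand-written records list in place of main_list.

```python
import pandas as pd

# Small stand-in for main_list (illustrative only)
records = [
    {"date_update": "31-03-2022", "diemquatrinh": "6.0"},
    {"date_update": "28-04-2022", "diemquatrinh": "6.25"},
    {"date_update": None, "diemquatrinh": None},
]
df = pd.DataFrame(records)

# Rename the column to the name used in the question's desired output
df = df.rename(columns={"date_update": "updated_at"})

# Assumes day-month-year strings; None becomes NaT
df["updated_at"] = pd.to_datetime(df["updated_at"], format="%d-%m-%Y")

# Numeric strings become floats; None becomes NaN
df["diemquatrinh"] = pd.to_numeric(df["diemquatrinh"])

print(df.dtypes)
```

Skip the two conversion lines if you prefer to keep the values exactly as they appear in the JSON.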