I have a list of dicts and I want to merge the dicts that share the same value for a given key (here, "State") so I can use the result in a dataframe. PS: I don't know whether pandas can handle this by itself. Maybe it's not necessary to merge the dicts first.
My list of dicts:
[{'country': 'Brazil', 'State': 'São Paulo', 'description': '"Estado de São Paulo"'},
{'country': 'Brazil', 'State': 'Rio de Janeiro', 'description': '"Estado do Rio de Janeiro"'},
{'country': 'Brazil', 'State': 'Rio de Janeiro', 'population': '12345'},
{'country': 'Brazil', 'State': 'Rio de Janeiro', 'work_force': '2345'},
{'country': 'Brazil', 'State': 'Paraná', 'description': '"Estado do Paraná"'},
{'country': 'Brazil', 'State': 'Santa Catarina', 'description': '"Estado de Santa Catarina"'},
{'country': 'Brazil', 'State': 'Santa Catarina', 'population': '54321'},
{'country': 'Brazil', 'State': 'Santa Catarina', 'work_force': '4321'}]
Output when I create the dataframe:
   country           State                 description  population  work_force
0   Brazil       São Paulo       "Estado de São Paulo"         NaN         NaN
1   Brazil  Rio de Janeiro  "Estado do Rio de Janeiro"         NaN         NaN
2   Brazil  Rio de Janeiro                         NaN       12345         NaN
3   Brazil  Rio de Janeiro                         NaN         NaN        2345
4   Brazil          Paraná          "Estado do Paraná"         NaN         NaN
5   Brazil  Santa Catarina  "Estado de Santa Catarina"         NaN         NaN
6   Brazil  Santa Catarina                         NaN       54321         NaN
7   Brazil  Santa Catarina                         NaN         NaN        4321
What I need:
   country           State                 description  population  work_force
0   Brazil       São Paulo       "Estado de São Paulo"         NaN         NaN
1   Brazil  Rio de Janeiro  "Estado do Rio de Janeiro"       12345        2345
2   Brazil          Paraná          "Estado do Paraná"         NaN         NaN
3   Brazil  Santa Catarina  "Estado de Santa Catarina"       54321        4321
The desired output is achieved when I merge the dicts:
[{'country': 'Brazil', 'State': 'São Paulo', 'description': '"Estado de São Paulo"'},
{'country': 'Brazil', 'State': 'Rio de Janeiro', 'description': '"Estado do Rio de Janeiro"', 'population': '12345', 'work_force': '2345'},
{'country': 'Brazil', 'State': 'Paraná', 'description': '"Estado do Paraná"'},
{'country': 'Brazil', 'State': 'Santa Catarina', 'description': '"Estado de Santa Catarina"', 'population': '54321', 'work_force': '4321'}]
So I'm looking for a way to merge those dicts into a single one based on the key "State".
CodePudding user response:
Yes, pandas can handle this. Assuming df is the dataframe built from your list of dicts, group on country and State and take the first non-null value in each column:
df.groupby(["country","State"],sort=False).first().reset_index()
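If you would rather merge the dicts before building the dataframe, as the question originally asks, here is a minimal pure-Python sketch. The helper name `merge_by_keys` and the variable `data` are made up for illustration, and it assumes every dict carries both "country" and "State". Like `first()`, it keeps the first value seen for each field.

```python
def merge_by_keys(records, keys=("country", "State")):
    """Merge dicts that share the same values for `keys`.

    Groups come out in first-seen order, and for each field the first
    value encountered wins (later duplicates are ignored).
    """
    merged = {}
    for record in records:
        # Build a hashable group identifier from the chosen keys.
        group = tuple(record.get(k) for k in keys)
        target = merged.setdefault(group, {})
        for field, value in record.items():
            # setdefault only writes the field if it is not there yet.
            target.setdefault(field, value)
    # Plain dicts preserve insertion order (Python 3.7+).
    return list(merged.values())


# Sample data copied from the question.
data = [
    {'country': 'Brazil', 'State': 'São Paulo', 'description': '"Estado de São Paulo"'},
    {'country': 'Brazil', 'State': 'Rio de Janeiro', 'description': '"Estado do Rio de Janeiro"'},
    {'country': 'Brazil', 'State': 'Rio de Janeiro', 'population': '12345'},
    {'country': 'Brazil', 'State': 'Rio de Janeiro', 'work_force': '2345'},
    {'country': 'Brazil', 'State': 'Paraná', 'description': '"Estado do Paraná"'},
    {'country': 'Brazil', 'State': 'Santa Catarina', 'description': '"Estado de Santa Catarina"'},
    {'country': 'Brazil', 'State': 'Santa Catarina', 'population': '54321'},
    {'country': 'Brazil', 'State': 'Santa Catarina', 'work_force': '4321'},
]

result = merge_by_keys(data)
```

Passing `result` to `pd.DataFrame` should then give one row per state, with NaN where a state has no population or work_force entry.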