I have a data frame, df
, like:
Year Month Country Organizer Participation
2020 1 China FAO True
2020 1 Japan FAO True
2020 1 France EU False
2020 2 France FAO False
2020 2 Japan FAO True
2020 2 Germany EU True
2020 2 Finland EU False
2020 2 India FAO True
2020 3 Senegal FAO True
and a dictionary, codes
:
codes = {
'Asia': ['China', 'Japan', 'India'],
'Europe': ['France', 'Germany', 'Finland'],
'Africa': ['Senegal']
}
My goal is to perform a group by considering the fields 'Year'
, 'Month'
, 'Organizer'
and the continent (in the df
I have 'Country'
), counting the number of rows. 'Participation'
is not important.
To be more precise my final output should be:
Year Month Organizer Continent Number
2020 1 FAO Asia 2
2020 1 EU Europe 1
2020 2 FAO Europe 1
2020 2 FAO Asia 2
2020 2 EU Europe 2
2020 3 FAO Africa 1
Is there a compact way to do this? The alternative way would be to add a column 'Continent'
and also use this column in the group by, but I'm afraid this is inefficient and inelegant solution.
CodePudding user response:
You can use:
new_codes={i:k for k,v in codes.items() for i in v} #{'China': 'Asia', 'Japan': 'Asia', 'India': 'Asia', 'France': 'Europe', 'Germany': 'Europe', 'Finland': 'Europe', 'Senegal': 'Africa'}
df['Continent'] = df['Country'].replace(new_codes)
dfx = df.groupby(['Year','Month','Organizer','Continent']).agg(number=('Continent',"count"))
Output:
Year Month Organizer Continent number
0 2020 1 EU Europe 1
1 2020 1 FAO Asia 2
2 2020 2 EU Europe 2
3 2020 2 FAO Asia 2
4 2020 2 FAO Europe 1
5 2020 3 FAO Africa 1
CodePudding user response:
Do a map using the dictionary to create a column 'Continent' and then do value_counts
.
out = df.assign(
Continent=df.Country.map({v: key for key, values in codes.items() for v in values})
).drop(
["Country", "Participation"], axis=1
).value_counts().to_frame().reset_index().rename(
{0: "Number"}, axis="columns"
)
print(out)
Year Month Organizer Continent Number
0 2020 1 FAO Asia 2
1 2020 2 EU Europe 2
2 2020 1 EU Europe 1
3 2020 2 FAO Asia 1
4 2020 2 FAO Europe 1
5 2020 3 FAO Africa 1