I am working with pandas using a small dataset and I am stuck somewhere.
Here is the data after merging:
Using this data, the code below give the minimum area of each region and complete with the corresponding country name on the same line in the Dataframe obtained.
Area_min=Africa.groupby('Region').Area.agg([min])
Area_min['Country']=(Africa.loc[Africa.groupby('Region').Area.idxmin(), 'Names']).values
Area_min
And this one give the maximum population of each region and complete with the corresponding country name on the same line in Dataframe obtained.
Pop_max=Africa.groupby('Region').Population.agg([max])
Pop_max['Country']=(Africa.loc[Africa.groupby('Region').Population.idxmax(), 'Names']).values
Pop_max
Now I am trying to get the average population of each region and complete with the name of country having the population closest to the average of the corresponding group, on the same line in the Dataframe obtained.
The code below give the average population of each region but I am stuck on corresponding with the country name.
Pop_average=Africa.groupby('Region').Population.agg(['mean'])
I am thinking about .map() and .apply() function but I have tried without success. Any hint will be helpful.
CodePudding user response:
Since you're grouping by only one column, it's more efficient to do it once.
Also, since you're using idxmin
anyway, it seems it's redundant to do the first groupby.agg
, since you can directly access the column names.
g = Africa.groupby('Region')
Area_min = Africa.loc[g['Area'].idxmin(), ['Names', 'Area']]
Pop_max = Africa.loc[g['Population'].idxmax(), ['Names', 'Population']]
Then for your question, here's one approach. Transform the population mean
and find the difference between the mean and the population and find the location where the difference is the smallest using abs
groupby
idxmin
; then use the loc
accessor like above to get the desired outcome:
Pop_average = Africa.loc[((g['Population'].transform('mean') - Africa['Population']).abs()
.groupby(Africa['Region']).idxmin()),
['Names','Population']]