Pandas Groupby with Range

Time:05-29

I have a data file that contains city names, their country ID codes, the salary amounts, and some other information. I wanted to build a table with groupby on country ID and city and compute the mean salary. I solved it like this:

file.groupby(['country_id',"city"])['salary'].mean()

This code gives me the mean over all salaries in each group. If I want to split the salaries into ranges, for example the mean salary in the ranges (0, 5000) and (5000, 10000), what is the easiest way to do that? Is there any way other than writing 2 loops?

CodePudding user response:

The easiest way, in my opinion, is to create an additional column, salary_range. Then you can group by 3 keys: country_id, city, and salary_range, which should give you the desired output.

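# between() includes both endpoints by default, so a salary of exactly 5000
# would match both lines; inclusive='left' (pandas 1.3+) makes the ranges half-open
df.loc[df['salary'].between(0, 5000, inclusive='left'), 'salary_range'] = 1
df.loc[df['salary'].between(5000, 10000, inclusive='left'), 'salary_range'] = 2
# and so on ...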

df.groupby(['country_id','city','salary_range'])[['salary']].mean()
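
To see what this produces, here is a minimal runnable sketch. The sample rows are made up for illustration, and plain comparisons are used for the range masks so that it does not depend on a particular pandas version.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "country_id": [1, 1, 1, 2, 2],
    "city": ["A", "A", "A", "B", "B"],
    "salary": [2000, 4000, 7000, 5000, 9000],
})

# Label each row with a range id. The ranges are half-open, [low, high),
# so a salary of exactly 5000 falls only into range 2.
df.loc[(df["salary"] >= 0) & (df["salary"] < 5000), "salary_range"] = 1
df.loc[(df["salary"] >= 5000) & (df["salary"] < 10000), "salary_range"] = 2

# Mean salary per (country_id, city, salary_range).
result = df.groupby(["country_id", "city", "salary_range"])["salary"].mean()
print(result)
```

Rows whose salary falls outside every range keep NaN in salary_range, and groupby drops them by default.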

CodePudding user response:

You can use binning with pd.cut for this kind of problem. Note that this bins the mean of each (country_id, city) group, not the individual salaries.

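import pandas as pd

# mean salary per (country_id, city), flattened back into columns
df = df.groupby(['country_id', 'city'])['salary'].mean().reset_index(name='mean')
bins = [0, 5000, 10000]
# assign each group mean to the interval (0, 5000] or (5000, 10000]
df['binned'] = pd.cut(df['mean'], bins)
print(df)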
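
If the goal is a separate mean for each salary range, as the question asks, one option is to bin the individual salaries first and then group on the bin. The following is a minimal sketch under that assumption; the sample rows and the salary_range column name are hypothetical, and right=False is chosen here so the intervals are [0, 5000) and [5000, 10000).

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "country_id": [1, 1, 1, 2, 2],
    "city": ["A", "A", "A", "B", "B"],
    "salary": [2000, 4000, 7000, 5000, 9000],
})

bins = [0, 5000, 10000]

# Bin each individual salary; right=False gives left-closed intervals.
df["salary_range"] = pd.cut(df["salary"], bins=bins, right=False)

# observed=True keeps only the (country, city, range) combinations that
# actually occur, instead of every possible category.
means = (
    df.groupby(["country_id", "city", "salary_range"], observed=True)["salary"]
      .mean()
      .reset_index(name="mean")
)
print(means)
```

Adding more ranges only requires extending the bins list, so no loops or repeated masks are needed.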